phase-1 · email-scanner · 2026-06-25
Reading mail without touching it
The email scanner must never alter mailbox state. A scanner that marks mail
as read — or worse, moves it — is indistinguishable from the attacks it is
supposed to catch. So every fetch is read-only: BODY.PEEK throughout, on a
mailbox opened read-only (EXAMINE, the read-only form of SELECT).
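
A minimal sketch of the read-only access pattern, assuming Python's standard
imaplib as the client (the note does not name one). The function names
open_read_only and peek_headers are hypothetical. In imaplib,
select(readonly=True) issues EXAMINE instead of SELECT, and BODY.PEEK[...]
fetches content without setting the \Seen flag.

```python
import imaplib


def open_read_only(conn: imaplib.IMAP4, mailbox: str = "INBOX") -> int:
    """Open the mailbox read-only and return its message count."""
    # readonly=True makes imaplib send EXAMINE, so the server itself
    # refuses flag changes for the rest of the session.
    status, data = conn.select(mailbox, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot open {mailbox!r} read-only: {data!r}")
    return int(data[0])


def peek_headers(conn: imaplib.IMAP4, uid: int) -> bytes:
    """Fetch one message's headers by UID without marking it as read."""
    # BODY.PEEK is used even on a read-only mailbox, as a second guard.
    status, data = conn.uid("FETCH", str(uid), "(BODY.PEEK[HEADER])")
    if status != "OK" or not data or not isinstance(data[0], tuple):
        raise RuntimeError(f"header fetch failed for UID {uid}: {data!r}")
    return data[0][1]
```

Here conn would be an authenticated imaplib.IMAP4_SSL connection. Using both
guards is deliberate redundancy: if either one is dropped by mistake, the
other still keeps the mailbox untouched.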
The fetch itself is two-phase. Phase one pulls the headers and the
BODYSTRUCTURE — the full MIME tree — with zero body bytes transferred.
Phase two fetches only the text parts the feature extractor actually needs.
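
The two phases can be sketched as two sets of FETCH data items. This is an
illustration under assumptions, not the scanner's actual code: the MimePart
record and the phase_two_items helper are hypothetical, the MIME tree is
assumed to be already parsed out of the BODYSTRUCTURE response, and the set
of content types the feature extractor wants (text/plain and text/html here)
is a guess.

```python
from dataclasses import dataclass

# Phase one: headers plus the MIME tree. Neither item transfers body bytes.
PHASE_ONE_ITEMS = "(BODY.PEEK[HEADER] BODYSTRUCTURE)"


@dataclass(frozen=True)
class MimePart:
    section: str       # IMAP section number, e.g. "1" or "2.1"
    content_type: str  # lowercased type/subtype, e.g. "text/plain"
    size: int          # size in octets as reported by BODYSTRUCTURE


def phase_two_items(parts, wanted=("text/plain", "text/html")):
    """Build the phase-two FETCH items for the wanted text parts only.

    Returns None when no part qualifies, so the caller can skip phase two.
    """
    sections = [p.section for p in parts if p.content_type in wanted]
    if not sections:
        return None
    return "(" + " ".join(f"BODY.PEEK[{s}]" for s in sections) + ")"
```

For a message with a plain-text part at section 1.1, an HTML part at 1.2, and
a PDF attachment at 2, phase two would request only
(BODY.PEEK[1.1] BODY.PEEK[1.2]); the attachment is never transferred.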
Hard caps bound the worst case: any single MIME part over 5 MB is skipped and recorded rather than fetched, and the fetch stops entirely once the running total passes 25 MB.
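
One way the caps could be applied, as a sketch under stated assumptions: part
sizes come from BODYSTRUCTURE before any body is fetched, a megabyte is taken
as 1024 * 1024 bytes, and the total cap is checked before each part so the
running total never exceeds it. The apply_caps helper and its return shape
are hypothetical.

```python
MB = 1024 * 1024      # assumption: binary megabytes
PART_CAP = 5 * MB     # any single part larger than this is skipped
TOTAL_CAP = 25 * MB   # the fetch stops once this total would be exceeded


def apply_caps(parts, part_cap=PART_CAP, total_cap=TOTAL_CAP):
    """Decide which parts to fetch.

    parts is an iterable of (section, size) pairs in fetch order.
    Returns (to_fetch, skipped, stopped): the sections to fetch, the
    oversized (section, size) pairs that were recorded instead, and
    whether the total cap ended the fetch early.
    """
    to_fetch, skipped = [], []
    running = 0
    for section, size in parts:
        if size > part_cap:
            # Oversized part: record it, do not fetch it, keep going.
            skipped.append((section, size))
            continue
        if running + size > total_cap:
            # Total cap reached: stop the fetch entirely.
            return to_fetch, skipped, True
        to_fetch.append(section)
        running += size
    return to_fetch, skipped, False
```

Skipped parts do not count toward the running total in this sketch, since no
bytes are transferred for them.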
The privacy boundary holds here too: bodies are parsed for structure and indicators. Raw content never leaves the machine.