The email that shipped three times

I shipped the same bug three times in sixteen days. Not three similar bugs — the same one, twice after I'd fixed it and written a rule telling my AI coding agents never to do it again.

The setup

ReadySetBind automates insurance paperwork: a quote PDF goes in, and a bind request comes out the other end — an email to an underwriter that actually puts coverage in force. That email is the entire point of the pipeline. If the system is ever wrong about whether it went out, a deal sits dead while everyone believes it's moving.

The bug

That's exactly what it got wrong. AI coding tools make it feel free to build your own version of everything, so instead of keeping my email code paper-thin over Resend — a service whose whole job is sending mail reliably and reporting what happened — I let the agents build a custom wrapper with its own logic for suppressed addresses, send cutoffs, and duplicate sends.

The catch: Supabase's function-invoke call types its response data as any by default (TypeScript's "I give up, this could be anything" type), so the compiler couldn't force callers to inspect the result. The wrapper returned a 200 even when nothing was delivered (suppressed, disabled, or a replayed duplicate), and every caller read "no error" as "sent." The app marked records sent, advanced the deal, and wrote audit rows for emails that never existed.
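A minimal sketch of that failure mode, not the project's actual code. The response shape, the status strings, and the names fakeInvoke, buggyCaller, and strictCaller are all hypothetical, chosen only to show why an any-typed payload lets "no error" pass for "sent":

```typescript
// Hypothetical shapes for illustration; the real wrapper's payload is not shown in the post.
type InvokeResult = { data: any; error: Error | null };

// Stand-in for the function-invoke call: the wrapper answers with no error
// even though the address was suppressed and nothing was sent.
function fakeInvoke(): InvokeResult {
  return { data: { status: "suppressed" }, error: null };
}

// The buggy pattern: "no error" is read as "sent". Because data is any,
// the compiler has no objection to ignoring it entirely.
function buggyCaller(result: InvokeResult): boolean {
  return result.error === null;
}

// Assumed outcome vocabulary, modeled as a discriminated union.
type SendOutcome =
  | { status: "delivered"; id: string }
  | { status: "suppressed" | "disabled" | "duplicate" };

// The stricter pattern: narrow the payload, then require an explicit "delivered".
function strictCaller(result: InvokeResult): boolean {
  if (result.error !== null) return false;
  const outcome = result.data as SendOutcome | null | undefined;
  return outcome?.status === "delivered";
}

const buggySaysSent = buggyCaller(fakeInvoke());
const strictSaysSent = strictCaller(fakeInvoke());
```

With the same suppressed response, the first caller reports a send and the second does not. The cast is still a promise the compiler cannot verify, which is why a type alone was never going to be the whole fix.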

Fix, document, repeat

I fixed it on day two. It came back the next day in a new piece of code. So I did what fifteen years of security consulting trains you to do: I documented it — once in the instructions every agent session reads at startup, again in a dedicated rules file.

It came back a third time anyway, in code written with both rules sitting right there.

That's not an AI problem. Human teams work the same way: the engineer who got burned remembers, the new hire doesn't. If written rules stopped recurring mistakes, the security industry's policy binders would have ended phishing a decade ago.

Three ships past the rules, zero past the gate

Ship 1: bug ships
Fixed: day 2
Ship 2: recurs next day
Rule written: twice, in agent files
Ship 3: recurs anyway
ast-grep gate: 0 recurrences after

What actually worked

Stop rebuilding what you can rent. The bug lived entirely in code I never needed to write. Resend already knows whether a message was delivered; my wrapper was a second opinion that was sometimes wrong. I cut it to a thin pass-through built on one rule: report Resend's real status, and treat anything unclear — unknown, null, empty — as a failure, never a success.
Turn the written rule into a gate. A static-analysis check — an ast-grep rule, which matches the structure of the code rather than its text — now runs on every build: if any caller invokes the email function without branching on the delivered status, the build fails.
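The fail-closed rule from the first point can be sketched roughly like this. It is an illustration under assumptions, not the real classifier: the RawSendResult shape, the "delivered" status string, and the nextStep helper are hypothetical, and the function name is borrowed only for readability.

```typescript
// Sketch only: the input shape and status vocabulary are assumptions.
type SendStatus = "delivered" | "failed";

interface RawSendResult {
  status?: string | null; // provider-reported status, if any
  error?: unknown;        // any error surfaced by the send call
}

// Fail closed: exactly one value counts as success; everything else
// (missing result, error, null, empty string, unknown value) is a failure.
function classifySendResult(raw: RawSendResult | null | undefined): SendStatus {
  if (raw == null) return "failed";
  if (raw.error != null) return "failed";
  return raw.status === "delivered" ? "delivered" : "failed";
}

// Hypothetical caller: it branches on the classified status before
// touching the record, which is the shape the build gate insists on.
function nextStep(raw: RawSendResult | null | undefined): string {
  const status = classifySendResult(raw);
  if (status === "delivered") {
    return "mark sent and advance the deal";
  }
  return "hold the deal and surface the failure";
}
```

The design choice is the allowlist: success is a single exact match, so a new or misspelled status lands in "failed" by default instead of slipping through as a send.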

_shared/email-send-status.ts · classifySendResult()

SHIPPED 3x: no invoke error ⇒ treat as delivered

FAIL-CLOSED: unknown | null | "" | error ⇒ "failed", never delivered

THE GATE: A 79-line classifier makes "unclear" mean "not sent," and an ast-grep CI rule fails the build if a caller ignores it. The rule itself had to be hardened after a stray req.status === 200 fooled the first version into thinking the status was checked.

THE COUNT: The bug shipped three times past written rules, and zero times past the automated gate.

The bigger lesson

Security teams have language for this. A policy that says "laptops must be encrypted" and software that refuses to boot an unencrypted disk are different kinds of control: one documents intent, the other makes the bad state impossible. I spent fifteen years writing the first kind for clients and auditing the gap between the two. It took two weeks of my own product to re-teach me which one stops a repeat.

And the AI corollary: when agents make building custom infrastructure feel free, "should I build this at all?" matters more, not less. The fastest code to debug is the code you didn't write.