StackBadger: the tool that outlived its sprint
Sometimes the most durable thing a product sprint produces isn't the product — it's the tooling you built along the way. Extract it deliberately.
- While building TariffRefunded, I built a penetration-testing harness to attack my own stack the way an outsider would — 13 attack-category modules plus an automated scanner.
- When product work wound down, the harness was the only part of the codebase still earning commits — so I generalized it across stacks and published it.
- Publishing security tooling from a real codebase demands a deliberate release process: ship only tracked history, scrub identifying material, verify every credential-shaped string is synthetic, then have a human review it before it goes public.
- Honest engineering includes publishing negative results — the README documents what the tool can't detect, not just what it can.
StackBadger is the only thing on this register with a public repository, and it started as an internal weapon: a harness for attacking my own product before anyone else could.
Born inside a sprint
While building TariffRefunded, a tariff-refund product handling importers' customs data, I wanted security testing that was repeatable, not a one-time review. So I built a black-box harness: it probes a deployed app the way an outside attacker would, with no source access, and writes up what it finds. Its thirteen attack-category modules include authentication bypass (forging or tampering with the JSON Web Token that proves who you are), IDOR (reaching another tenant's records by guessing their IDs), row-level-security bypass straight through the REST API, storage path-traversal, webhook signature spoofing, injection, and malicious file upload. An OWASP ZAP scan is orchestrated alongside, with its findings deduplicated against the harness's own.
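The deduplication step could look something like the sketch below. The `Finding` fields, the source labels, and the rule of keying on category plus host and path are all assumptions made for illustration, not the harness's actual schema.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Finding:
    source: str    # "harness" or "zap" (hypothetical labels)
    category: str  # e.g. "idor", "injection"
    url: str       # endpoint where the issue was observed


def dedup_key(finding: Finding) -> tuple[str, str]:
    """Key on category plus host and path, ignoring query string and case."""
    parts = urlsplit(finding.url)
    location = f"{parts.netloc.lower()}{parts.path.rstrip('/')}"
    return (finding.category.lower(), location)


def merge_findings(harness: list[Finding], zap: list[Finding]) -> list[Finding]:
    """Keep every harness finding; add a ZAP finding only if its key is new."""
    seen = {dedup_key(f) for f in harness}
    merged = list(harness)
    for finding in zap:
        key = dedup_key(finding)
        if key not in seen:
            seen.add(key)
            merged.append(finding)
    return merged
```

Preferring the harness's finding on a collision is a design choice in this sketch: the harness knows which probe produced the result, so its write-up is likely the more specific of the two.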
Then the product sprint wound down. And the commit history showed something worth noticing: the harness was the only part of the codebase still getting work. The product had stopped; the tool hadn't. That's a signal worth acting on.
Generalizing it
The harness knew too much about my specific stack to help anyone else, so the extraction was mostly removal: everything product-specific moved into a YAML profile — which login provider, which database, which payment processor — and the tests became generic. I researched the authentication quirks of seven providers and shipped adapters for four of them: Clerk, Firebase Auth, Supabase's GoTrue, and NextAuth. A profile describes a stack; the harness adapts to whatever the profile says.
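As a rough illustration of "a profile describes a stack", here is a minimal sketch of choosing an adapter from a parsed profile. The profile keys (`auth`, `database`, `payments`), the provider spellings, and the `AuthAdapter` placeholder are hypothetical; the real profile schema may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthAdapter:
    """Placeholder for provider-specific login and token handling."""
    provider: str


# The four providers named in the post; the key spellings are assumed.
ADAPTERS = {
    "clerk": AuthAdapter("clerk"),
    "firebase": AuthAdapter("firebase"),
    "gotrue": AuthAdapter("gotrue"),
    "nextauth": AuthAdapter("nextauth"),
}


def adapter_for(profile: dict) -> AuthAdapter:
    """Pick the auth adapter named in a parsed profile, or fail loudly."""
    provider = str(profile.get("auth", "")).lower()
    if provider not in ADAPTERS:
        known = ", ".join(sorted(ADAPTERS))
        raise ValueError(f"unknown auth provider {provider!r}; expected one of: {known}")
    return ADAPTERS[provider]


# What a YAML profile might parse to (keys and values are illustrative).
profile = {"auth": "clerk", "database": "supabase", "payments": "example-processor"}
adapter = adapter_for(profile)
```

Failing on an unknown provider, rather than guessing a default, keeps the generic tests from running against a stack they were never told about.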
One result from that work deserves naming because it's negative, and negative results rarely get published. I tried to make the tool auto-detect a Supabase Auth target by scanning an app's shipped JavaScript bundle, and concluded it reliably can't: supabase-js statically bundles the same auth client whether or not you use it, and the request paths that would give it away are built at runtime, never written as literals a scanner could find — so a Supabase-Auth target is indistinguishable from "Clerk plus a Supabase database" from the outside. The tool requires you to declare the stack with a --profile flag instead, and the limitation is documented in the README and pinned by a test rather than papered over. A security tool honest about its blind spots is worth more than one that demos well.
Publishing without leaking
Releasing security tooling extracted from a real product is itself a security exercise. The export followed a written playbook: take only the repository's tracked history via git archive from origin/main (so nothing untracked rides along); scrub every product and brand reference; decode anything shaped like a credential to confirm it's synthetic test data (every eyJ… string, since eyJ is the standard opening of a JSON Web Token); and finish with a clean git init, with a human, not an agent, pushing it public.
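The credential-decoding step can be sketched as follows. This is an assumed implementation, not the playbook's actual script: it finds JWT-shaped strings with a regular expression and decodes their payloads so a reviewer can check that the claims are synthetic. The function names are hypothetical.

```python
import base64
import json
import re

# Three dot-separated base64url segments, the first starting with "eyJ"
# (the base64 encoding of '{"', which opens a JSON header).
JWT_SHAPE = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def b64url_json(segment: str) -> dict:
    """Decode one base64url segment (padding restored) as JSON."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def find_token_claims(text: str) -> list[dict]:
    """Return the decoded payload of every JWT-shaped string in text."""
    claims = []
    for match in JWT_SHAPE.finditer(text):
        header, payload, _signature = match.group(0).split(".")
        try:
            b64url_json(header)  # header must decode as JSON to count
            claims.append(b64url_json(payload))
        except ValueError:
            continue  # token-shaped but not decodable; skipped in this sketch
    return claims
```

Decoding does not verify a signature, and it does not need to here: the goal is only to read the claims (issuer, subject, expiry) and confirm none of them point at a real tenant or environment.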
[Figure: publishing a weapon. Default: read-only probes, no writes to the target. Writes require --full + @pytest.mark.write_probe (opt-in, per probe). Pre-run confirmation: CONFIRM_TARGET · CONFIRM_AUTHORIZED (doctor.py).]
An attack tool you hand to strangers needs guardrails the original never did: writes are off unless explicitly armed, and it refuses to run until you've affirmed that you own the target or are authorized in writing to test it.
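A minimal sketch of how such guardrails might be wired, under stated assumptions: that CONFIRM_TARGET and CONFIRM_AUTHORIZED are environment variables, that the first must repeat the target host, and that the second must be set to "yes". The function names are hypothetical and the real doctor.py may work differently.

```python
from urllib.parse import urlsplit


def preflight(target_url: str, env: dict[str, str]) -> list[str]:
    """Return reasons to refuse the run; an empty list means proceed."""
    problems = []
    host = urlsplit(target_url).hostname or ""
    # Assumed convention: the operator retypes the host they mean to test.
    if not host or env.get("CONFIRM_TARGET", "") != host:
        problems.append("CONFIRM_TARGET must match the target host")
    # Assumed convention: "yes" affirms ownership or written authorization.
    if env.get("CONFIRM_AUTHORIZED", "").lower() != "yes":
        problems.append("CONFIRM_AUTHORIZED must be set to 'yes'")
    return problems


def probe_allowed(is_write_probe: bool, full: bool) -> bool:
    """Read-only probes always run; write probes need the --full opt-in."""
    return full or not is_write_probe
```

In use, something like `preflight(url, dict(os.environ))` would run before any probe, and the run would stop if the returned list is non-empty.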
Bet on your own recurring problems
Products are bets on a market; tooling is a bet on problems you know you'll have again — and the second bet pays off more reliably. When a sprint ends, audit what's left with one question: what here is still alive? In my case it was the thing built to attack everything else, which, as a career security person, feels about right.