2026 · SECURITY TOOL

StackBadger

Pentest harness for AI-assisted development — extracted from a production SaaS build, scrubbed, and open-sourced. The only code still earning commits after product work stopped.

PUBLISHED REPO ↗
START HERE
StackBadger: the tool that outlived its sprint
I built a black-box security harness to attack my own product. The product sprint ended. The tool kept earning commits, so I generalized it and published it.
Read the full story →
186 COMMITS · 39 PULL REQUESTS · 12 BUILD DAYS
SOURCE: GITHUB COMMIT HISTORY · 2026-06-12
Build cadence: commits per active day, May 4 to Jun 12
Heaviest day: 51 commits · May 15
StackBadger screenshot
  1. MIT-licensed open source — the harness ships as a public GitHub repo, README and SECURITY policy in the tab bar.
  2. No server-side secrets required: point it at a URL, give it two test accounts, and run one command.
  3. Responsible-use authorization is its own README section, not a footnote — the tool calls itself an active offensive scanner.
  4. Read-only by default; --full mode is required for state-changing writes, and written authorization is a stated precondition.

Overview

A black-box security test harness for AI-built apps: point it at a deployed product and it probes authentication, data access, and payment webhooks the way an outside attacker would, then writes up findings. Thirteen test modules plus an automated scanner, driven by a simple profile describing the target's stack.

Export playbook
  1. Archive tracked files only
  2. Scrub brand references
  3. Prove credentials synthetic
  4. Fresh repository

Project Design

Born inside TariffRefunded as an internal weapon, then generalized: seven login providers researched, four auth adapters shipped — Clerk, Firebase, Supabase/GoTrue, NextAuth — behind one shared abstract base class. Published via a written export playbook: archive only tracked files, scrub every brand reference, decode anything credential-shaped to prove it's synthetic, fresh repository, human pushes.

Layered CLI pentest harness — profile-driven, pluggable adapters, dual reports
CLI / orchestration
run.sh (preflight, sign-in, mode gating) · Agent runbook (LAUNCH.md) · CONFIRM_TARGET / CONFIRM_AUTHORIZED gates
Discovery / profiles
discover.py (black-box stack fingerprinting) · profile.py + profile_assembler.py (YAML + live discovery)
Auth adapters
AbstractAuthAdapter base · Clerk · Firebase · Supabase GoTrue · NextAuth
Attack modules
13 test_*.py modules: auth bypass, IDOR · RLS / Firestore / storage misconfig · Injection, webhook spoofing, file upload
Reports
pytest JSON + ZAP XML → HTML + agent JSON · Two-layer secret scrubbing

Key modules

Engine

Discovery engine

Fingerprints a target's auth/DB/storage stack from live bundles or source — no provider named by hand.
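A minimal sketch of how bundle-based fingerprinting could work, assuming a simple table of regex signatures per provider. The `SIGNATURES` table and the `fingerprint` function are hypothetical names for illustration, and the patterns are plausible examples of what each provider leaves in client code, not StackBadger's actual rules.

```python
import re

# Hypothetical signature table (an assumption, not the tool's real rules):
# each provider maps to patterns that commonly show up in client bundles.
SIGNATURES = {
    "clerk": [re.compile(r"[a-z0-9-]+\.clerk\.accounts\.dev")],
    "firebase": [re.compile(r"[a-z0-9-]+\.firebaseapp\.com")],
    "supabase": [re.compile(r"[a-z0-9]+\.supabase\.co")],
    "nextauth": [re.compile(r"/api/auth/(?:session|csrf|providers)")],
}


def fingerprint(bundle_text: str) -> dict[str, list[str]]:
    """Return {provider: sorted evidence strings} for every provider with a hit."""
    hits: dict[str, list[str]] = {}
    for provider, patterns in SIGNATURES.items():
        evidence = sorted(
            {m.group(0) for p in patterns for m in p.finditer(bundle_text)}
        )
        if evidence:
            hits[provider] = evidence
    return hits


sample = (
    'fetch("https://abcd1234.supabase.co/rest/v1/orders"); '
    'fetch("/api/auth/session")'
)
print(fingerprint(sample))
# {'supabase': ['abcd1234.supabase.co'], 'nextauth': ['/api/auth/session']}
```

Returning the matched strings as evidence, rather than a bare provider name, lets a report show why a stack was detected.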

Auth

Auth adapter factory

One abstract base with four pluggable provider sign-ins (Clerk, Firebase, Supabase GoTrue, NextAuth).
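As a rough illustration of the pattern, here is one way an abstract base plus a registry-backed factory might look. Only the `AbstractAuthAdapter` name comes from the project; `sign_in`, `register`, `ADAPTERS`, `make_adapter`, and the stub adapter body are assumptions made for this sketch.

```python
from abc import ABC, abstractmethod


class AbstractAuthAdapter(ABC):
    """Shared contract: turn test-account credentials into request headers."""

    @abstractmethod
    def sign_in(self, email: str, password: str) -> dict[str, str]:
        """Return HTTP headers that authenticate later probe requests."""


# Hypothetical registry mapping a provider name to its adapter class.
ADAPTERS: dict[str, type[AbstractAuthAdapter]] = {}


def register(name: str):
    """Class decorator that adds an adapter to the registry under `name`."""
    def decorator(cls):
        ADAPTERS[name] = cls
        return cls
    return decorator


@register("supabase")
class SupabaseAdapter(AbstractAuthAdapter):
    def sign_in(self, email: str, password: str) -> dict[str, str]:
        # A real adapter would call the provider's sign-in endpoint here.
        # This stub returns a placeholder so the sketch runs offline.
        return {"Authorization": f"Bearer fake-token-for-{email}"}


def make_adapter(provider: str) -> AbstractAuthAdapter:
    """Factory: look up the provider in the registry and instantiate it."""
    try:
        return ADAPTERS[provider]()
    except KeyError:
        raise ValueError(f"no auth adapter registered for {provider!r}") from None


adapter = make_adapter("supabase")
print(adapter.sign_in("tester@example.com", "not-a-real-password"))
```

With this shape, supporting another provider means writing one subclass and decorating it. Callers keep using `make_adapter` unchanged.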

Profiles

Profile loader

Merges live discovery with optional YAML overrides into a runtime profile carrying no pre-authored secrets.
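The merge step can be pictured as a deep merge in which explicit overrides win over discovered values. The sketch below assumes the YAML has already been parsed into a plain dict. The function name `merge_profile` and the keys (`auth`, `database`, `protected_endpoints`) are hypothetical.

```python
from copy import deepcopy
from typing import Optional


def merge_profile(discovered: dict, overrides: Optional[dict] = None) -> dict:
    """Deep-merge overrides onto discovered values; overrides win on conflict."""
    merged = deepcopy(discovered)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_profile(merged[key], value)
        else:
            # Scalars and lists from the override replace the discovered value.
            merged[key] = deepcopy(value)
    return merged


discovered = {"auth": {"provider": "clerk"}, "database": {"kind": "firestore"}}
overrides = {"auth": {"protected_endpoints": ["/api/orders"]}}
print(merge_profile(discovered, overrides))
# {'auth': {'provider': 'clerk', 'protected_endpoints': ['/api/orders']}, 'database': {'kind': 'firestore'}}
```

Because the inputs are copied rather than mutated, the raw discovery result stays available for the report even after overrides are applied.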

Attacks

Attack suite

Thirteen attack-category modules — auth bypass, IDOR, RLS/storage misconfig, injection, webhook spoofing, file upload.

Reporting

Secret scrubber

Two-layer redaction strips seeded credentials, then Bearer/JWT/cookie patterns, before any report is written.
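A minimal sketch of two-layer redaction, assuming the first layer does exact-match removal of credentials the run supplied and the second layer applies generic token patterns. The `scrub` function, the pattern list, and the replacement labels are illustrative assumptions, not the project's code.

```python
import re

# Layer 2 patterns (illustrative): token shapes that are recognizable
# without knowing the secret in advance.
TOKEN_PATTERNS = [
    (re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
    (
        re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*"),
        "[REDACTED:jwt]",
    ),
    (
        re.compile(r"^((?:set-)?cookie:\s*).+$", re.IGNORECASE | re.MULTILINE),
        r"\1[REDACTED]",
    ),
]


def scrub(text: str, seeded_secrets: list[str]) -> str:
    """Redact known secrets first, then anything that still looks like a token."""
    # Layer 1: exact-match removal, longest first so a secret that contains
    # another secret is replaced whole.
    for secret in sorted(filter(None, seeded_secrets), key=len, reverse=True):
        text = text.replace(secret, "[REDACTED:seeded]")
    # Layer 2: pattern-based removal of remaining token-shaped strings.
    for pattern, replacement in TOKEN_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


raw = (
    "password=hunter2-test\n"
    "Authorization: Bearer abc.def.ghi\n"
    "Cookie: session=xyz; theme=dark\n"
)
print(scrub(raw, ["hunter2-test"]))
```

Running the exact-match layer first means a seeded credential is removed even if it matches none of the generic patterns. The pattern layer then catches tokens the target issued during the run.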

CLI

Run orchestrator

A bash driver that runs preflight, signs in, gates write probes, scans, and aggregates the reports.

Key features

Profile-driven scanning — the profile is the contract

StackBadger never hardcodes a target. Every probe reads its endpoints, tables, and provider choices from a profile — so one test suite runs against four auth providers (Clerk, Firebase, Supabase GoTrue, NextAuth), two databases, four storage backends, and three payment processors. Tests declare what stack they need with pytest markers, and the harness skips anything the active profile doesn't have. Live discovery fingerprints most of the stack automatically from the running site, and an explicit YAML profile unlocks the deeper, endpoint-specific probes. Adding a new provider is one adapter class plus a registry entry — not a rewrite of the tests.

Target URL + profile → Discovery fingerprints the stack → Auth adapter signs in → Marker-gated probes run → Skip unsupported providers → Aggregate per-stack findings
Every target detail derives from the profile — no production name is ever hardcoded in a test
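The marker-based skipping described above could be wired up with a pytest collection hook along these lines. This is a sketch under assumptions: the `requires` marker name, the `ACTIVE_PROFILE` shape, and the `unmet_requirements` helper are hypothetical, while `pytest_collection_modifyitems`, `iter_markers`, `add_marker`, and `pytest.mark.skip` are standard pytest APIs.

```python
# Sketch of a conftest.py fragment. A test would declare its needs with a
# hypothetical marker such as: @pytest.mark.requires("firebase")

ACTIVE_PROFILE = {"auth": "clerk", "database": "postgres"}  # hypothetical shape


def unmet_requirements(required: set[str], profile: dict) -> set[str]:
    """Stack components a test needs that the active profile does not provide."""
    available = {value for value in profile.values() if isinstance(value, str)}
    return required - available


def pytest_collection_modifyitems(config, items):
    """Skip every collected test whose `requires` marker names a missing component."""
    import pytest  # local import keeps the helper above usable without pytest

    for item in items:
        required: set[str] = set()
        for marker in item.iter_markers(name="requires"):
            required.update(marker.args)
        missing = unmet_requirements(required, ACTIVE_PROFILE)
        if missing:
            reason = f"profile lacks: {sorted(missing)}"
            item.add_marker(pytest.mark.skip(reason=reason))


print(unmet_requirements({"clerk", "firebase"}, ACTIVE_PROFILE))
# {'firebase'}
```

A custom marker like `requires` would normally also be registered in the pytest configuration so that pytest does not warn about an unknown mark.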

Security & ops decisions

Safe-by-default probe gating
Probe run
Read-only by default — no writes happen unless you ask
Write probes require the --full flag plus an explicit @pytest.mark.write_probe marker on the test
CONFIRM_TARGET and CONFIRM_AUTHORIZED gates, behind a doctor.py preflight check
exclude_paths / exclude_tables on by default across every probe seam
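The gating rules above amount to a conjunction: a write probe runs only when every condition holds. The sketch below assumes `CONFIRM_TARGET` and `CONFIRM_AUTHORIZED` are environment variables, that the first must equal the target URL, and that the second must be the string `yes`. Those semantics and the `write_probes_allowed` name are guesses for illustration. Only the gate names, the `--full` flag, and the `write_probe` marker come from the project.

```python
import os
from typing import Mapping, Optional


def write_probes_allowed(
    full_flag: bool,
    has_write_marker: bool,
    target_url: str,
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True only if every write-probe precondition is satisfied."""
    env = os.environ if env is None else env
    # Assumed semantics: the operator must restate the exact target URL...
    confirmed_target = env.get("CONFIRM_TARGET") == target_url
    # ...and explicitly assert that written authorization exists.
    confirmed_authorized = env.get("CONFIRM_AUTHORIZED") == "yes"
    return (
        full_flag
        and has_write_marker
        and confirmed_target
        and confirmed_authorized
    )


target = "https://staging.example.com"
gates = {"CONFIRM_TARGET": target, "CONFIRM_AUTHORIZED": "yes"}

print(write_probes_allowed(True, True, target, gates))   # True
print(write_probes_allowed(False, True, target, gates))  # False: no --full
print(write_probes_allowed(True, True, target, {}))      # False: gates unset
```

Any single missing condition returns `False`, so the read-only default cannot be overridden by one flag or one marker alone.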

Builder notes

Lessons learned

What carried forward

The extract → scrub → review release playbook, now the standard path for anything leaving a private repo.

Posts from this project

CASE STUDY

StackBadger: the tool that outlived its sprint

I built a black-box security harness to attack my own product. The product sprint ended. The tool kept earning commits — so I generalized it and published it.

JUN 2026 · 5 MIN
"A written rule is a suggestion. A gate is a control."
The operating principle behind every project here. The same bug shipped three times past written rules — and zero times past a CI gate. Deterministic enforcement beats advisory documentation, in agent harnesses and security programs alike.