Five days, 140 commits, one detective's report

In May, someone close to me was being stalked. I'm a security professional; the useful thing I had to offer was turning a chaotic pile of open-source information into something organized enough for law enforcement to act on. So I built a tool. Five days, 140 commits, one report delivered to a detective.

This post stays deliberately at the pattern level — no case details, nothing that could identify anyone. The patterns are the transferable part anyway.

Victim safety before investigation

The first thing the tool produces isn't intelligence — it's a hardening playbook for the person being stalked: locking down accounts, closing off the information the stalker is feeding on. That ordering is enforced by the pipeline's own structure — the protective phases run before any investigative one. When you're angry on someone's behalf the temptation is to hunt; the discipline is to protect first. And discipline you have to remember isn't discipline, so I made it structural.
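A minimal sketch of what "structural" can mean here, not the tool's actual code. The phase names and the `Pipeline` class are hypothetical. The idea is that the pipeline sorts phases by kind and refuses to exist without a protective phase, so nobody has to remember the ordering:

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, List


class PhaseKind(IntEnum):
    # Lower value runs first; the ordering itself encodes the policy.
    PROTECT = 0
    INVESTIGATE = 1


@dataclass(frozen=True)
class Phase:
    name: str            # hypothetical phase name, for illustration only
    kind: PhaseKind
    run: Callable[[], None]


class Pipeline:
    def __init__(self, phases: List[Phase]) -> None:
        # A pipeline with nothing protective in it is rejected outright.
        if not any(p.kind is PhaseKind.PROTECT for p in phases):
            raise ValueError("pipeline has no protective phase")
        # sorted() is stable: protective phases move to the front,
        # and the declared order is otherwise preserved.
        self._phases = sorted(phases, key=lambda p: p.kind)

    def order(self) -> List[str]:
        return [p.name for p in self._phases]

    def run(self) -> None:
        for phase in self._phases:
            # An exception in a protective phase stops every later phase.
            phase.run()
```

Even if an investigative phase is declared first, it cannot run first, and a failure while hardening stops the hunt before it starts.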

The false-confidence crisis

The first real run produced thousands of "high confidence" results that were mostly wrong. The root cause was a definition error: the confidence score described how reliable each scanning technique was, not whether a hit actually belonged to the person being investigated. Those are different claims, and conflating them would have put garbage in front of a detective with my name attached to it. (Even the first claim is weaker than it sounds: one popular username scanner has a measured false-positive rate of around a third, which is fine for a hobby lookup and disqualifying for evidence.)

That same evening I rebuilt the scoring around a distinction borrowed from the Berkeley Protocol, the standard for digital open-source investigations co-published by the UN Human Rights Office and UC Berkeley's Human Rights Center, which insists you keep the source of a claim separate from its content. In code that became two independent axes: does this account exist (confirmed / likely / uncertain), and is it actually them (strong / probable / possible / unattributed), with the second held to a far higher bar than the first. In evidence work, a confident wrong answer is worse than no answer.
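The two axes can be sketched roughly like this. The level names come from the description above; everything else is an assumption made for illustration, including the `Finding` type, the function names, and the corroboration thresholds. The point it shows is that a scanner hit can only ever speak to existence, while attribution starts at "unattributed" and moves only on separate evidence:

```python
from dataclasses import dataclass, replace
from enum import IntEnum


class Existence(IntEnum):
    UNCERTAIN = 0
    LIKELY = 1
    CONFIRMED = 2


class Attribution(IntEnum):
    UNATTRIBUTED = 0
    POSSIBLE = 1
    PROBABLE = 2
    STRONG = 3


@dataclass(frozen=True)
class Finding:
    url: str
    existence: Existence
    # A new finding makes no claim about whose account it is.
    attribution: Attribution = Attribution.UNATTRIBUTED


def from_scanner_hit(url: str, profile_verified: bool) -> Finding:
    """A scanner result only informs the existence axis."""
    existence = Existence.CONFIRMED if profile_verified else Existence.LIKELY
    return Finding(url=url, existence=existence)


def attribute(finding: Finding, corroborations: int) -> Finding:
    """Raise attribution from independent corroborating signals.

    The thresholds are illustrative assumptions, not the tool's real rules.
    """
    if corroborations >= 3:
        level = Attribution.STRONG
    elif corroborations == 2:
        level = Attribution.PROBABLE
    elif corroborations == 1:
        level = Attribution.POSSIBLE
    else:
        level = Attribution.UNATTRIBUTED
    # Existence is carried over unchanged: the axes never feed each other.
    return replace(finding, attribution=level)
```

Because the type keeps the two values in separate fields, "confirmed to exist" can never be read as "confirmed to be them" by accident.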

Restraint as architecture

The decisions that mattered most were about what the tool refuses to do — and they're encoded, not merely intended:

It never scrapes platforms that require a login. Instead it generates subpoena guidance: what a detective should request from each platform, and through which legal process. "Tell the detective what to ask for" replaced "go get it."
Dangerous actions are off by default and auditable when used. Anything that could tip off the subject — a probe that might fire a password-reset email, say — takes two separate switches to enable, and the reason is recorded either way.
Everything runs locally. No cloud, no third party holding a victim's data — and where a tool would have shipped data to an outside service, the wrapper refuses to run rather than quietly stripping the offending setting. A build check even greps the code for any network call not on an explicit allowlist, so a stray new egress surfaces in the diff.
Chain of custody as code. Every tool run writes an append-only manifest — tool and version, the exact command, a SHA-256 of the output, a UTC timestamp — and the finished artifacts are made read-only at the OS level. The standard comes straight from forensic practice: if you didn't write it down, it didn't happen.
The scoring engine has negative controls, because without them it convicts the innocent. Coincidental name and username matches are pushed down with negative weights, and a hard cap keeps a cluster built only on coincidence from ever reading as more than "possible."
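Of the items above, the chain-of-custody manifest is the easiest to show in a few lines. What follows is a sketch under assumptions, not the tool's implementation: the function name, the JSON Lines layout, and the field names are all hypothetical, and the caller is assumed to supply the tool's version string. It records the fields listed above (tool and version, exact command, SHA-256 of the output, UTC timestamp) and then drops write permission on the artifact:

```python
import hashlib
import json
import stat
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def run_and_record(tool: str, version: str, argv: List[str],
                   out_path: Path, manifest: Path) -> dict:
    """Run one tool, save its stdout, and append a custody record."""
    # check=True raises if the tool fails, so no record is written for a bad run.
    result = subprocess.run(argv, capture_output=True, check=True)
    out_path.write_bytes(result.stdout)

    entry = {
        "tool": tool,
        "version": version,
        "command": argv,  # the exact argument vector that was executed
        "output": out_path.name,
        "sha256": hashlib.sha256(result.stdout).hexdigest(),
        "utc": datetime.now(timezone.utc).isoformat(),
    }

    # Mode "a" only ever adds a line; earlier records are never rewritten.
    with manifest.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")

    # Remove write permission so the finished artifact is read-only.
    out_path.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return entry
```

A call such as `run_and_record("demo", "0.0", [sys.executable, "-c", "print('hello')"], Path("out.txt"), Path("manifest.jsonl"))` writes the output file and adds one line to the manifest. Opening the manifest in append mode is append-only by convention, not by enforcement; stronger guarantees would need filesystem-level support, which this sketch does not attempt.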

Why it stays private

The tool worked. The report was delivered. And the repository stays private — it holds a real case, and a generalized public version would mostly be a gift to the kind of person it was built to investigate. Knowing when not to ship is a product decision too.

The lesson I keep from the fastest build of my life: speed was never the constraint. Judgment was — about what "confidence" is allowed to mean, about what not to automate, about who the tool is really for. Software that touches a real person's safety should be the most opinionated software you build.