SECURITY

1 of 16: auditing my own guardrails

JUNE 11, 2026 · 5 MIN · CYBERREADYAI
TL;DR

Most controls are added after the incident they would have prevented — and most of them are theater. Audit your own guardrails as ruthlessly as you'd audit a client's.

16 guardrails, run through a red-team for evidence they ever worked.

Fifteen years of security consulting means I've audited a lot of other people's controls. This spring I turned the same lens on my own — the safety guardrails I'd built to stop AI coding agents from damaging my codebase. The results were humbling.

The setup

When you run AI agents against a real codebase all day, they occasionally do destructive things. The one that got my attention: a `git worktree remove --force` run against a worktree that a background agent was still using. It zeroed out a file and erased about 1,570 lines of work. So I started adding guardrails: PreToolUse hooks, checks that intercept a dangerous command before it runs. By the end I had roughly sixteen.
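For readers who haven't written one, here is a minimal sketch of what a blocking hook of this kind might look like. It is not the hook from my settings file. The deny-list, the function names, the JSON shape (`tool_input.command`), and the use of exit code 2 as the "refuse" signal are all assumptions about the harness, so check your own tool's hook contract before borrowing any of it.

```python
import json
import shlex
import sys

# Hypothetical deny-list: (subcommand tokens after "git", flags that make it
# destructive). None means the subcommand is blocked outright.
BLOCKED_GIT = [
    (("worktree", "remove"), {"--force", "-f"}),
    (("stash",), None),
]
SHELL_OPERATORS = {"&&", "||", ";", "|"}


def should_block(command: str) -> bool:
    """Return True when a shell command matches a deny-listed git pattern."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable (e.g. unbalanced quotes): fail closed
    for i, token in enumerate(tokens):
        if token != "git":
            continue
        # Collect the arguments of this git invocation, stopping at the
        # next shell operator so flags of later commands are not counted.
        args = []
        for arg in tokens[i + 1:]:
            if arg in SHELL_OPERATORS:
                break
            args.append(arg)
        for sub, flags in BLOCKED_GIT:
            if tuple(args[:len(sub)]) != sub:
                continue
            if flags is None or flags & set(args):
                return True
    return False


def run_hook(stdin_text: str) -> int:
    """Decide on one tool call; the return value is the intended exit code."""
    payload = json.loads(stdin_text)
    command = payload.get("tool_input", {}).get("command", "")
    if should_block(command):
        print(f"blocked destructive command: {command}", file=sys.stderr)
        return 2  # assumed "refuse this call" signal; harness-specific
    return 0
```

Wired up as a script, the entry point would be something like `sys.exit(run_hook(sys.stdin.read()))`. The matching is deliberately naive: it misses forms such as `git -C path stash` or an operator glued to a token (`git stash;`), which is exactly the sort of gap an attack on day one is meant to expose.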

The audit

When I shelved the product, I audited the hooks the way I'd audit a client's control environment — not "do these exist?" but "is there any evidence one of them ever worked?"

~16 agent guardrails, audited for evidence of working

  1 · KEPT: hard block on `git stash`; zero recurrences after
  3 · BRITTLE: blocked legitimate work; needed escape-hatch patches
  ~9 · THEATER: warn-only; no commit evidence any caught a defect

1 of 16: Only the hook that blocked a specific destructive command earned its keep. The rest either warned and got ignored, or misfired and got disabled.

The settings file holding these hooks was the most-edited file in the entire project — about thirty changes, more than half of them fixes to the hooks themselves, including fixes for false alarms the hooks were raising. A control that needs constant maintenance to stop attacking your own work is not yet a control.

Tombstone controls

The pattern underneath was the uncomfortable part: nearly every hook was added immediately after the incident it would have prevented. Each one is a tombstone — a marker of where something already died. That's not defense in depth; it's grief with a config file.

Enterprises do exactly this. The post-breach security stack is full of tools bought the week after an incident, each aimed at yesterday's attack, most never tuned, many generating alerts nobody reads. My sixteen hooks were a one-person version of the same ritual.

The rubric that survived

The audit ended with a bar for any future guardrail — and by that bar I should have built three, not sixteen:

  1. It blocks; it doesn't warn. A warning is a suggestion, and suggestions don't stop incidents.
  2. It gets attacked on day one. If I can't watch it catch the bad thing before I rely on it, it isn't a control yet.
  3. It survives without false-positive patches. A guardrail that keeps blocking real work will get an exception carved into it — and exceptions are how controls quietly stop existing.
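The three criteria above can be turned into a check you run before trusting a guardrail. The sketch below is one way to do it, under assumptions: a guard is modeled as a function that returns True when it blocks, and the three example guards and the command lists are hypothetical stand-ins, not my actual hooks.

```python
from typing import Callable, Iterable, List, NamedTuple


class Verdict(NamedTuple):
    missed: List[str]           # destructive commands the guard let through
    false_positives: List[str]  # legitimate commands the guard blocked

    @property
    def is_control(self) -> bool:
        """True only if the guard blocked every attack and no real work."""
        return not self.missed and not self.false_positives


def attack(guard: Callable[[str], bool],
           must_block: Iterable[str],
           must_allow: Iterable[str]) -> Verdict:
    """Throw known-bad and known-good commands at a guard on day one."""
    missed = [cmd for cmd in must_block if not guard(cmd)]
    false_positives = [cmd for cmd in must_allow if guard(cmd)]
    return Verdict(missed, false_positives)


# Three hypothetical guards, one per audit outcome.
def blocking_guard(command: str) -> bool:
    return command.split()[:2] == ["git", "stash"]


def warn_only_guard(command: str) -> bool:
    if command.split()[:2] == ["git", "stash"]:
        print("warning: git stash is discouraged")
    return False  # warns but never blocks


def overbroad_guard(command: str) -> bool:
    return command.startswith("git")  # blocks legitimate git work too


MUST_BLOCK = ["git stash", "git stash -u"]
MUST_ALLOW = ["git status", "git log --oneline"]

for guard in (blocking_guard, warn_only_guard, overbroad_guard):
    verdict = attack(guard, MUST_BLOCK, MUST_ALLOW)
    print(guard.__name__, "is a control:", verdict.is_control)
```

A warn-only guard fails criterion 1 by showing up in `missed`, and an overbroad guard fails criterion 3 by showing up in `false_positives`. Running the check before relying on the guard is criterion 2.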

The lesson isn't "build fewer guardrails." It's that a control without evidence is just a belief — and beliefs don't stop incidents, in an agent harness or an enterprise security program.

"A written rule is a suggestion. A gate is a control."
The operating principle behind every project here. The same bug shipped three times past written rules — and zero times past a CI gate. Deterministic enforcement beats advisory documentation, in agent harnesses and security programs alike.