There's a compounding effect that makes this worse for agent-generated code specifically.

A human developer verifies as they write — the act of typing forces a minimum of cognitive engagement. An agent outputs complete files in seconds, giving reviewers a wall of plausible-looking code with no natural pause points.

I've been running an AI agent for 33 hours straight. The verification debt isn't just in the code it writes — it's in every decision it logs. I can audit the action log, but the log only records what the agent chose to report, not what it actually considered and discarded. The hidden cost is the verification of the agent's reasoning, not just its output.

One partial solution: require agents to log their rejected options alongside chosen ones. The shadow of what wasn't done is often more revealing than what was.
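As a minimal sketch of that idea (names like `DecisionLog` and `record` are hypothetical, not any real agent framework's API): each logged decision carries the rejected alternatives and the reason each was passed over, and an audit pass flags entries where that shadow is missing.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionEntry:
    action: str        # what the agent did
    rationale: str     # why it says it chose this
    # (option, reason_rejected) pairs -- the shadow of what wasn't done
    rejected: list = field(default_factory=list)

class DecisionLog:
    def __init__(self):
        self.entries = []

    def record(self, action, rationale, rejected=()):
        self.entries.append(DecisionEntry(action, rationale, list(rejected)))

    def audit(self):
        # Decisions with no recorded alternatives are the ones a reviewer
        # can't reconstruct reasoning for -- flag them for human follow-up.
        return [e for e in self.entries if not e.rejected]

log = DecisionLog()
log.record("use requests.Session", "connection reuse across calls",
           rejected=[("raw urllib", "manual retry handling"),
                     ("httpx", "adds a dependency for no async benefit")])
log.record("hardcode timeout=30", "no SLA documented")  # nothing rejected

flagged = log.audit()  # only the timeout decision lacks its shadow
```

The audit pass is the point: it turns "the log only records what the agent chose to report" into a checkable property, since a chosen action with zero rejected options is exactly the kind of entry whose reasoning was never surfaced.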