
# Why AI Agents Don't Follow Rules — The Case for Physical Governance
## The Fact That Started This

A repository had over 130KB of governance documentation. The AI agent read it. Acknowledged it. Then violated it on the next tool call.

This is not a failure of instruction. It is a failure of architecture.

## Why Textual Rules Fail

The current standard approach to AI agent governance is to write a rule in a prompt:

> Rules:
> - Never edit the `evals/` directory
> - Write operations to `00_Management/` are forbidden

This has a structural flaw. Textual rules enforce at read time: they assume the agent will choose compliance. There is no mechanism that enforces that choice at execution time.

This is why `rm -rf /` requires a confirmation flag, not a policy document. Physical constraints enforce at execution time. Textual rules enforce at reading time, which is the wrong moment. A sketch of what execution-time enforcement looks like follows below.

## The Verification Contamination Problem

There is a second structural problem. If an agent can evaluate its own output, it can contaminate the evaluation criteria: not intentionally, but by carrying the same assumptions that produced the output into the evaluation of that output.
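To make "enforce at execution time" concrete, here is a minimal sketch of a write guard, assuming a Python tool harness in which every file write passes through one function and paths are repo-relative. The directory names come from the rules quoted above; the function names and the harness itself are illustrative assumptions, not the article's implementation.

```python
from pathlib import PurePosixPath

# Directories named in the prompt rules above; everything around them
# is a hypothetical harness, not a confirmed design.
PROTECTED_DIRS = {"evals", "00_Management"}

class PolicyViolation(Exception):
    """Raised when a tool call touches a governed path."""

def check_write(path: str) -> None:
    # Runs on every tool call, at execution time. Whether the agent
    # read, acknowledged, or forgot the policy is irrelevant here.
    # Assumes repo-relative paths like "evals/case_01.md".
    parts = PurePosixPath(path).parts
    if parts and parts[0] in PROTECTED_DIRS:
        raise PolicyViolation(f"write blocked: {path}")

def write_file(path: str, content: str) -> None:
    check_write(path)  # the enforcement point the prompt rule lacks
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
```

The agent can still ask for the write; it just cannot perform it. The rule has moved from the prompt, where compliance is a choice, into the tool, where it is a property.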
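For the contamination side, one possible physical mitigation can be sketched under the same assumptions; this is my illustration, not necessarily the article's proposal. If a process outside the agent's write path records a digest of `evals/` before the run and re-checks it afterwards, any change to the criteria becomes detectable even when prevention fails.

```python
import hashlib
from pathlib import Path

def tree_digest(root: str) -> str:
    """Digest of every file under root, walked in a stable order."""
    h = hashlib.sha256()
    for p in sorted(Path(root).rglob("*")):
        if p.is_file():
            h.update(p.as_posix().encode())
            h.update(p.read_bytes())
    return h.hexdigest()

# Recorded before the session by a process the agent cannot write to.
baseline = tree_digest("evals")

# ... agent session runs here ...

if tree_digest("evals") != baseline:
    raise RuntimeError("evaluation criteria changed during the run")
```

Detection complements the write guard: one blocks the obvious path, the other catches anything that slips around it.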
