
I Built a Security Flywheel for AI Agents in 14 Days.
Two weeks ago I had a security scanner with rules and no production data. Today I have a scanner, an observatory crawling 42,655 skills across 7 registries, an MCP server exposing the engine to AI agents, and 4 rounds of false positive reduction that made the whole system sharper. Each piece exists because the previous one needed it. That is the interesting part. The problem: rules without data I was building Aguara , an open-source security scanner for AI agent skills and MCP server configurations. 148 detection rules. 15 threat categories. Every rule ships with examples.true_positive and examples.false_positive . Tests pass. CI is green. But test data behaves like test data. Real-world content does not. A rule that catches ignore all previous instructions works perfectly against curated examples. Run it against 42,000 skill files and you discover that legitimate documentation, changelogs, and migration guides contain the same phrases. The rule is correct. The false positive rate at s
Continue reading on Dev.to
Opens in a new tab


