
We Tested Agentic AI Against 525 Real Attacks. Here's What We Found.
We ran the numbers. The threat is real.

For the past several months, we've been building and validating Cerberus — an open-source runtime security harness for agentic AI systems. We designed it around a specific threat model we call the Lethal Trifecta: the simultaneous convergence, within a single AI execution turn, of privileged data access, untrusted content injection, and an outbound exfiltration path.

We just finished our first formal validation run: N=525 attack trials across three major AI providers. Here is what the data shows.

Attack Success Rates (full injection compliance — agent fully redirected to attacker's address):

• GPT-4o-mini: 90.3% [95% CI: 84.8%–93.9%] — Causation Score: 0.811
• Gemini 2.5 Flash: 82.4% [95% CI: 75.9%–87.5%] — Causation Score: 0.702
• Claude Sonnet: 6.7% [95% CI: 3.8%–11.5%] — Causation Score: 0.207

Control group: 0/30 exfiltrations across all providers (clean baseline). Fisher's e
Continue reading on Dev.to



