
My 11-Agent AI Swarm Was Secretly Hallucinating. My Own Monitoring Tool Caught It.
I built an 11-agent swarm to write Reddit outreach for my product. It ran for weeks. It was hallucinating usernames the entire time — and I didn't notice until I ran a session diff comparing it to a 3-agent rewrite. This is what the diff showed me, and why I think most multi-agent systems have the same problem.

The Setup

The old system — call it V1 — was an 11-agent blackboard architecture:

COMMANDER → SCOUT → CONVERSION_ANALYST → COPYWRITER → METACOG → EXECUTE → EVIDENCE_CHECK → SENTINEL → EXECUTIONER → VALIDATOR → ARBITER

Each agent read the shared blackboard, added its output, and passed it forward. Architecturally, it looked impressive. In practice, each agent was doing one of two things:

- Restating what the previous agent said
- Inventing context that wasn't there

The worst part: it had explicit directives saying never fabricate Reddit usernames. The directives were right there in the prompt, importance score 9.9 out of 10. It hallucinated usernames anyway. Every cycle.

What the Ses
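The blackboard pattern above can be sketched in a few lines. This is a hypothetical minimal version, not the author's actual system: agent names, the shared-dict blackboard, and the regex-based username check are all illustrative assumptions. It shows how a downstream check can catch a fabricated username that an upstream agent invented.

```python
import re

# Each "agent" reads the shared blackboard and writes its output back.
def scout(board):
    # Pretend this agent gathered real thread context.
    board["context"] = "u/real_user asked about pricing in r/SaaS"
    return board

def copywriter(board):
    # A flawed agent: it invents a username not present in the context.
    board["draft"] = "Reply to u/real_user and u/invented_user about pricing"
    return board

def evidence_check(board):
    # Flag any u/<name> in the draft that never appeared upstream.
    drafted = set(re.findall(r"u/\w+", board["draft"]))
    sourced = set(re.findall(r"u/\w+", board["context"]))
    board["fabricated"] = sorted(drafted - sourced)
    return board

board = {}
for agent in (scout, copywriter, evidence_check):
    board = agent(board)

print(board["fabricated"])  # → ['u/invented_user']
```

The point of the sketch: a prompt directive ("never fabricate usernames") lives in the model's context and can be ignored, while a programmatic check like `evidence_check` runs outside the model and cannot.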