Anarchy, Assembly Lines, and Corporate Hierarchy: Benchmarking Multi-Agent Architectures for Medical Device Data

via Dev.to · Martin Nanchev · 2h ago

My AI judge gave the anarchists a perfect score. I disagree.

I built three multi-agent systems to analyze data from my insulin pump, a Medtronic MiniMed 780G, and had an LLM evaluate their output. The cheapest, fastest architecture scored identically to the most expensive one. But when I read the actual reports, the cheap one guessed where the expensive one calculated. The evaluator didn't care. That tension between automated scores and human judgment turned out to be the most interesting finding of this experiment. But let's start from the beginning.

A Fair Fight This Time

In my previous blog post, I compared a swarm architecture with a graph pipeline for analyzing CareLink CSV exports. The problem? I used different models for each, which made the comparison unfair. This time, every agent runs on the same model: Haiku 4.5 via AWS Bedrock. Same prompts, same tools, same data. The only variable is the orchestration pattern. A LinkedIn commenter also suggested trying prompt cachin

