
We analyzed 10,000 voice AI calls. The LLM was rarely the problem.
We built Dograh OSS, an open-source voice AI platform. When we started, we assumed most failures would come from the LLM: bad answers, missed intent, prompt edge cases. So we spent a lot of early effort there.

Then we looked at the data. We ran automated QA in which an LLM reviews every turn of every call and tags what went right and wrong, and we spent hours listening to calls ourselves. Across roughly 10,000 calls spanning customer support, appointment booking, and lead qualification, the failure picture looked nothing like what we expected. The problems that showed up again and again were about the phone call as a medium: timing, audio physics, and infrastructure designed decades before LLMs existed.

Here is what we found, roughly ranked by frequency:

| Failure area | Share | Primary driver |
| --- | --- | --- |
| STT / word error rate | ~38% | Low-quality telephony audio and accent variation |
| First-8-second chaos | ~34% | Greeting latency, barge-in, variable user behavior |
| Interruption handling | ~28% | Filler words breaking |
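The automated QA pass described above can be sketched roughly like this: each turn of a call transcript is sent to a judge LLM along with the preceding context and a fixed tag taxonomy, and the tags are aggregated per call. This is a minimal sketch, not Dograh's actual pipeline; `call_llm` is a hypothetical stand-in for whatever model client you use, and the tag names are illustrative.

```python
from dataclasses import dataclass

# Illustrative tag taxonomy -- not Dograh's actual labels.
TAGS = ["stt_error", "greeting_latency", "barge_in", "interruption", "ok"]

@dataclass
class Turn:
    speaker: str  # "agent" or "user"
    text: str

def call_llm(prompt: str) -> str:
    """Hypothetical model client; swap in your provider's SDK.
    Stubbed here so the sketch runs end to end."""
    return "ok"

def tag_turn(turn: Turn, context: list[Turn]) -> str:
    """Ask the judge LLM to label one turn with a single failure tag."""
    transcript = "\n".join(f"{t.speaker}: {t.text}" for t in context)
    prompt = (
        f"Conversation so far:\n{transcript}\n\n"
        f"Current turn ({turn.speaker}): {turn.text}\n"
        f"Label this turn with exactly one of: {', '.join(TAGS)}"
    )
    label = call_llm(prompt).strip()
    return label if label in TAGS else "ok"  # fall back on unparseable output

def tag_call(turns: list[Turn]) -> dict[str, int]:
    """Tag every turn and return tag frequencies for the whole call."""
    counts: dict[str, int] = {}
    for i, turn in enumerate(turns):
        label = tag_turn(turn, turns[:i])
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Run over thousands of calls, the per-call tag counts roll up into the frequency ranking shown in the table: the restriction to a closed tag set is what makes the LLM's judgments aggregatable.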



