
The Cost of Invisible Agents: What $0.47 Per Query Looks Like at Scale
Last month I got a message from a developer running a research agent in production. His APM dashboard looked fine. HTTP 200s across the board. P99 latency under 2 seconds. Error rate at 0.1%. By every traditional metric, the system was healthy.

Then finance flagged an anomaly. The LLM API bill for one internal tool had hit $14,000 in a single month. The agent was burning $0.47 per query. At roughly 1,000 queries per day, that added up to $470/day before anyone with engineering access noticed.

The APM dashboard never flinched because, from its perspective, nothing was wrong. Every request succeeded. Every response came back. This is the cost visibility gap in agent infrastructure, and it is wider than most teams realize.

The Math That APM Cannot Do

To understand how $0.47 per query happens, you have to understand how agents consume tokens. It is not one LLM call per request. A research agent doing its job might follow this pattern:

Initial reasoning -- the model reads the user query, sy



