LLM Cost Monitoring with OpenTelemetry


via Dev.to, by Alexandr Bandurchin

Teams running LLM applications in production face a cost problem that traditional APM tools were never designed to solve. CPU and memory costs are relatively predictable: a web service processing 1,000 requests per second costs roughly the same week over week. LLM API costs are not. A single user session can cost $0.01 or $5 depending on prompt length, model choice, conversation history, and how many retries happen inside your chain. Without instrumentation, cost anomalies are invisible until the monthly invoice arrives.

The standard pattern: a team launches a feature using GPT-5, everything looks fine in staging, and then production traffic reveals that a small percentage of requests trigger long multi-turn conversations that cost 50× more than the average. By the time the bill arrives, the cost has already happened.

OpenTelemetry's GenAI semantic conventions solve this at the instrumentation layer. The gen_ai.usage.input_tokens and gen_ai.usage.output_tokens attributes are captured automatically.
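To make the idea concrete, here is a minimal sketch of how per-request cost can be derived from the token counts that the GenAI semantic conventions expose as span attributes. The `gen_ai.*` attribute names are from the OpenTelemetry conventions; the pricing table, the `llm_span_attributes` helper, and the `llm.cost_usd` custom attribute are hypothetical illustrations, and real per-token rates vary by provider and model.

```python
# Hypothetical USD prices per million tokens -- NOT real provider rates.
PRICING_PER_MTOKEN = {
    "gpt-5": {"input": 1.25, "output": 10.00},
}

def llm_span_attributes(model: str, input_tokens: int, output_tokens: int) -> dict:
    """Build the attribute set an LLM span would carry, plus a derived cost.

    The gen_ai.* keys follow the OpenTelemetry GenAI semantic conventions;
    llm.cost_usd is a custom attribute (cost is not part of the conventions).
    """
    rates = PRICING_PER_MTOKEN[model]
    cost_usd = (
        input_tokens * rates["input"] + output_tokens * rates["output"]
    ) / 1_000_000
    return {
        "gen_ai.request.model": model,                # semconv attribute
        "gen_ai.usage.input_tokens": input_tokens,    # semconv attribute
        "gen_ai.usage.output_tokens": output_tokens,  # semconv attribute
        "llm.cost_usd": round(cost_usd, 6),           # custom attribute
    }
```

In a real service these attributes would be set on a span via the OpenTelemetry SDK (or captured by an auto-instrumentation library), so cost can be aggregated, alerted on, and broken down per feature or per user in your tracing backend rather than discovered on the invoice.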

Continue reading on Dev.to
