LLM Cost Monitoring with OpenTelemetry


via Dev.to, by Alexandr Bandurchin

Teams running LLM applications in production face a cost problem that traditional APM tools were never designed to solve. CPU and memory costs are relatively predictable: a web service processing 1,000 requests per second costs roughly the same week over week. LLM API costs are not. A single user session can cost $0.01 or $5 depending on prompt length, model choice, conversation history, and how many retries happen inside your chain. Without instrumentation, cost anomalies are invisible until the monthly invoice arrives.

The standard pattern: a team launches a feature using GPT-5, everything looks fine in staging, and then production traffic reveals that a small percentage of requests trigger long multi-turn conversations that cost 50× more than the average. By the time the bill arrives, the cost has already happened.

OpenTelemetry's GenAI semantic conventions solve this at the instrumentation layer. The gen_ai.usage.input_tokens and gen_ai.usage.output_tokens attributes are captured automatically.
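To make the idea concrete, here is a minimal sketch of how per-request cost can be derived from the token counts that the GenAI semantic conventions expose as span attributes. The `gen_ai.*` attribute names are from the OpenTelemetry conventions; the pricing table, the `llm_span_attributes` helper, and the `llm.cost_usd` custom attribute are hypothetical illustrations, and real per-token rates vary by provider and model.

```python
# Hypothetical USD prices per million tokens -- NOT real provider rates.
PRICING_PER_MTOKEN = {
    "gpt-5": {"input": 1.25, "output": 10.00},
}

def llm_span_attributes(model: str, input_tokens: int, output_tokens: int) -> dict:
    """Build the attribute set an LLM span would carry, plus a derived cost.

    The gen_ai.* keys follow the OpenTelemetry GenAI semantic conventions;
    llm.cost_usd is a custom attribute (cost is not part of the conventions).
    """
    rates = PRICING_PER_MTOKEN[model]
    cost_usd = (
        input_tokens * rates["input"] + output_tokens * rates["output"]
    ) / 1_000_000
    return {
        "gen_ai.request.model": model,                # semconv attribute
        "gen_ai.usage.input_tokens": input_tokens,    # semconv attribute
        "gen_ai.usage.output_tokens": output_tokens,  # semconv attribute
        "llm.cost_usd": round(cost_usd, 6),           # custom attribute
    }
```

In a real service these attributes would be set on a span via the OpenTelemetry SDK (or captured by an auto-instrumentation library), so cost can be aggregated, alerted on, and broken down per feature or per user in your tracing backend rather than discovered on the invoice.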

Continue reading on Dev.to
