
I built a tool that analyzes OpenTelemetry traces and tells you what's wrong
I kept staring at Jaeger trace waterfalls trying to figure out why a request was slow. 20 spans across 5 services: which one is the problem? Is it the database? A downstream timeout? An N+1 query hiding in plain sight?

So I built TraceLens. You paste an OpenTelemetry trace, and it tells you:

- Root cause with confidence score
- Bottlenecks ranked by impact (duration + percentage)
- Fix recommendations with actual code examples

Try it now 👉 https://tracelens.dev

No signup. No API key. Click "Load sample trace" to see it analyze a real-world database constraint violation.

How it works

The naive approach would be to dump the entire OTLP JSON into an LLM and ask "what's wrong?", but traces are verbose. A 10-span trace can be 15,000+ tokens of JSON. TraceLens does three things before hitting the LLM:

Parse and build a span tree

Raw OTLP JSON is a flat array of spans. TraceLens reconstructs the parent-child tree, computes the critical path (longest chain of sequential spans), and detects orphan
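
To make the parse-and-build step concrete, here is a minimal Python sketch. It is not TraceLens's actual code. It assumes the standard OTLP/JSON layout (resourceSpans, then scopeSpans, then spans, with spanId, parentSpanId, startTimeUnixNano and endTimeUnixNano fields). It treats a span as an orphan when its parent ID points at a span missing from the trace, and it approximates the critical path by following, from the root, the child that finishes last. The function names and the sample trace are hypothetical.

```python
from collections import defaultdict


def flatten_spans(otlp: dict) -> list[dict]:
    """Collect every span from an OTLP/JSON export into one flat list."""
    spans = []
    for resource_spans in otlp.get("resourceSpans", []):
        for scope_spans in resource_spans.get("scopeSpans", []):
            spans.extend(scope_spans.get("spans", []))
    return spans


def build_tree(spans: list[dict]):
    """Return (roots, orphans, children); children maps spanId -> child spans."""
    by_id = {s["spanId"]: s for s in spans}
    children = defaultdict(list)
    roots, orphans = [], []
    for s in spans:
        parent_id = s.get("parentSpanId") or ""
        if not parent_id:
            roots.append(s)                 # no parent: a root span
        elif parent_id in by_id:
            children[parent_id].append(s)   # parent is present in this trace
        else:
            orphans.append(s)               # parent ID refers to a missing span
    return roots, orphans, children


def end_ns(span: dict) -> int:
    """OTLP/JSON encodes nanosecond timestamps as strings."""
    return int(span["endTimeUnixNano"])


def critical_path(root: dict, children) -> list[dict]:
    """Assumed heuristic: at each level, follow the child that ends last."""
    path, node = [root], root
    while children.get(node["spanId"]):
        node = max(children[node["spanId"]], key=end_ns)
        path.append(node)
    return path


def span(span_id, parent_id, name, start_ms, end_ms):
    """Helper for the made-up sample trace below."""
    return {
        "spanId": span_id,
        "parentSpanId": parent_id,
        "name": name,
        "startTimeUnixNano": str(start_ms * 1_000_000),
        "endTimeUnixNano": str(end_ms * 1_000_000),
    }


# Made-up sample trace, for illustration only.
sample = {"resourceSpans": [{"scopeSpans": [{"spans": [
    span("a1", "", "GET /orders", 0, 100),
    span("b2", "a1", "auth", 0, 10),
    span("c3", "a1", "db.query", 10, 95),
    span("d4", "zz", "cache.get", 5, 8),   # parent "zz" is not in the trace
]}]}]}

roots, orphans, children = build_tree(flatten_spans(sample))
path = critical_path(roots[0], children)
print(" -> ".join(s["name"] for s in path))   # GET /orders -> db.query
print([s["name"] for s in orphans])           # ['cache.get']
```

Following the last-finishing child is only one reading of "longest chain of sequential spans". A fuller version would also account for siblings that run one after another rather than in parallel.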
Continue reading on Dev.to DevOps


