
Tracing a RAG Chain End-to-End: Where OpenTelemetry Stops and Where You Need to Instrument Yourself
There are already plenty of "Getting started with OpenTelemetry" tutorials. This is not one of them.

This article starts with a candid observation: if you have OTel running in your infrastructure and you've just added a RAG pipeline to production, your traces look impressive but they're mostly lying to you by omission. You have spans as latency numbers. What you don't have is visibility into the five stages that actually determine whether your system is working correctly.

OTel wasn't designed for RAG. It was designed for distributed systems built around HTTP, databases, and message queues: all well-understood primitives with established semantic conventions. A RAG pipeline adds several new primitives that have no standard OTel semantics yet. The OpenTelemetry GenAI SIG is working on it, but slowly. In the meantime, production systems are running blind.

The goal here is to be precise about where the boundary is and how to cross it.

What a RAG chain actually traverses

A minimal RAG pipel
Continue reading on Dev.to




