
Cosine Similarity Failed Our RAG on Exact Terms — BM25 Fixed It
Our RAG Couldn't Find Its Own Documentation — Here's the Fix

I built a local AI pipeline on top of Ollama. It has a knowledge base of markdown documents — session notes, architectural decisions, build logs. The idea was that the model could answer questions about its own project history, using those documents as ground truth instead of hallucinating from parametric memory. It failed in a very specific way.

The Failure

I ran this query through the pipeline:

"Why was nomic-embed-text chosen over mxbai-embed-large for the RAG embedding upgrade?"

The answer exists verbatim in a session document:

| Decision | Rationale |
| --- | --- |
| nomic-embed-text over mxbai-embed-large | Available via Ollama, retrieval-trained, 768d, clean upgrade path |

The cosine retrieval returned this:

```
[knowledge/cosine] 'unrelated-project-context' score=0.5694 [HIT]  ← wrong doc
[knowledge/cosine] 'build-session'             score=0.5634 [HIT]  ← right file, wrong chunk
[memory/cosine]    'old-session-notes'         score=0.6018        ← wrong doc entirely
```

The model received the wro
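Why does lexical scoring succeed where the embeddings failed? BM25 rewards exact, rare terms like `nomic-embed-text`, which cosine similarity over dense vectors can blur into generic "embedding model" semantics. The sketch below is a minimal BM25 implementation over toy chunks standing in for the knowledge base (the documents, tokenizer, and parameters are illustrative, not the pipeline's actual data):

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document against the query."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    # document frequency: how many docs contain each term
    df = Counter()
    for d in docs_tokens:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        dl = len(d)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # rare terms (low df) get a high idf weight
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            # term-frequency saturation with length normalization
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * dl / avgdl)
            )
        scores.append(score)
    return scores

# Hypothetical stand-ins for the three retrieved chunks
docs = [
    "general project context with no embedding decisions",
    "chose nomic-embed-text over mxbai-embed-large : retrieval-trained 768d clean upgrade path",
    "old session notes about build tooling",
]
query = "why was nomic-embed-text chosen over mxbai-embed-large"
tokenized = [d.lower().split() for d in docs]
scores = bm25_scores(query.lower().split(), tokenized)
best = max(range(len(docs)), key=scores.__getitem__)
print(best)  # → 1: the chunk with the exact model names wins
```

Because `nomic-embed-text` and `mxbai-embed-large` appear in exactly one chunk, their idf is high and that chunk dominates the ranking — the behavior the cosine retriever failed to deliver.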
Continue reading on Dev.to.


