
I was paying $200/month in wasted AI tokens. So I built a Rust context optimizer.
My Cursor bill last month: $340. I dug into the API logs. Over 60% of the tokens being sent to the LLM were:

- Boilerplate I'd copied from Stack Overflow three years ago
- The same database helper function, four slightly different times
- An entire test file that had nothing to do with what I was asking

My AI tool was optimizing for similarity -- and similarity is not the same as information.

## The problem with every AI coding tool

Cursor, Copilot, Claude Code, Cody -- they all select context the same way:

1. Embed your query
2. Find the top-K similar chunks
3. Stuff them into the context window until it's full
4. Cut everything else

The result? Query: "How does payment processing work?"

What your AI actually sees:

```
auth.py        (similarity: 0.94)  <- useful
auth_test.py   (similarity: 0.91)  <- copies auth logic
auth_utils.py  (similarity: 0.89)  <- more auth copies
auth_v2.py     (similarity: 0.87)  <- even more auth
...
payments.py    (similarity: 0.41)  <- NEVER LOADED, cut by budget
```

Your AI is answering questions about payment processing without ever seeing payments.py.
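The greedy selection loop those tools run can be sketched in a few lines of Rust. This is a minimal illustration of the top-K-then-cut behavior described above, not any tool's actual implementation; the `Chunk` struct, `select_context` function, and the token counts are all made up for the example, and similarities are assumed to be precomputed.

```rust
// Hypothetical sketch of greedy similarity-ranked context selection.
#[derive(Debug, Clone)]
struct Chunk {
    name: String,
    tokens: usize,    // how much of the context window this chunk costs
    similarity: f64,  // assumed precomputed cosine similarity to the query
}

/// Sort by similarity, stuff chunks into the window until the token
/// budget is full, then cut everything else -- duplicates and all.
fn select_context(mut chunks: Vec<Chunk>, budget: usize) -> Vec<Chunk> {
    chunks.sort_by(|a, b| b.similarity.partial_cmp(&a.similarity).unwrap());
    let mut used = 0;
    let mut selected = Vec::new();
    for c in chunks {
        if used + c.tokens > budget {
            break; // window is full: everything below this line is cut
        }
        used += c.tokens;
        selected.push(c);
    }
    selected
}

fn main() {
    // Illustrative chunks mirroring the listing above (token counts invented).
    let chunks = vec![
        Chunk { name: "auth.py".into(),       tokens: 800, similarity: 0.94 },
        Chunk { name: "auth_test.py".into(),  tokens: 700, similarity: 0.91 },
        Chunk { name: "auth_utils.py".into(), tokens: 600, similarity: 0.89 },
        Chunk { name: "auth_v2.py".into(),    tokens: 900, similarity: 0.87 },
        Chunk { name: "payments.py".into(),   tokens: 500, similarity: 0.41 },
    ];
    for c in select_context(chunks, 2500) {
        println!("{} (similarity: {:.2})", c.name, c.similarity);
    }
    // The budget fills with near-duplicate auth chunks; payments.py never loads.
}
```

Note that nothing in the loop penalizes redundancy: three near-identical `auth` chunks each pay full token cost while adding almost no new information, which is exactly how `payments.py` gets crowded out.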


