
I was paying $200/month in wasted AI tokens. So I built a Rust context optimizer.
My Cursor bill last month: $340. I dug into the API logs. Over 60% of the tokens being sent to the LLM were:

- Boilerplate I'd copied from Stack Overflow three years ago
- The same database helper function, four slightly different times
- An entire test file that had nothing to do with what I was asking

My AI tool was optimizing for similarity -- and similarity is not the same as information.

## The problem with every AI coding tool

Cursor, Copilot, Claude Code, Cody -- they all select context the same way:

1. Embed your query
2. Find the top-K similar chunks
3. Stuff them into the context window until it's full
4. Cut everything else

The result? Query: "How does payment processing work?"

What your AI actually sees:

```
auth.py        (similarity: 0.94)  <- useful
auth_test.py   (similarity: 0.91)  <- copies auth logic
auth_utils.py  (similarity: 0.89)  <- more auth copies
auth_v2.py     (similarity: 0.87)  <- even more auth
...
payments.py    (similarity: 0.41)  <- NEVER LOADED, cut by budget
```

Your AI is answering questions about payment processing without ever seeing payments.py.
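The greedy selection loop those tools run can be sketched in a few lines of Rust. This is a minimal illustration of the top-K-then-cut behavior described above, not any tool's actual implementation; the `Chunk` struct, `select_context` function, and the token counts are all made up for the example, and similarities are assumed to be precomputed.

```rust
// Hypothetical sketch of greedy similarity-ranked context selection.
#[derive(Debug, Clone)]
struct Chunk {
    name: String,
    tokens: usize,    // how much of the context window this chunk costs
    similarity: f64,  // assumed precomputed cosine similarity to the query
}

/// Sort by similarity, stuff chunks into the window until the token
/// budget is full, then cut everything else -- duplicates and all.
fn select_context(mut chunks: Vec<Chunk>, budget: usize) -> Vec<Chunk> {
    chunks.sort_by(|a, b| b.similarity.partial_cmp(&a.similarity).unwrap());
    let mut used = 0;
    let mut selected = Vec::new();
    for c in chunks {
        if used + c.tokens > budget {
            break; // window is full: everything below this line is cut
        }
        used += c.tokens;
        selected.push(c);
    }
    selected
}

fn main() {
    // Illustrative chunks mirroring the listing above (token counts invented).
    let chunks = vec![
        Chunk { name: "auth.py".into(),       tokens: 800, similarity: 0.94 },
        Chunk { name: "auth_test.py".into(),  tokens: 700, similarity: 0.91 },
        Chunk { name: "auth_utils.py".into(), tokens: 600, similarity: 0.89 },
        Chunk { name: "auth_v2.py".into(),    tokens: 900, similarity: 0.87 },
        Chunk { name: "payments.py".into(),   tokens: 500, similarity: 0.41 },
    ];
    for c in select_context(chunks, 2500) {
        println!("{} (similarity: {:.2})", c.name, c.similarity);
    }
    // The budget fills with near-duplicate auth chunks; payments.py never loads.
}
```

Note that nothing in the loop penalizes redundancy: three near-identical `auth` chunks each pay full token cost while adding almost no new information, which is exactly how `payments.py` gets crowded out.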


