Your LLM prompts are probably wasting 90% of tokens. Here’s how I fixed mine.

via Dev.to · RoTSL

I keep running into the same problem with LLM apps. (This builds on my previous article on dev.to: https://dev.to/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm) You build a retrieval pipeline, hook it up to an API, and then quietly ship prompts full of stuff the model doesn't need: extra chunks, duplicates, half-relevant context that bloats everything. And you pay for all of it.

CFAdv is basically an attempt to stop doing that. It builds on context-fusion, but adds something that turns out to matter more than I expected: even if you pick the right context, you can still mess it up by putting it in the wrong place.

Most pipelines are still doing this

Let's be honest about the default pattern:

```python
chunks = retriever.top_k(query, k=5)
prompt = "\n\n".join(chunks)
response = llm(prompt)
```

That's it. No budget. No filtering beyond retrieval. No thought about ordering. More context is assumed to be better. It often isn't.

CFAdv splits the
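For contrast with the default pattern, here is a minimal sketch of what adding a budget and deduplication might look like. All names here (`retriever`, `count_tokens`, `llm`) are illustrative assumptions, not CFAdv's actual API:

```python
# Hypothetical sketch: deduplicate retrieved chunks and enforce a token
# budget before building the prompt. This is NOT CFAdv's implementation,
# just an illustration of the two missing steps named above.

def build_prompt(retriever, query, count_tokens, budget=1000, k=20):
    seen = set()
    kept, used = [], 0
    for chunk in retriever.top_k(query, k=k):  # assumed ordered by relevance
        if chunk in seen:                      # drop exact duplicates
            continue
        cost = count_tokens(chunk)
        if used + cost > budget:               # stop once the budget is spent
            break
        seen.add(chunk)
        kept.append(chunk)
        used += cost
    return "\n\n".join(kept)
```

Even this naive version changes the cost profile: the prompt size is bounded by `budget` no matter how many chunks retrieval returns.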

Continue reading on Dev.to
