
First Principles of AI Context
Every few weeks someone publishes a benchmark showing that the latest model is smarter, faster, more capable. Context windows are getting massive: a million tokens, two million, more on the horizon. And that's genuinely impressive. But it raises a question nobody seems to be asking: what are we filling those windows with?

Right now, the answer is mostly everything. Dump in the docs. Stuff in the chat history. Append the tool definitions. Hope the model figures out what matters.

Bigger windows don't solve the context problem. They just give you more room to be wrong. A million tokens of unfocused, unstructured context isn't better than ten thousand tokens of the right context. It's worse, because the model has to work harder to find the signal in the noise, and you're paying for every token of that noise.

I've spent the last year building agent infrastructure, and I keep landing on the same conclusion: the bottleneck isn't the model and it isn't the window size. It's the quality and structure of the context you give it.
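As a rough illustration of the alternative to dumping everything in, here is a minimal sketch (my own, not from any particular framework) of selecting context under a token budget: rank candidate chunks by relevance and keep only the highest-value ones that fit. The scoring and the 4-chars-per-token estimate are stand-in assumptions.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English prose.
    # A real system would use the model's actual tokenizer.
    return max(1, len(text) // 4)

def select_context(chunks, relevance_scores, budget_tokens):
    """Greedily keep the highest-relevance chunks that fit the budget.

    chunks: list of candidate context strings (docs, history, tool defs).
    relevance_scores: one score per chunk, higher = more relevant.
    Returns (selected_chunks, tokens_used).
    """
    ranked = sorted(zip(relevance_scores, chunks),
                    key=lambda pair: pair[0], reverse=True)
    selected, used = [], 0
    for score, chunk in ranked:
        cost = estimate_tokens(chunk)
        if used + cost <= budget_tokens:
            selected.append(chunk)
            used += cost
    return selected, used
```

The point of the sketch is the shape of the decision, not the specifics: a deliberate budget and an ordering by relevance will beat an unbounded append-everything loop, whatever scoring you plug in.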