
Two Ends of the Token Budget: Caveman and Tool Search
Every Claude Code session has a single budget: the context window. Two hundred thousand tokens, give or take, that have to hold the system prompt, the tool definitions, the conversation history, the user's input, the model's output, and (if extended thinking is on) the chain of thought. There is exactly one pile, and everything gets withdrawn from it.

The pile has two openings. Tokens flow in from the system side: tool schemas, system prompt, prior turns, files the model read. And tokens flow out from the model side: explanations, code, commit messages, plans. Both sides count against the same total. Both sides eat budget.

Two projects look at this single budget from opposite ends. The first is Caveman, a Claude Code plugin that makes the model talk like a caveman. "Why use many token when few do trick." The mechanism is a prompt that tells the model to drop articles, filler, hedging, and pleasantries while keeping technical substance intact. The README claims ~75% output token saving.
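The single-pile accounting can be sketched in a few lines. All numbers below are illustrative placeholders, not measured values from any real session:

```python
# Hypothetical sketch: every category, input-side and output-side alike,
# draws from the same context window. Numbers are made up for illustration.
CONTEXT_WINDOW = 200_000  # tokens, give or take


def remaining_budget(spent: dict[str, int]) -> int:
    """Subtract every spend category from the one shared pile."""
    return CONTEXT_WINDOW - sum(spent.values())


session = {
    # input side: flows in from the system
    "system_prompt": 3_000,
    "tool_schemas": 20_000,
    "conversation_history": 50_000,
    "files_read": 40_000,
    # output side: flows out from the model
    "model_output": 15_000,
}

print(remaining_budget(session))  # tokens left for the rest of the session
```

The point of the sketch is that there is no separate output allowance: trimming the model's replies (the Caveman approach) and trimming what gets loaded in frees the exact same resource.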



