GoodMonkey - 57% Reduction* in Claude Code Context via Extensible Proxy

via Dev.to Python · Scot Campbell

I've been running Claude Code since it launched: long sessions, heavy tool use, complex multi-file work. And I kept hitting the same wall. Around 200 turns, the model starts losing track. Responses slow down. It forgets decisions from 50 turns ago. Eventually, compaction kicks in and summarizes everything, which loses detail I actually need.

So I dug into what was actually being sent to the API. It turns out that 89-91% of every request payload is content the model has already processed and will never reference again: old grep results, file reads from 100 turns back, thinking blocks that produced a response which is already in the conversation. All of it rides along on every single API call, diluting the model's attention. I decided to do something about it.

What GoodMonkey Does

GoodMonkey is a local HTTP proxy that sits between your LLM agent and the Anthropic API. Requests and responses flow through a plugin pipeline where each plugin can inspect, transform, or block content, transparently.
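The plugin-pipeline idea above can be sketched in a few lines. This is a minimal illustration, not GoodMonkey's actual code: the `Pipeline` class, the `strip_old_tool_results` plugin, and the 10-turn window are all hypothetical, chosen only to show how plugins can transform or block a request before it reaches the API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A plugin takes a request dict and returns a (possibly transformed)
# request, or None to block the request entirely.
Plugin = Callable[[dict], Optional[dict]]

@dataclass
class Pipeline:
    plugins: list = field(default_factory=list)

    def process(self, request: dict) -> Optional[dict]:
        """Run the request through each plugin in order."""
        for plugin in self.plugins:
            request = plugin(request)
            if request is None:  # a plugin blocked the request
                return None
        return request

def _without_tool_results(msg: dict) -> dict:
    """Drop tool_result content blocks from one message."""
    content = msg.get("content")
    if not isinstance(content, list):
        return msg
    kept = [b for b in content if b.get("type") != "tool_result"]
    # Leave a stub so the message is never empty.
    return {**msg, "content": kept or [{"type": "text", "text": "[pruned]"}]}

def strip_old_tool_results(request: dict) -> dict:
    """Illustrative plugin: prune tool results older than the last N turns."""
    keep = 10  # hypothetical window size
    msgs = request["messages"]
    cutoff = len(msgs) - keep
    request["messages"] = [
        m if i >= cutoff else _without_tool_results(m)
        for i, m in enumerate(msgs)
    ]
    return request

pipeline = Pipeline([strip_old_tool_results])
```

A proxy built this way would call `pipeline.process(request)` on each incoming payload and forward the result (or return early when a plugin blocks it), which is how stale tool output could be dropped without the agent ever noticing.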
