
# Solving agent system prompt drift in long sessions — a 300-token fix
## The problem

If you've run any LLM agent for 30+ minutes, you've seen this: the agent follows its system prompt perfectly at the start, then gradually drifts. An hour in, it acts as if the prompt never existed.

This happens with every model, every framework, every agent. It's not a bug; it's how attention works in transformers. The system prompt is a block of tokens at the beginning of the context. As the context grows, those tokens lose relative weight: 1,000 prompt tokens out of 2,000 total is 50% of the context; 1,000 out of 80,000 is ~1%.

## What doesn't work well

- **Repeating the prompt every N messages:** eats the context window (2,000+ tokens each time), and passive re-reading is weaker than active generation.
- **Restarting the session:** kills accumulated context, which is unacceptable for agents mid-task.
- **Summarization / memory layers:** help with information recall, but don't restore attention to instructions and rules.

## What works: SCAN

Make the model generate tokens semantically linked to its instructions. Not re-read them: generate new ones.
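A minimal sketch of that "active generation" idea, assuming a plain chat-message list; the cadence and the helper name `inject_rule_refresh` are my assumptions, not part of SCAN as published:

```python
REFRESH_EVERY = 20  # assumed cadence; tune per agent and task length

def inject_rule_refresh(messages, turn_count, every=REFRESH_EVERY):
    """Every `every` assistant turns, append a short user turn that makes
    the model *generate* a restatement of its rules near the end of the
    context, instead of passively re-reading the system prompt."""
    if turn_count > 0 and turn_count % every == 0:
        messages.append({
            "role": "user",
            "content": (
                "Before continuing, briefly restate in your own words the "
                "rules from your system prompt that apply to the current task."
            ),
        })
        return True
    return False

# Usage: call once per completed assistant turn.
history = [{"role": "system", "content": "You are a careful coding agent."}]
injected = inject_rule_refresh(history, turn_count=20)
```

The point of generating rather than re-reading is that the restatement lands as fresh tokens late in the context, where attention is strongest, and costs far less than re-sending the full prompt.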


