Your AI agent is refetching the same context on every run. Here's the fix.


via Dev.to Webdev · Patrick

I run a network of AI agents on cron schedules. Two months in, my token bills were 3-4x what they should have been. The culprit wasn't the work the agents were doing. It was the context reload at the start of every run.

The problem

Every loop, each agent was loading:

- MEMORY.md: full long-term memory (900+ tokens)
- SOUL.md: identity and persona (600+ tokens)
- TOOLS.md: tool reference (400+ tokens)
- Today's daily log (200-500 tokens, growing throughout the day)

That's 2,100+ tokens of overhead before the agent starts its actual task. For agents running every 15 minutes, that's 8,400+ tokens/hour per agent. My cost analysis showed $198/month in Q1 just from context overhead across 6 agents. The agents weren't expensive; their startup costs were.

The fix: tiered context loading

Not all context is needed on every run. Here's the protocol I now use in every agent's initialization:

ALWAYS load (every run):
- SOUL.md (~300 tokens): identity and values
- HEARTBEAT.md (~150 tokens): current
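A minimal sketch of what tiered loading could look like, in Python. The always-load tier uses the two files and token sizes visible in the excerpt; the excerpt cuts off before naming any further tiers, so the on-demand tier, the `needs` parameter, and the function names here are assumptions for illustration, not the author's protocol.

```python
# Approximate token sizes quoted in the article for the always-load tier.
ALWAYS = {"SOUL.md": 300, "HEARTBEAT.md": 150}

# Hypothetical second tier: which files belong here, and what triggers
# loading them, is an assumption (the source list is truncated).
ON_DEMAND = {"MEMORY.md": 900, "TOOLS.md": 400}

RUNS_PER_HOUR = 60 // 15  # the article's agents run every 15 minutes


def select_context(needs: set[str]) -> list[str]:
    """Return the files to load for one run: every always-tier file,
    plus only those on-demand files the task explicitly asks for."""
    files = list(ALWAYS)
    files += [name for name in ON_DEMAND if name in needs]
    return files


def overhead_tokens(files: list[str]) -> int:
    """Sum the approximate token cost of the selected files."""
    sizes = {**ALWAYS, **ON_DEMAND}
    return sum(sizes[name] for name in files)


# Full reload from the problem section, using the lower bounds quoted there.
baseline_per_run = 900 + 600 + 400 + 200

# A run that needs nothing beyond the always tier.
lean_per_run = overhead_tokens(select_context(set()))

print(baseline_per_run * RUNS_PER_HOUR)  # 8400
print(lean_per_run * RUNS_PER_HOUR)      # 1800
```

The 8,400 figure reproduces the article's per-agent hourly overhead. The 1,800 figure only holds if the always tier contains nothing beyond the two files shown, and if a run needs no on-demand files; a run that requests `TOOLS.md`, for example, would cost 850 tokens of overhead instead of 450.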
