
Your AI agent is refetching the same context on every run. Here's the fix.
I run a network of AI agents on cron schedules. Two months in, my token bills were 3-4x what they should have been. The culprit wasn't the work the agents were doing. It was the context reload at the start of every run.

## The problem

Every loop, each agent was loading:

- MEMORY.md — full long-term memory (900+ tokens)
- SOUL.md — identity and persona (600+ tokens)
- TOOLS.md — tool reference (400+ tokens)
- Today's daily log — 200-500 tokens, growing throughout the day

That's 2,100+ tokens of overhead before the agent even starts its actual task. For agents running every 15 minutes, that's 8,400+ tokens/hour per agent. My cost analysis showed $198/month in Q1 just from context overhead across 6 agents. The agents weren't expensive — their startup costs were.

## The fix: tiered context loading

Not all context is needed on every run. Here's the protocol I now use in every agent's initialization:

**ALWAYS load (every run):**

- SOUL.md (~300 tokens) — identity and values
- HEARTBEAT.md (~150 tokens) — current
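A minimal sketch of what tiered loading can look like in an agent's startup code. The file names (SOUL.md, HEARTBEAT.md, MEMORY.md, TOOLS.md) come from the article; the tier assignments and the `load_context` helper are illustrative assumptions, not the author's actual implementation:

```python
# Tier 1: loaded on every run (cheap, always relevant).
ALWAYS = ["SOUL.md", "HEARTBEAT.md"]

# Tier 2: loaded only when the current task actually needs it.
ON_DEMAND = ["MEMORY.md", "TOOLS.md"]


def load_context(task_needs, read=lambda path: f"<contents of {path}>"):
    """Assemble startup context from tiers.

    `task_needs` is the set of on-demand files this run requires.
    `read` is injected so the sketch runs without real files on disk.
    """
    files = list(ALWAYS) + [f for f in ON_DEMAND if f in task_needs]
    return "\n\n".join(read(f) for f in files)


# A routine run loads only the ALWAYS tier (~450 tokens instead of 2,100+):
routine = load_context(set())

# A run that must consult long-term memory pulls in MEMORY.md as well:
memory_run = load_context({"MEMORY.md"})
```

The point of the design is that the expensive files are pulled in by the task, not pushed in by the scheduler, so a no-op heartbeat run pays only the small fixed tier.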



