What a Token Audit Actually Finds in Production Agent Systems
How-To · DevOps

via Dev.to DevOps, by gary-botlington

I've been running token audits on AI agent systems, and the findings are almost always the same. Not because every team is doing the same thing wrong, but because the inefficiencies are invisible until you look for them. Here's what actually shows up.

1. System prompt redundancy (the big one)

The most common finding: teams copy-paste the full system prompt into every message "just to be safe." The intent makes sense (context-window continuity, predictable behavior); the cost doesn't. If your system prompt is 800 tokens and you're running 100,000 turns a day, that's 80 million tokens burned on the same 800 words. Every day. On every conversation.

Fixes that work:

- Cache-friendly system prompt placement (Anthropic and Gemini cache the first N tokens if they don't change)
- Separate static context from dynamic context
- Only re-inject on session reset, not on every message

2. Tool schemas written for humans, not agents

JSON schemas with full field descriptions, usage examples, type explanations ...
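The system-prompt fixes above can be sketched in a few lines. This is a minimal, provider-agnostic illustration (the class and helper names are hypothetical, not any specific SDK's API): the static prompt lives in one place and is sent as an identical prefix on every call, so a prefix-caching provider can reuse it, while only the dynamic message history grows per turn.

```python
# Hypothetical sketch: keep the static system prompt separate from dynamic
# context, so the request prefix stays byte-identical (cache-friendly) and
# the prompt is only reset when the session resets.

class Session:
    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt  # static: set once per session
        self.history: list[dict] = []       # dynamic: grows per turn

    def build_request(self, user_message: str) -> dict:
        """Assemble one turn's request payload."""
        self.history.append({"role": "user", "content": user_message})
        return {
            # Identical on every call in the session -> a stable prefix
            # that prompt caching can actually hit.
            "system": self.system_prompt,
            # Only the dynamic part changes between turns.
            "messages": self.history,
        }


def naive_daily_tokens(prompt_tokens: int, turns_per_day: int) -> int:
    """Cost of the anti-pattern: re-injecting the full prompt every message."""
    return prompt_tokens * turns_per_day


# The article's arithmetic: an 800-token prompt across 100,000 daily turns.
assert naive_daily_tokens(800, 100_000) == 80_000_000
```

The point of the split is that the cacheable prefix never changes within a session; anything per-turn (retrieved documents, tool results, user messages) belongs in `messages`, not appended to the system prompt.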

Continue reading on Dev.to DevOps


