
Claude Code Is Silently Burning 10-20x Your Token Budget — Here's the Fix
If you're on Claude Max and wondering why your usage cap disappears in an hour, you're not alone. I went down this rabbit hole after burning through my entire daily quota in a single coding session. The Problem There are two independent bugs inflating Claude Code token usage right now: Bug 1: Broken prompt caching in the standalone binary The standalone Claude Code binary (installed via claude.ai/install.sh or npm install -g ) ships with Anthropic's custom Bun fork. Baked into this fork is a native-layer string replacement that modifies the JSON request body after serialization but before TLS encryption. It targets a billing attribution sentinel ( cch=24b72 ) and replaces it with a 5-character hash. The issue: it replaces the first occurrence in the serialized body. Since messages[] comes before system[] in the JSON, if your conversation history contains the sentinel string (from discussing billing, reading CC source, or having it in your CLAUDE.md), the wrong instance gets replaced. T
Continue reading on Dev.to
Opens in a new tab



