
I Was Caching Wrong This Whole Time (Anthropic Academy Part 1)
I passed all 12 Anthropic Academy certifications. Then I looked at my wrong answers. This is part 1 of a series where I break down the things I thought I knew about Claude's API — but didn't. Starting with the one that was silently costing me money.

The 1,024-Token Minimum Nobody Told Me About

I'd been adding cache_control to everything. Short system prompts. Small tool definitions. Felt like free optimization. Wrong. Prompt caching requires at least 1,024 tokens in the cached block. Anything shorter gets silently ignored. No error. No warning. Just no caching.

The quiz question: "You have a 500-token prompt with a cache breakpoint. What happens?" My answer: it gets cached. The real answer: nothing. 500 tokens is too short.

What makes this insidious: there's no feedback. Your API calls work fine. You just don't get the 90% cost reduction you thought you were getting. I was paying full price for prompts I assumed were cached.

The minimum exists because caching has overhead — storing, in
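One way to avoid this silent failure is to only attach a cache breakpoint when the block is plausibly long enough to qualify. Here's a minimal sketch: the `estimate_tokens` heuristic (roughly 4 characters per token) is my own approximation, not the real tokenizer, and `system_block` is a hypothetical helper, not part of the Anthropic SDK — the point is just to stop paying for breakpoints that can never activate.

```python
# Guard against silently-ignored cache breakpoints.
# Assumption: ~4 characters per token is a crude estimate, not Claude's
# actual tokenizer; use the API's token-counting endpoint for precision.

MIN_CACHEABLE_TOKENS = 1024  # minimum block size for prompt caching


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token."""
    return len(text) // 4


def system_block(text: str) -> dict:
    """Build a system content block, adding cache_control only
    when the block is plausibly long enough to be cached."""
    block = {"type": "text", "text": text}
    if estimate_tokens(text) >= MIN_CACHEABLE_TOKENS:
        block["cache_control"] = {"type": "ephemeral"}
    return block


short = system_block("You are a helpful assistant.")
long = system_block("x" * 8000)  # ~2000 estimated tokens

print("cache_control" in short)  # → False: breakpoint omitted, too short
print("cache_control" in long)   # → True: breakpoint added
```

With a guard like this, a 500-token prompt simply never gets a breakpoint, so you're not relying on caching savings that were never going to happen.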




