How I built budget enforcement that actually works for AI APIs

I've been using Claude Code daily for 1 year across 30+ projects. When I checked what all those sessions would cost at API rates, the number was over $10,000. Claude Max subscribers have zero visibility into this. No dashboard, no breakdown, no way to know which project or session is burning the most tokens.

So I built two things. An MCP server that shows Claude Code users their costs in real time, no API key needed, reads local session data directly. And an open-source API gateway called LLMKit with actual budget enforcement for teams routing traffic through AI providers.

The budget layer took longer than everything else combined. Database locks, Redis counters, optimistic concurrency: nothing held up under concurrent agent traffic. The gap between "check balance" and "record cost" is where money disappears. Cloudflare Durable Objects turned out to be the answer.

Why every other approach leaks money

Standard flow in most AI proxies: Request comes in -> Read balance from DB (sees $12 u
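The gap between "check balance" and "record cost" is easier to see in a small simulation. What follows is a minimal sketch, not LLMKit's code: the class names, the function names, and the figures (a 1000-cent budget and two 800-cent requests) are all made up for illustration. It interleaves two requests by hand so that both read the balance before either records its cost, then compares that with a budget that checks and deducts in one step.

```typescript
// Illustrative only: every name and number here is hypothetical,
// not taken from LLMKit or any real proxy.

// Check-then-record budget: the read and the write are separate steps.
class NaiveBudget {
  private balanceCents: number;

  constructor(startCents: number) {
    this.balanceCents = startCents;
  }

  // Step 1: the "check balance" read.
  readBalance(): number {
    return this.balanceCents;
  }

  // Step 2: the "record cost" write, done later with no re-check.
  recordCost(costCents: number): void {
    this.balanceCents -= costCents;
  }
}

// Budget where check and deduct happen as one indivisible step.
class AtomicBudget {
  private balanceCents: number;

  constructor(startCents: number) {
    this.balanceCents = startCents;
  }

  // Returns true and deducts only if the cost fits the remaining balance.
  tryReserve(costCents: number): boolean {
    if (costCents > this.balanceCents) {
      return false;
    }
    this.balanceCents -= costCents;
    return true;
  }

  readBalance(): number {
    return this.balanceCents;
  }
}

// Two requests that both read before either one writes.
function simulateNaive(startCents: number, costCents: number): number {
  const budget = new NaiveBudget(startCents);
  const seenByA = budget.readBalance();
  const seenByB = budget.readBalance(); // stale as soon as A records
  if (seenByA >= costCents) budget.recordCost(costCents);
  if (seenByB >= costCents) budget.recordCost(costCents);
  return budget.readBalance();
}

// The same two requests against the single-step budget.
function simulateAtomic(
  startCents: number,
  costCents: number
): { approved: number; balanceCents: number } {
  const budget = new AtomicBudget(startCents);
  let approved = 0;
  for (let i = 0; i < 2; i++) {
    if (budget.tryReserve(costCents)) approved++;
  }
  return { approved, balanceCents: budget.readBalance() };
}
```

With those made-up figures, `simulateNaive(1000, 800)` approves both requests and ends 600 cents over budget, while `simulateAtomic(1000, 800)` approves one and leaves 200 cents. The single-step version only holds if nothing else can touch the balance between the check and the deduct, which is the property a single-writer primitive is meant to provide.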
