
Your Bedrock Bill Is a Ticking Clock — Here's How to Stop It
You deploy a Lambda that calls Bedrock. It works beautifully in testing. Then someone runs a batch job, a retry loop goes wrong, or traffic spikes, and your AWS bill at the end of the month looks like a phone number.

Bedrock has no built-in spend cap. No circuit breaker. No "stop after $X." It will happily invoke your model ten thousand times before you notice anything is wrong. This post is about the patterns that prevent that, applied specifically to serverless AI workloads on AWS.

## Why Bedrock Cost Blowups Happen

Bedrock charges per input token and per output token, and pricing varies by model:

| Model | Input (per 1K tokens) | Output (per 1K tokens) |
| --- | --- | --- |
| Claude Haiku | ~$0.00025 | ~$0.00125 |
| Claude Sonnet | ~$0.003 | ~$0.015 |
| Claude Opus | ~$0.015 | ~$0.075 |

Haiku looks cheap, and it is, until you're running it at scale with large prompts. A 2,000-token prompt plus a 500-token response at Haiku pricing is about $0.0011 per call. At 100,000 calls per day that's roughly $112/day, or about $3,375/month. From a single Lambda function.
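To see how quickly per-token pricing compounds, here is a minimal cost estimator using the approximate per-1K-token prices from the table above. The model keys and helper names are my own for illustration; the prices are the rounded figures quoted in this post, not values fetched from the AWS pricing API.

```python
# Approximate per-1K-token prices from the table above (illustrative).
PRICES_PER_1K = {
    "haiku":  {"input": 0.00025, "output": 0.00125},
    "sonnet": {"input": 0.003,   "output": 0.015},
    "opus":   {"input": 0.015,   "output": 0.075},
}

def invocation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single Bedrock invocation at the prices above."""
    p = PRICES_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 calls_per_day: int, days: int = 30) -> float:
    """Projected monthly spend at a steady call rate."""
    return invocation_cost(model, input_tokens, output_tokens) * calls_per_day * days

# The scenario from the post: 2,000-token prompt, 500-token response on Haiku.
per_call = invocation_cost("haiku", 2000, 500)          # ≈ $0.0011
per_month = monthly_cost("haiku", 2000, 500, 100_000)   # ≈ $3,375
```

Running the same scenario through `monthly_cost("sonnet", ...)` is a sobering exercise: at Sonnet prices the identical workload lands north of $40,000/month, which is why model choice is the first cost lever to pull.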


