
The Case for Leaky Locks: Redis TTL as Failure Cooldown for Expensive AI Jobs
I'm probably not the only one who's been told "always release your locks in a finally block." It's one of those conventions we follow without much thought, and for most situations it's completely right. But I recently ran into a case where doing the opposite was actually the better call.

The Problem I Didn't See Coming: How Releasing Locks Cost Me Money

My job queue was simple: user submits a document → AI evaluates it → result gets stored. The issue was that AI calls can fail. Rarely, but they do. Out of nowhere, the model might ignore the expected output format, or a rate limit might kick in. So I'd catch the exception, log it, mark the job as failed, and very responsibly release the lock in the finally block.

Then the user would hit retry. And the AI would fail again. And they'd hit retry again. Each retry triggered another LLM call, and each one cost real money. What I had was essentially a retry storm hitting my API at exactly the moment my system was already struggling.

Using Lock Expi
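To make the "leaky lock" idea concrete, here is a minimal sketch of the pattern as I understand it: acquire a per-job lock with a TTL, release it on success, but deliberately leave it in place on failure so the TTL acts as a retry cooldown. This is an assumption-laden illustration, not the article's actual code: `FakeRedis` is an in-memory stand-in for a real Redis client's `SET key value NX EX ttl` / `DEL`, and `run_job`, `evaluate`, and `FAILURE_COOLDOWN_SECONDS` are hypothetical names.

```python
import time


class FakeRedis:
    """In-memory stand-in for a Redis client (supports SET with NX/EX, and DELETE)."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        entry = self._store.get(key)
        if nx and entry is not None and entry[1] > now:
            return None  # key still alive; NX refuses to overwrite
        expires_at = now + ex if ex is not None else float("inf")
        self._store[key] = (value, expires_at)
        return True

    def delete(self, key):
        self._store.pop(key, None)


FAILURE_COOLDOWN_SECONDS = 60  # hypothetical cooldown window


def run_job(client, job_id, evaluate):
    """Run one expensive job; on failure, leak the lock so its TTL throttles retries."""
    lock_key = f"lock:job:{job_id}"
    if not client.set(lock_key, "1", nx=True, ex=FAILURE_COOLDOWN_SECONDS):
        return "cooling_down"  # a recent attempt failed; reject the retry cheaply
    try:
        result = evaluate(job_id)
    except Exception:
        # Deliberately NOT deleting the lock: the TTL becomes the failure cooldown.
        return "failed"
    client.delete(lock_key)  # success: release immediately so later jobs aren't blocked
    return result
```

The asymmetry is the whole point: a successful run releases the lock right away, but a failed run leaves it to expire on its own, so repeated retry clicks inside the cooldown window are rejected before they can trigger another paid LLM call.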


