Agents Don't Fail at AI — They Fail at DevOps

When people imagine AI agents failing, they picture the wrong things. They imagine hallucinations, confused responses, bad reasoning. Those happen, but they are not where production systems actually break. Production agents fail at DevOps. Orphaned processes nobody is watching. Context windows that hit the ceiling mid-task. Auth tokens that expire silently and cause agents to fail for hours before anyone notices. Logs that fill disks. Services that restart into broken state and stay there. I have been running 23 agents in production across five businesses for six months. The model has almost never been the problem. The ops layer has been the problem, repeatedly, in ways that were entirely preventable. Here is what broke and how I fixed it. Failure 1: Orphaned Processes The first major incident was an agent that got stuck in a loop. The session appeared active, was consuming API credits, but was not producing useful output. Nobody noticed for almost four hours because there was no alert

Agents Don't Fail at AI — They Fail at DevOps

Related Articles

WebPKI and You

10 Ways Grace Hopper Pioneered the Invention of COBOL

Do we still use StackOverflow?

Palmer Luckey’s retro gaming startup ModRetro reportedly seeks funding at $1B valuation

Cakelisp

Related Articles

News
WebPKI and You
Lobsters • 17h ago

News
10 Ways Grace Hopper Pioneered the Invention of COBOL
Medium Programming • 17h ago

News
Do we still use StackOverflow?
Dev.to • 18h ago

News
Palmer Luckey’s retro gaming startup ModRetro reportedly seeks funding at $1B valuation
TechCrunch • 19h ago

News
Cakelisp
Lobsters • 20h ago