
The 15-minute problem: how to decide whether to roll back after deploy
Every engineer knows this feeling. You just shipped to production. CI passed. The deploy finished clean. And now you're doing the thing nobody talks about: staring at dashboards for the next 15 minutes, waiting to see if anything breaks.

Error rate graph. Refresh. Looks okay? Maybe. Slack is quiet. Should I go back to work? What if something blows up the moment I look away?

This is the 15-minute problem. And almost every team I've talked to handles it the same way: manually, anxiously, inconsistently.

Why the first 15 minutes are different

Post-deploy is not like normal production monitoring. The questions you're asking are different:

Is what I'm seeing caused by this deploy, or was it already there?
Is this error rate actually elevated, or is it noise?
Should I roll back now, or give it more time?

Tools like Datadog or Grafana are great at showing you what's happening. But they don't answer the deploy-specific question: is this deploy okay or not? You still have to decide. And wit
Continue reading on Dev.to DevOps




