How Kubernetes Drift Detection Saved Us From Infrastructure Chaos

By Jay French, via Dev.to DevOps

Three months into a production migration, we discovered that 14 of our 47 deployments had quietly drifted from their declared state. Not in a dramatic, pager-firing way. In the slow, invisible way that turns a Tuesday afternoon into a Friday incident. That's the thing about configuration drift: it doesn't announce itself. It accumulates. Here's what happened, what we built to fix it, and why I think most teams are one bad deploy away from the same problem.

The Setup

We were running a mid-sized Kubernetes cluster across three environments: dev, staging, and production. Standard GitOps workflow. ArgoCD handling deployments. Helm charts checked into Git. Everything was "declarative." Everything was "source-of-truth." Except it wasn't.

Engineers were patching things manually under pressure. kubectl edit became a habit. Resource limits got tweaked directly on pods. ConfigMaps were updated in-cluster without touching the repo. Nobody flagged it because nothing broke. The cluster kept humming.
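To make the drift problem concrete: the core of any drift check is comparing the manifest as declared in Git against the live object in the cluster, the same comparison `kubectl diff` or `argocd app diff` performs. Below is a minimal, hypothetical sketch of that comparison logic in Python. It assumes you have already fetched both sides as dicts (e.g. parsed from `kubectl get deploy foo -o json` and from the rendered Helm chart); the function name and sample manifests are illustrative, not from the article.

```python
def find_drift(desired, live, path=""):
    """Recursively diff two manifest dicts.

    Returns a list of (field_path, desired_value, live_value) tuples for
    every field where the live cluster state disagrees with Git.
    Nested dicts are walked field by field; everything else (scalars,
    lists) is compared wholesale for simplicity.
    """
    drift = []
    for key, want in desired.items():
        have = live.get(key)
        here = f"{path}.{key}" if path else key
        if isinstance(want, dict) and isinstance(have, dict):
            drift.extend(find_drift(want, have, here))
        elif want != have:
            drift.append((here, want, have))
    return drift


# Illustrative manifests: Git says 3 replicas and a 256Mi limit, but
# someone ran `kubectl edit` and the live object now disagrees.
desired = {"spec": {"replicas": 3,
                    "resources": {"limits": {"memory": "256Mi"}}}}
live = {"spec": {"replicas": 5,
                 "resources": {"limits": {"memory": "512Mi"}}}}

for field, want, have in find_drift(desired, live):
    print(f"DRIFT {field}: git={want!r} cluster={have!r}")
```

A real implementation would also need to ignore server-populated fields (`status`, `metadata.resourceVersion`, defaulted values), which is exactly the hard part that tools like ArgoCD handle for you.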

Continue reading on Dev.to DevOps
