
Debugging an Invisible Scaling Limit on EKS
Introduction

If you've ever tried scaling a deployment past 1000 pods on EKS and watched everything just... stop, with no errors, no warnings, and pods that look healthy but never actually receive traffic — this one's for you.

I ran into this exact situation on a client's EKS cluster. The HPA was configured to scale well beyond 1000 replicas, and it did spin up the pods. They started fine and the containers were healthy, but something was off: the readiness probes weren't even being evaluated. The pods were stuck in a limbo where they existed, but as far as the load balancer was concerned, they didn't.

The cluster was running Kubernetes 1.33 with the AWS Load Balancer Controller v2.12, using IP-mode target registration behind an Application Load Balancer. That last detail turned out to matter a lot.

Debugging

The fact that exactly 1000 pods spun up and registered normally — and none beyond that — made me suspect some kind of quota or limit being hit at the EKS or AWS level.
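To make the setup concrete, here is a minimal sketch of the kind of configuration involved. All resource names, the `maxReplicas` value, and most annotation values are illustrative assumptions; only the `target-type: ip` registration mode and an HPA scaling past 1000 replicas come from the story above.

```yaml
# Hypothetical HPA configured to scale well past the 1000-pod mark.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical deployment
  minReplicas: 10
  maxReplicas: 1500        # beyond the point where new pods stopped receiving traffic
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
# Ingress handled by the AWS Load Balancer Controller in IP mode:
# each pod IP is registered directly as an ALB target.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web                # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```

In IP mode, every pod becomes its own target on the ALB's target group, which is why per-target-group and per-load-balancer limits matter far more than they would with instance-mode registration.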
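The quota hypothesis can be checked directly from the CLI. A hedged sketch, assuming AWS CLI v2, credentials with read access to Service Quotas and ELB, and a cluster with the AWS Load Balancer Controller installed; the target group ARN placeholder is deliberately left for you to fill in:

```shell
# List the ELB service quotas whose name mentions targets per ALB.
# Quota names are stable, but verify the exact wording in your region.
aws service-quotas list-service-quotas \
  --service-code elasticloadbalancing \
  --query "Quotas[?contains(QuotaName, 'Targets per Application Load Balancer')].[QuotaName,Value]" \
  --output table

# If targets stopped registering, the TargetGroupBindings managed by the
# AWS Load Balancer Controller are the first place to look on the cluster side.
kubectl get targetgroupbindings.elbv2.k8s.aws -A

# Then compare against the number of targets actually registered on the AWS side.
aws elbv2 describe-target-health \
  --target-group-arn <target-group-arn> \
  --query 'length(TargetHealthDescriptions)'
```

If the registered-target count sits exactly at the quota value while the cluster is running more ready pods than that, the "invisible" ceiling stops being invisible.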




