Your AI SRE Doesn't Need One Model — It Needs the Right Model for Each Job

We built our first AI SRE integration with a single model. Opus for everything: incident triage, Kubernetes debugging, IAM policy review, cost anomaly detection. Figured we'd use the best available and not overthink it.

Three months in, the cost was real. And honestly, most of the tasks didn't need Opus-grade reasoning. Checking if a pod is in CrashLoopBackOff doesn't require the same cognitive load as parsing a complex cross-account IAM policy trust relationship.

Rootly published benchmark results this week that put actual numbers on a hunch most of us have been carrying. If you're building AI SRE tooling, or about to, the findings are worth sitting with.

What the Benchmarks Found

Rootly ran Claude Sonnet 4.6 and Opus across four infrastructure task types: Kubernetes, IAM/S3 policy, compute, and general infra work.

The finding: Sonnet 4.6 performs comparably to Opus on Kubernetes and compute tasks. The gap o
Continue reading on Dev.to