
Anthropic just dropped Claude Sonnet 5, and the benchmarks are kind of insane
Okay, so Anthropic quietly pushed a blog post live this morning, and I think it's flying under the radar a bit: Claude Sonnet 5 is officially out as of today. The model string is claude-sonnet-5-20260401. It's already live in claude.ai as the new default, and on the API at the same $3/$15 per million tokens (input/output) pricing as Sonnet 4.6. No price hike. That part alone is worth stopping to think about.

What actually changed

The headline number is 92.4% on SWE-bench Verified. For context: Claude Opus 4.6, their previous flagship, sat at 80.8%. GPT-5.4 scores 57.7% on the same eval. Gemini 3.1 Pro is at 80.6%. Sonnet 5 just... leapfrogged all of them, including Anthropic's own Opus tier, at Sonnet pricing. That's a nearly 12-point jump (11.6, to be exact) over Opus 4.6 in a single generation, from the mid-tier model.

Computer use is the other big story: 88.3% on OSWorld-Verified. The human expert baseline on that benchmark is 72.4%, meaning Sonnet 5 isn't just competitive with humans on desktop automation; it's meaningfully ahe
Continue reading on Dev.to



