Why Agent Testing is Broken

And what to do about it.

Software testing has been solved for decades. You write a function, you assert its output, your CI turns green, you ship. The contract is clear: same input, same output, always.

LLM agents broke this contract completely, and most teams haven’t noticed yet.

The Problem Nobody’s Talking About

Ask your agent “summarize this contract” today and get a good response. Ask it again tomorrow after a model update, a prompt tweak, or a context window change, and get something subtly different. Not wrong, exactly. Just… different. Different enough that the downstream system parsing it breaks silently at 2am.

This is not a hypothetical. It’s happening in production right now at companies that thought they were shipping stable systems.

The failure mode is insidious because:

It doesn’t throw exceptions. The agent responds. It always responds. The response is even plausible. The failure is semantic, not syntactic.

It’s not reproducible on demand. Y
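To make the silent-failure mode concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two response strings, the `extract_notice_days` parser, and the `check_summary` helper are illustrative assumptions, not taken from any real agent or library. It shows a downstream parser that raises no exception when the agent's phrasing drifts, and one possible way to turn that drift into an explicit, testable signal.

```python
import re
from typing import List, Optional

# Hypothetical agent outputs for the same request on two different days.
# Both are plausible summaries; only the phrasing differs.
RESPONSE_DAY_ONE = (
    "Summary: The contract runs 24 months. Termination notice: 30 days."
)
RESPONSE_DAY_TWO = (
    "Here is a summary. The contract lasts 24 months, and either party "
    "may terminate with 30 days of notice."
)


def extract_notice_days(summary: str) -> Optional[int]:
    """Brittle downstream parser (assumed for illustration).

    It expects the exact phrase 'Termination notice: N days' and returns
    None, without raising, when the phrase is absent.
    """
    match = re.search(r"Termination notice: (\d+) days", summary)
    return int(match.group(1)) if match else None


def check_summary(summary: str) -> List[str]:
    """Return a list of problems so drift is reported instead of ignored."""
    problems = []
    if extract_notice_days(summary) is None:
        problems.append("missing 'Termination notice: N days' field")
    return problems


if __name__ == "__main__":
    for label, response in [("day one", RESPONSE_DAY_ONE),
                            ("day two", RESPONSE_DAY_TWO)]:
        print(label, extract_notice_days(response), check_summary(response))
```

Neither call raises. The second response is still a reasonable summary, yet the parser quietly yields `None`, which is the semantic-not-syntactic failure described above. A check like `check_summary` is only one possible way to surface it.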
