80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

When You're Reading CoT, the Model Is Thinking Something Else Thinking models are everywhere now. DeepSeek-R1, Claude 3.7 Sonnet, Qwen3.5 — models that show you their reasoning process keep multiplying. When I run Qwen3.5-9B on an RTX 4060, the thinking block spills out lines of internal reasoning. "Wait, let me reconsider..." "Actually, this approach is better..." — it self-debates its way to an answer. It feels reassuring. You think: okay, it's actually thinking this through. That reassurance has no foundation. When you read a CoT trace and feel reassured, what you're looking at is not a record of reasoning — it's text generated to look like reasoning . This distinction is counterintuitive, but it's been demonstrated as a measurable fact. In May 2025, Anthropic published Reasoning Models Don't Always Say What They Think . Reasoning models don't always say what they actually think. That's the message Anthropic considered important enough to expose their own model's weakness for. The E

80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

Related Articles

We Autoscaled to 100 Pods — Then Ran Out of IP Addresses

The Silent Shift in Software Engineering Nobody Is Talking About

I Built a Clamp() Generator — No More Media Queries for Typography

What Category Theory Teaches Us About DataFrames

卡了很久的 DDD Aggregate，被遊戲的概念解開了

Related Articles

News
We Autoscaled to 100 Pods — Then Ran Out of IP Addresses
Medium Programming • 1h ago

News
The Silent Shift in Software Engineering Nobody Is Talking About
Medium Programming • 2h ago

News
I Built a Clamp() Generator — No More Media Queries for Typography
Medium Programming • 2h ago

News
What Category Theory Teaches Us About DataFrames
Lobsters • 3h ago

News
卡了很久的 DDD Aggregate，被遊戲的概念解開了
Medium Programming • 3h ago