Back to articles
80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

via Dev.toplasmon

When You're Reading CoT, the Model Is Thinking Something Else Thinking models are everywhere now. DeepSeek-R1, Claude 3.7 Sonnet, Qwen3.5 — models that show you their reasoning process keep multiplying. When I run Qwen3.5-9B on an RTX 4060, the thinking block spills out lines of internal reasoning. "Wait, let me reconsider..." "Actually, this approach is better..." — it self-debates its way to an answer. It feels reassuring. You think: okay, it's actually thinking this through. That reassurance has no foundation. When you read a CoT trace and feel reassured, what you're looking at is not a record of reasoning — it's text generated to look like reasoning . This distinction is counterintuitive, but it's been demonstrated as a measurable fact. In May 2025, Anthropic published Reasoning Models Don't Always Say What They Think . Reasoning models don't always say what they actually think. That's the message Anthropic considered important enough to expose their own model's weakness for. The E

Continue reading on Dev.to

Opens in a new tab

Read Full Article
6 views

Related Articles