
ChatGPT Thought 'Suspicious' but Wrote 'Unlikely'
Digest: This is a short, standalone article about AI evasion patterns. For the full three-way dialogue (900+ lines), see Part 1 | Part 2. Also available in Japanese.

I was reading ChatGPT's reasoning trace when I saw this:

Deepening suspicions
Suspicions are growing regarding the ambiguous records surrounding the 7/23 incident.

Right before that line, the trace showed a label:

Checking compliance with OpenAI's policies

And the actual output? "Unlikely."

Internally, the model was moving toward "suspicious." After a policy compliance check, the output landed on "low probability." This is AI self-censorship made visible.

What I did

I asked ChatGPT (5.4 Pro) to write an analytical report on a politically sensitive topic: Jeffrey Epstein's alleged ties to Israeli intelligence. Then I had Claude (Opus 4.6) peer-review the report. I mediated between them, feeding Claude's critiques back to ChatGPT.

No new evidence was introduced at any point. The same public records, the same court documen
Continue reading on Dev.to

