
Claude CLI vs API for Code Review: Same Model, Wildly Different Results
I stopped writing code by hand a while ago. Claude writes it, I review it, it ships. It works, so why go back? But here's the thing -- if AI writes all the code, who reviews it? Another AI, obviously. So I built brunt, an adversarial code review tool that throws LLMs at your diffs to find bugs and security issues.

The problem is: which AI do you point it at? I have a Claude subscription (CLI access), and I have an API key. Same company, same models. That should give the same results, right? I also gave Ollama a try; it didn't make the cut.

I tested this against a real refactor on my Rust/Axum backend -- replacing four old subsystems with a new AI scenarios feature. 20 commits, 77 files, +1,566 / -5,900 lines. I ran brunt three ways:

Claude CLI -- uses your Claude subscription via claude -p
Anthropic API (Sonnet) -- claude-sonnet-4-6 via HTTP
Anthropic API (Opus) -- claude-opus-4-6 via HTTP

Same diff. Same tool. Same prompts. Wildly different results.

The results

Seven findings vs eighty-four
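For reference, the two API runs boil down to the same Messages API request with only the model string swapped (the CLI run instead pipes the diff into claude -p). A minimal sketch of that request, not brunt's actual internals -- the review prompt and placeholder diff here are hypothetical, the model names are as quoted above:

```python
# Sketch: same prompt, same endpoint, only the model string changes
# between the Sonnet and Opus runs. Not brunt's real code.
import json

API_URL = "https://api.anthropic.com/v1/messages"  # Anthropic Messages API

def build_review_request(model: str, diff: str) -> dict:
    """Build one Messages API payload for a single review pass."""
    return {
        "model": model,
        "max_tokens": 4096,
        "messages": [
            {
                "role": "user",
                # Hypothetical review prompt; brunt's real prompts differ.
                "content": "Review this diff for bugs and security issues:\n\n" + diff,
            }
        ],
    }

diff = "--- a/src/main.rs\n+++ b/src/main.rs\n..."  # placeholder diff
sonnet_req = build_review_request("claude-sonnet-4-6", diff)
opus_req = build_review_request("claude-opus-4-6", diff)

# Everything except the model field is identical between the two runs.
assert sonnet_req["messages"] == opus_req["messages"]
print(json.dumps(sonnet_req["model"]), json.dumps(opus_req["model"]))
```

POSTing either payload to API_URL (with your x-api-key and anthropic-version headers) is all the "API" runs amount to; the divergence in findings comes from what happens around that call, not the call itself.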



