
38 Issues: Showdown between BugBot, Copilot and Claude
AI code review tools promise to catch what human reviewers miss. But which one actually delivers? I planted 38 deliberate bugs, security vulnerabilities, and code smells into a .NET 10 codebase — then let three AI reviewers loose on the same PR. Here's what happened.

Why This Comparison?

Every major platform now offers AI-powered code review: GitHub has Copilot, Cursor has BugBot, and Anthropic has Claude. They all claim to catch security issues, bugs, and code quality problems. But marketing aside, I wanted answers to three practical questions:

1. How many issues does each tool actually catch? Not in a curated demo — in a realistic PR with a mix of critical vulnerabilities and subtle code smells.
2. How do they behave across multiple review cycles? A first pass is one thing. What happens when you fix the findings and re-request a review?
3. What's the developer experience like? Detection rate is a number. But does the tool actually help you ship with confidence?

To find out, I designed a contr
Continue reading on Dev.to


