
38 Issues: Showdown between BugBot, Copilot and Claude
AI code review tools promise to catch what human reviewers miss. But which one actually delivers? I planted 38 deliberate bugs, security vulnerabilities, and code smells into a .NET 10 codebase — then let three AI reviewers loose on the same PR. Here's what happened.

Why This Comparison?

Every major platform now offers AI-powered code review: GitHub has Copilot, Cursor has BugBot, and Anthropic has Claude. They all claim to catch security issues, bugs, and code quality problems. But marketing aside, I wanted answers to three practical questions:

1. How many issues does each tool actually catch? Not in a curated demo — in a realistic PR with a mix of critical vulnerabilities and subtle code smells.
2. How do they behave across multiple review cycles? A first pass is one thing. What happens when you fix the findings and re-request a review?
3. What's the developer experience like? Detection rate is a number. But does the tool actually help you ship with confidence?

To find out, I designed a contr
Continue reading on Dev.to


