
We Thought Our AI Reviews Were 98.6% Valid. Independent Validation Said 69%.
The most dangerous thing about AI-augmented work isn't the errors. It's thinking you're not making them.

I ran 449 AI-assisted code reviews on OCA (Odoo Community Association) open source PRs in 9 days. When I had the AI assess its own review quality, it said 98.6% valid. When I ran independent validation, the number dropped to 68.9%. The validation used 40 separate AI instances, each reading the actual code diffs and verifying every technical claim.

That 30-point gap should concern anyone using AI for serious work.

The experiment

Between February 24 and March 4, 2026, I reviewed 449 unique pull requests across 6 OCA repositories using AI-assisted workflows. Each PR got a full technical review: architecture assessment, bug identification, security analysis, test coverage evaluation. The output was structured code review comments posted directly to GitHub.

For scale: OCA's most prolific human reviewer has done 2,197 unique PR reviews over 9.5 years. My campaign produced 449 in 9 days.
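The self-assessment gap above is just aggregation over per-claim verdicts, but it's worth seeing the arithmetic spelled out. This is a minimal sketch: the function name and the verdict counts (scaled to 1,000 claims) are illustrative assumptions, not the author's actual validation pipeline.

```python
def validity_rate(verdicts):
    """Percentage of claims judged valid by a set of boolean verdicts."""
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    return 100.0 * sum(verdicts) / len(verdicts)

# Hypothetical verdict sets reproducing the article's headline figures:
self_assessed = validity_rate([True] * 986 + [False] * 14)    # 98.6
independent = validity_rate([True] * 689 + [False] * 311)     # 68.9
gap = self_assessed - independent                              # ~29.7 points
```

The point of the independent pass is that each validator scores claims against the actual diff, so the denominator is verified claims rather than the reviewer's own confidence.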




