Agentic AI Code Review: From Confidently Wrong to Evidence-Based

via Dev.to, by Alexandre Amado de Castro

Archbot flagged a "blocker" on a PR. It cited the diff, built a plausible chain of reasoning, and suggested a fix. It was completely wrong. Not "LLMs are sometimes wrong" wrong: convincing enough that a senior engineer spent 20 minutes disproving it. The missing detail wasn't subtle. It was a guard clause sitting in a helper two files away. Archbot just didn't have that file.

That failure mode wasn't a prompt problem. It was a context problem. So I stopped trying to predict what context the model would need up-front, and switched to an agentic loop: give the model tools to fetch evidence as it goes, and require it to end with a structured "submit review" action. This post is the architectural why and how (and the reliability plumbing that made it work).

There are good hosted AI review tools now. This post is about the pattern underneath them.

> [!NOTE]
> Names, repos, and examples are intentionally generalized. This is about the design patterns, not a particular company.
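The loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration of the pattern, not the author's implementation: the model is stubbed out, and all names (`run_review`, `read_file`, `submit_review`, the fake repo contents) are assumptions for the sake of a runnable example.

```python
import json

# Hypothetical in-memory "repo" the agent can fetch evidence from.
FAKE_REPO = {
    "helpers.py": "def guard(x):\n    if x is None:\n        return None\n",
}

# Tools the agent may call mid-loop to gather evidence on demand.
TOOLS = {
    "read_file": lambda path: FAKE_REPO.get(path, "<file not found>"),
}

def stub_model(transcript):
    """Stand-in for an LLM call: first fetches evidence, then submits.

    A real implementation would send the transcript to a model that
    supports tool use and parse its next action from the response.
    """
    if not any(m["role"] == "tool" for m in transcript):
        return {"action": "read_file", "args": {"path": "helpers.py"}}
    return {
        "action": "submit_review",
        "args": {
            "verdict": "no_blocker",
            "evidence": ["helpers.py: guard clause already handles None"],
        },
    }

def run_review(diff, max_steps=10):
    """Drive the agentic loop until the terminal submit_review action."""
    transcript = [{"role": "user", "content": diff}]
    for _ in range(max_steps):
        step = stub_model(transcript)
        if step["action"] == "submit_review":
            # Enforce the structured-output contract before accepting.
            if not {"verdict", "evidence"} <= step["args"].keys():
                raise ValueError("malformed review submission")
            return step["args"]
        # Otherwise, execute the requested tool and feed the result back.
        result = TOOLS[step["action"]](**step["args"])
        transcript.append({"role": "tool", "content": result})
    raise RuntimeError("agent never submitted a review")

review = run_review("diff --git a/app.py b/app.py ...")
print(json.dumps(review, indent=2))
```

The key design choices this sketch encodes: evidence is pulled lazily by the model rather than predicted up-front, and the loop only terminates on a structured `submit_review` action that can be validated before the review is posted.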

Continue reading on Dev.to


