
How I built a tool that uses git history to find the files most likely to break your codebase
Every codebase has files that look fine but will take down prod if you breathe on them wrong. The problem is there's no obvious marker on them. No comment that says "this couples to six other things." No warning in the code review checklist. You only find out after the incident.

I wanted to build something that surfaces these files automatically. The result is fearmap, which mines git history to classify every file as LOAD-BEARING, RISKY, DEAD, or SAFE. This post is about the methodology and some things I learned building it.

Why git history instead of static analysis

The obvious approach is static analysis. Parse the imports, build a dependency graph, flag the highly connected nodes. It works, but it misses a lot.

Static analysis tells you about declared dependencies. It doesn't tell you about the hidden ones. Two files that always get edited together in the same commit are coupled, even if nothing in the code makes that explicit. That coupling lives in developer behaviour, not in the
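
To make the co-change idea concrete, here is a minimal sketch of counting how often two files land in the same commit. It is not fearmap's actual implementation: the function names (`read_commits`, `parse_log`, `co_change_counts`), the `--commit--` separator, and the `max_files` cutoff are all assumptions made for illustration. It relies only on `git log --name-only` with a custom `--pretty=format:` string.

```python
import subprocess
from collections import Counter
from itertools import combinations

# Arbitrary separator printed once per commit (an assumption of this sketch).
COMMIT_MARKER = "--commit--"


def parse_log(log_text):
    """Split `git log --name-only` output into one set of file paths per commit."""
    commits = []
    for chunk in log_text.split(COMMIT_MARKER):
        files = {line.strip() for line in chunk.splitlines() if line.strip()}
        if files:  # commits that list no files (e.g. merges) are skipped
            commits.append(files)
    return commits


def read_commits(repo_path="."):
    """Run git in repo_path and return the changed files of each commit."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--name-only",
         f"--pretty=format:{COMMIT_MARKER}"],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)


def co_change_counts(commits, max_files=50):
    """Count how many commits touched each pair of files together.

    Commits touching more than max_files files are ignored, on the assumption
    that bulk changes (renames, formatting) say little about real coupling.
    """
    pairs = Counter()
    for files in commits:
        if len(files) > max_files:
            continue
        for a, b in combinations(sorted(files), 2):
            pairs[(a, b)] += 1
    return pairs
```

Calling `co_change_counts(read_commits("."))` inside a repository and then `.most_common(10)` on the result would list the ten file pairs that change together most often. Raw counts favour frequently edited files, so a real tool would probably normalise by how often each file changes on its own.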




