
Debugging PostgreSQL Query Plan Instability in Production
The query plan is only as good as the statistics behind it. When those statistics are wrong, the planner makes confident decisions based on a false reality.

We run a field technician dispatch system on PostgreSQL 14. The core query, which finds technicians matching specific dispatch criteria near a job site, ran in 3ms at 10am and 42ms at 2pm the same day. Same query text. Same bind parameters. The only thing that changed was the number of technicians available: as the fleet clocked in and dispatched through the day, the underlying data distribution shifted just enough to flip the query plan.

This post is the story of how we traced a 14x latency regression to a fundamental assumption baked into every cost-based query optimizer, and why our existing composite index didn't help. Along the way, we'll dig into pg_statistic, selectivity estimation, BitmapAnd mechanics, and the specific ways correlated boolean columns break the planner's world model. But first, if you've never looked at how Po
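
To make the "correlated boolean columns" problem concrete, here is a minimal Python sketch of what happens when per-column selectivities are multiplied as if the columns were independent, which is how PostgreSQL combines AND-ed conditions when it has no extended statistics. The column names (`is_clocked_in`, `is_available`), the probabilities, and the rule that only clocked-in technicians can be available are all hypothetical, chosen for illustration rather than taken from the dispatch schema described here.

```python
import random


def estimate_rows_independent(total_rows, selectivities):
    """Estimate matching rows by multiplying per-column selectivities,
    i.e. treating every predicate as statistically independent."""
    estimate = float(total_rows)
    for selectivity in selectivities:
        estimate *= selectivity
    return estimate


def build_technicians(n, seed=0):
    """Build hypothetical (is_clocked_in, is_available) rows.

    Assumption for illustration: a technician can only be available
    if they are clocked in, so the two booleans are strongly correlated.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_clocked_in = rng.random() < 0.30
        is_available = is_clocked_in and rng.random() < 0.80
        rows.append((is_clocked_in, is_available))
    return rows


rows = build_technicians(10_000)
total = len(rows)

# Per-column selectivities, as single-column statistics would record them.
sel_clocked_in = sum(1 for clocked_in, _ in rows if clocked_in) / total
sel_available = sum(1 for _, available in rows if available) / total

# Rows that really satisfy: is_clocked_in AND is_available.
actual = sum(1 for clocked_in, available in rows if clocked_in and available)

# What the independence assumption predicts for the same predicate.
estimated = estimate_rows_independent(total, [sel_clocked_in, sel_available])

print(f"actual rows:    {actual}")
print(f"estimated rows: {estimated:.0f}")
print(f"underestimate:  {actual / estimated:.1f}x")
```

Because every available technician in this toy data is also clocked in, the second predicate filters out nothing that the first one did not already cover, yet the multiplied estimate shrinks the row count by the clocked-in selectivity a second time. An underestimate of that size can be enough to make one plan look cheaper than another.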
Continue reading on Dev.to



