
Racing Against 2 Billion: How We Survived a PostgreSQL Transaction ID Wraparound on a 2TB Production Table
A detailed account of diagnosing and recovering from a near-fatal transaction ID wraparound on a high-traffic production PostgreSQL database, and the unexpected twist that turned a 60-second operation into 50 minutes of downtime.

TL;DR: Our production PostgreSQL hit 1.6B transaction IDs (limit is ~2.1B) on a 2TB table. Normal vacuum failed for weeks due to excessive bloat. Vacuum freeze ran for 6 days and got stuck. We dropped 800GB of unused indexes and ran pg_repack as a last resort. pg_repack worked perfectly until the final step, where PostgreSQL's own anti-wraparound autovacuum blocked the ACCESS EXCLUSIVE lock needed to complete the swap, causing 50 minutes of downtime instead of 60 seconds. We killed autovacuum, pg_repack finished, XIDs dropped from 1.6B to 440M, and vacuum that used to take days now runs in 2.5 hours. Full timeline and lessons below.

The Alert Nobody Worries About (Until They Should)

On a quiet Tuesday morning, a monitoring alert fired: MaximumUsedTransactio
Continue reading on Dev.to DevOps


