Scaling pgvector: Memory, Quantization, and Index Build Strategies
How-To · Systems


By Philip McClarence, via Dev.to

pgvector handles small-scale vector search effortlessly. A few hundred thousand embeddings with an HNSW index, and similarity queries return in milliseconds. But once you push past a million vectors, three problems converge, and if you haven't planned for them, they hit at the same time.

## Three Walls at Scale

### Wall 1: HNSW Index Builds Need Massive Memory

Building an HNSW index requires holding the entire graph in memory during construction. If `maintenance_work_mem` is too low (the default is 64 MB), PostgreSQL falls back to a disk-based build that runs 10-50x slower. For 5 million vectors at 1536 dimensions, you may need 8-16 GB of working memory. Most teams discover this during the build, after waiting hours with no progress indication.

### Wall 2: Full-Precision Vectors Eat Storage

Each dimension in a vector column is stored as a 4-byte float32, so a 1536-dimension embedding takes roughly 6 KB per row. At 10 million rows, that's about 60 GB of raw vector data before any index overhead.
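A minimal sketch of raising the build memory before creating the index. The table name `items`, column `embedding`, and the 12 GB figure are illustrative assumptions, not values from the article; size the setting to your own vector count and available RAM.

```sql
-- Assumed table/column names; 12GB is a guess sized for ~5M x 1536-dim vectors.
-- Set per-session so the cluster-wide default is untouched.
SET maintenance_work_mem = '12GB';

-- pgvector 0.6.0+ can parallelize HNSW builds via maintenance workers.
SET max_parallel_maintenance_workers = 4;

CREATE INDEX CONCURRENTLY items_embedding_hnsw
    ON items USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
```

Setting these in the session (rather than `postgresql.conf`) keeps the large allocation scoped to the one build.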
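The storage arithmetic above can be checked with a quick back-of-envelope query; the 8-byte per-vector header is pgvector's on-disk overhead (varlena header plus dimension count), and the row count is the article's 10 million example.

```sql
-- Raw vector bytes only: 4 bytes/dim * 1536 dims + 8-byte header, times 10M rows.
-- Excludes other columns, tuple headers, TOAST, and index size.
SELECT pg_size_pretty(
    10000000::bigint * (4 * 1536 + 8)
) AS raw_vector_storage;
```

This lands in the same ballpark as the article's ~60 GB estimate, which is why quantization (halfvec or binary) becomes attractive at this scale.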

Continue reading on Dev.to


