
I built a managed pgvector service and here's what I learned about vector search performance
I Ran pgvector on NVMe vs Cloud SSD. The Difference Shocked Me.

2,000 queries per second at under 4ms. That's what I'm getting on a $35/month server. Let me tell you how I got there and what I had to build to make it work.

The problem started with a side project: I was building a RAG pipeline. Standard stuff: OpenAI embeddings, PostgreSQL, the pgvector extension, an HNSW index. Everything worked fine in development. Then I moved it to a managed database on a regular cloud provider and watched my query latency go from 4ms to 47ms under any real load.

I spent two days thinking my index was wrong. Wrong ef_search value. Wrong m parameter. Wrong dimension count. None of that was it. The problem was the disk.

HNSW does not behave like a normal database query

Most database queries are sequential reads. Your disk reads a chunk of data in order, hands it back, done. SSDs are fast at this. Even gp3 cloud SSDs are fast at this.

HNSW is different. An HNSW index traversal is essentially a graph walk. You
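To make that contrast concrete, here is a minimal sketch of the greedy best-first search at the heart of an HNSW layer. This is illustrative Python, not pgvector's actual implementation; the graph, vectors, and the `ef` parameter here are toy stand-ins. The point is the access pattern: every hop follows an edge to an arbitrary node id, which in a disk-resident index means a random page read, not a sequential scan.

```python
import heapq
import math

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def greedy_search(graph, vectors, entry, query, ef=4):
    """Greedy best-first walk over a proximity graph (HNSW-style, one layer).

    Each newly visited neighbor is a jump to an arbitrary node id -- on a
    disk-resident index, that is one random page read per hop.
    """
    visited = {entry}
    d0 = l2(vectors[entry], query)
    candidates = [(d0, entry)]   # min-heap: next nodes to expand
    best = [(-d0, entry)]        # max-heap via negation: ef closest so far
    hops = 0
    while candidates:
        dist, node = heapq.heappop(candidates)
        if dist > -best[0][0] and len(best) >= ef:
            break  # nearest unexpanded candidate is worse than worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            hops += 1  # each hop = one random access in a disk-resident index
            d = l2(vectors[nb], query)
            if len(best) < ef or d < -best[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(best, (-d, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst
    return sorted((-d, n) for d, n in best), hops

# Tiny illustrative graph: node ids -> neighbor ids, plus 2-D vectors.
vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0),
           3: (1.0, 1.0), 4: (2.0, 2.0), 5: (0.5, 0.5)}
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4, 5],
         4: [3], 5: [3]}
results, hops = greedy_search(graph, vectors, entry=0, query=(0.6, 0.6), ef=3)
```

Even on this toy graph, the walk touches five nodes in an order dictated by the geometry of the data, not by their layout on disk. Scale that to millions of vectors and an `ef_search` in the hundreds, and each query turns into a burst of random reads, which is exactly where cloud SSDs fall behind local NVMe.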




