Searchable JSON compression: page-level random access + ms lookups (and smaller than Zstd on our dataset)

via Dev.tokodomonocch11mo ago

Most JSON compression stories end at “make it smaller.” But in real systems, the bigger cost is often decompress + parse + scan, paid repeatedly.

I built SEE (Semantic Entropy Encoding): a searchable compression format for JSON/NDJSON that keeps data queryable while compressed, with page-level random access. On our dataset, SEE is smaller than Zstd and supports fast lookups (details + proof below).

Why this matters: the hidden “decompress+parse tax”

If you store NDJSON as zstd, most queries still pay:

- read large chunks
- decompress everything
- parse JSON
- scan for the field/value you need

Even if the data is small, the CPU + I/O pattern is brutal at scale.

SEE targets workloads where you repeatedly need:

- exists / pos / eq-style queries
- random access
- low latency without full decompression

What SEE is (in 60 seconds)

SEE is a page-based, schema-aware format:

- page-level layout for random access
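To make the contrast between the full-scan path and page-level access concrete, here is a minimal Python sketch. It is not SEE's actual format or API: `build_pages`, `lookup_full_scan`, `lookup_paged`, and the `id` field are hypothetical names for illustration, `zlib` from the standard library stands in for Zstd, and records are assumed to be sorted by `id` so that each page can carry a simple min/max range.

```python
import json
import zlib


def to_ndjson(records):
    """Serialize records as NDJSON bytes (one JSON object per line)."""
    return "\n".join(json.dumps(r) for r in records).encode("utf-8")


def build_pages(records, page_size=2):
    """Compress records page by page, keeping an uncompressed id range per page.

    Assumption: records are sorted by the hypothetical "id" field.
    """
    pages = []
    for start in range(0, len(records), page_size):
        chunk = records[start:start + page_size]
        pages.append({
            "min_id": chunk[0]["id"],
            "max_id": chunk[-1]["id"],
            "blob": zlib.compress(to_ndjson(chunk)),
        })
    return pages


def lookup_full_scan(blob, target_id):
    """Baseline: decompress everything, parse every line, scan for the id."""
    for line in zlib.decompress(blob).decode("utf-8").splitlines():
        rec = json.loads(line)
        if rec["id"] == target_id:
            return rec
    return None


def lookup_paged(pages, target_id):
    """Decompress only pages whose id range can contain the target.

    Returns (record or None, number of pages decompressed).
    """
    touched = 0
    for page in pages:
        if page["min_id"] <= target_id <= page["max_id"]:
            touched += 1
            for line in zlib.decompress(page["blob"]).decode("utf-8").splitlines():
                rec = json.loads(line)
                if rec["id"] == target_id:
                    return rec, touched
    return None, touched


records = [{"id": i, "name": f"user{i}"} for i in range(10)]
whole_blob = zlib.compress(to_ndjson(records))
pages = build_pages(records, page_size=2)
```

With this toy data, `lookup_full_scan(whole_blob, 7)` has to decompress and parse the whole file before it reaches the record, while `lookup_paged(pages, 7)` decompresses a single page. The trade-off in this sketch is that small independent pages usually compress worse than one large stream, which is the gap a schema-aware format would need to close.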

