FlareStart

NexusQuant benchmarks: every number, honestly
How-To · Machine Learning

via Dev.to · João André Gomes Marques · 3h ago

When you build a KV cache compression system and plan to publish a paper, you face a choice: present the best-looking numbers, or present all of them. We chose all of them. This post is every benchmark result we have, including the ones that did not work.

The pipeline

Quick context. NexusQuant compresses the KV cache of transformer models at inference time, training-free:

    Prefill → Key-Key Attention Score → Evict → RoPE-remove → Hadamard → 2-bit E8 VQ → Temporal Delta → zstd

The context manager API:

    with nexusquant_evict(model, quality="balanced"):
        output = model.generate(input_ids, max_new_tokens=200)

All numbers below are from an A10G GPU (24 GB). Perplexity delta is measured against the uncompressed baseline on the same passages.

Mistral-7B: the full picture

These are our numbers at different prefix lengths and eviction rates. Every row is real.

    Prefix     Evict%   Compression   PPL Delta   Verdict
    500 tok    35%      10.1x         +0.90%      Usable for most tasks
    1664 tok   35%      10.4x         +0.14%      Near-l
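For readers who want to reproduce a PPL Delta figure like the ones above, here is a minimal sketch of the arithmetic. It assumes the delta is the relative change in perplexity versus the uncompressed baseline, expressed as a percentage, which is what the "+0.90%" notation suggests but which the excerpt does not spell out. The function names and the sample loss values are hypothetical and are not part of NexusQuant.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


def ppl_delta_percent(baseline_nlls, compressed_nlls):
    """Relative perplexity change of the compressed run vs. the baseline, in percent.

    Assumption: both lists hold per-token losses for the same passages,
    so the two perplexities are directly comparable.
    """
    base = perplexity(baseline_nlls)
    comp = perplexity(compressed_nlls)
    return 100.0 * (comp - base) / base


if __name__ == "__main__":
    # Made-up per-token losses, for illustration only (not measured results).
    baseline = [2.10, 1.95, 2.30, 2.05]
    compressed = [2.11, 1.96, 2.31, 2.06]
    print(f"PPL delta: {ppl_delta_percent(baseline, compressed):+.2f}%")
```

In practice the per-token losses would come from scoring the same continuation tokens twice, once with the full KV cache and once inside the compression context manager.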

Continue reading on Dev.to

