
What I Learned Testing 12 Compression Approaches That Failed
The most useful research I've done this year isn't in the NexusQuant paper. It's the experiments that failed, the ideas that sounded smart in theory and didn't survive contact with real KV cache data. Negative results build trust. They also save time: if you're working on KV cache compression, this list might save you weeks of effort. Each entry covers what we tried, what we expected, what happened, and what we learned.

1. PCA Rotation (3x Worse Distortion)

The idea: Apply PCA to KV vectors to align the quantization axes with the data's principal components. This is optimal for Gaussian data — principal components diagonalize the covariance matrix, and uniform quantization along them minimizes MSE.

What happened: 3x worse distortion than Hadamard rotation. PPL degradation jumped ~0.9 percentage points at the same compression ratio.

Why: PCA is computed per-layer from calibration data. KV distributions shift by layer depth, head index,
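The rotate-then-quantize setup described above can be sketched as follows. This is a minimal illustration with synthetic data, not the NexusQuant implementation: the function names, the per-tensor symmetric quantizer, and the Sylvester Hadamard construction are my own choices for the example.

```python
import numpy as np

def pca_rotation(calib):
    # Rotation from calibration data: eigenvectors of the covariance matrix.
    X = calib - calib.mean(axis=0)
    cov = X.T @ X / len(X)
    _, Q = np.linalg.eigh(cov)
    return Q  # columns are principal directions (orthonormal)

def hadamard(n):
    # Normalized Hadamard matrix via Sylvester construction; n must be a power of two.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def quantize(x, bits=4):
    # Per-tensor symmetric uniform quantization (a simple stand-in quantizer).
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

def rotated_quant_mse(x, R, bits=4):
    # Rotate, quantize, rotate back; report reconstruction MSE.
    xq = quantize(x @ R, bits)
    return float(np.mean((xq @ R.T - x) ** 2))

# Synthetic stand-in for KV vectors; real caches behave differently,
# which is exactly why the PCA variant lost in practice.
rng = np.random.default_rng(0)
kv = rng.normal(size=(1024, 64)) @ np.diag(np.linspace(0.1, 2.0, 64))
mse_pca = rotated_quant_mse(kv, pca_rotation(kv))
mse_had = rotated_quant_mse(kv, hadamard(64))
```

On a Gaussian toy distribution like this one, the PCA rotation can look fine; the failure mode in the article only shows up when the calibration distribution stops matching the live KV distribution.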



