Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

"TurboQuant proved it can quantize the key-value cache to just 3 bits without requiring training or fine-tuning and causing any compromise in model accuracy" -- what do each 3 bits correspond to? Hardly individual keys or values, since it would limit each of them to 8 different vectors.


Is the number of bits per coordinate. So, 1 bit is 2x2 grid. 3 bit is a 64 cell grid (2^3 x 2^3). Here you have a demo.

https://mesuvash.github.io/blog/2026/turboquant-interactive/


The explanation is terrible, but it's clear that it's not actually lossless.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: