"TurboQuant proved it can quantize the key-value cache to just 3 bits without re... | Hacker News


macleginn 14 days ago | on: TurboQuant: Redefining AI efficiency with extreme ...

"TurboQuant proved it can quantize the key-value cache to just 3 bits without requiring training or fine-tuning and causing any compromise in model accuracy" -- what does each group of 3 bits correspond to? Hardly individual keys or values, since that would limit each of them to 8 different vectors.

carlosvega 14 days ago

It's the number of bits per coordinate. So, in 2D, 1 bit gives a 2x2 grid and 3 bits give a 64-cell grid (2^3 x 2^3). Here is a demo:

https://mesuvash.github.io/blog/2026/turboquant-interactive/
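
To make the "bits per coordinate" arithmetic concrete, here is a minimal sketch in Python. It uses plain uniform scalar quantization over an assumed range of [-1, 1], which is not TurboQuant's actual scheme; the function names and the range are made up for illustration only.

```python
def grid_cells(bits: int, dims: int) -> int:
    """Number of distinct cells when every coordinate gets `bits` bits."""
    return (2 ** bits) ** dims


def quantize(vec, bits, lo=-1.0, hi=1.0):
    """Map each coordinate to an integer code in [0, 2**bits - 1].

    Assumes values lie in [lo, hi]; anything outside is clamped.
    """
    levels = 2 ** bits
    step = (hi - lo) / levels
    codes = []
    for x in vec:
        idx = int((x - lo) / step)
        codes.append(min(max(idx, 0), levels - 1))
    return codes


def dequantize(codes, bits, lo=-1.0, hi=1.0):
    """Reconstruct each coordinate as the midpoint of its cell."""
    step = (hi - lo) / (2 ** bits)
    return [lo + (c + 0.5) * step for c in codes]


if __name__ == "__main__":
    print(grid_cells(1, 2))  # 2x2 grid in 2D
    print(grid_cells(3, 2))  # 8x8 grid in 2D
    v = [-1.0, 0.0, 0.99]
    codes = quantize(v, bits=3)
    print(codes, dequantize(codes, bits=3))
```

With 3 bits each coordinate picks one of 8 levels independently, so a d-dimensional vector has 8^d possible codes rather than 8. The reconstructed values differ from the inputs by up to half a cell width, so a scheme like this one is lossy by construction.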

jbellis 14 days ago

The explanation is terrible, but it's clear that it's not actually lossless.
