
For my grug brain, can somebody translate this into ELIgrug terms?

Does this mean I would be able to run a 500B model on my 48GB MacBook without losing quality?




It's KV cache compression, i.e. it reduces how much memory the model needs to extend its context. It does not affect the weight size.
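A rough back-of-envelope calculation shows why: for a hypothetical 500B model (all dimensions below are made up for illustration), the weights dominate memory no matter how much you compress the KV cache.

```python
# Hypothetical dimensions for a 500B-parameter transformer (not any real model)
layers = 100          # assumed layer count
kv_heads = 8          # assumed number of KV heads (with grouped-query attention)
head_dim = 128        # assumed per-head dimension
bytes_per_val = 2     # fp16/bf16

def kv_cache_bytes(seq_len):
    # 2x for keys and values, stored per layer, per KV head, per token
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_val

# Weights: 500B params at 2 bytes each is ~1 TB, independent of context length
weights_gb = 500e9 * bytes_per_val / 1e9

print(f"weights: {weights_gb:.0f} GB")
for n in (8_192, 131_072):
    print(f"KV cache at {n:>7} tokens: {kv_cache_bytes(n) / 1e9:.1f} GB")
```

So compressing the KV cache makes long contexts cheaper, but it does nothing for the ~1 TB of weights that keep a 500B model off a 48GB laptop in the first place.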

I wrote a more intuitive explanation that you might find helpful:

https://prabal.ca/posts/google-long-context-cheaper/



