Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

(arstechnica.com)

16 points | by gmays 7 hours ago

2 comments

redanddead 5 hours ago

You'd think it'd be bigger news on hn