Accelerating Gemma 4: faster inference with multi-token prediction drafters

(blog.google)

592 points | by amrrs 20 hours ago

278 comments