Google Releases DiffusionGemma: Text Diffusion Takes On Autoregressive AI
Google's new 26B-parameter open model uses text diffusion to achieve up to 4x faster inference than traditional approaches.
Google just released DiffusionGemma, an experimental open model that ditches the standard autoregressive approach to text generation in favor of text diffusion. The result? Up to 4x faster inference when running on dedicated GPUs.
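The model packs 26 billion parameters and is designed to unlock speed-critical, interactive local workflows. That's a meaningful shift: most large language models generate text autoregressively, one token at a time, with each new token waiting on the ones before it. Diffusion-based generation works differently, refining a sequence over a series of steps and potentially producing text in parallel chunks rather than strictly left to right.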
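To make the difference between one-token-at-a-time and parallel generation concrete, here is a toy Python sketch. It is not DiffusionGemma's algorithm and uses no real model: the `target` list stands in for whatever a model would predict, and the function names, the `MASK` placeholder, and the `steps` parameter are all hypothetical. It only counts how many sequential "model calls" each style needs to fill in the same sentence.

```python
import random

MASK = "_"  # hypothetical placeholder for a not-yet-generated position


def autoregressive_decode(target):
    """Toy left-to-right decoding: one sequential call per token."""
    out, calls = [], 0
    for tok in target:
        out.append(tok)  # stand-in for sampling the next token from a model
        calls += 1
    return out, calls


def diffusion_decode(target, steps, seed=0):
    """Toy parallel decoding: start fully masked and fill a batch of
    positions per call. Assumes steps >= 1."""
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    masked = list(range(len(target)))
    rng.shuffle(masked)  # positions are filled in arbitrary order
    per_step = -(-len(target) // steps)  # ceiling division
    calls = 0
    while masked:
        batch, masked = masked[:per_step], masked[per_step:]
        for i in batch:
            seq[i] = target[i]  # stand-in for the model's prediction at i
        calls += 1
    return seq, calls


sentence = "the quick brown fox jumps over the lazy dog".split()
ar_out, ar_calls = autoregressive_decode(sentence)
df_out, df_calls = diffusion_decode(sentence, steps=3)
print(f"autoregressive: {ar_calls} sequential calls")
print(f"diffusion-style: {df_calls} sequential calls")
```

Both functions end with the same sentence, but the parallel version gets there in fewer sequential calls. In a real system each call may cost more and output quality depends on the number of steps, so the call count alone does not predict actual speedup.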
DiffusionGemma is open and experimental, meaning researchers and developers can dig into it immediately. Google is positioning it as an exploration tool rather than a production-ready system.
The big picture: if diffusion-based text generation proves viable at scale, it could fundamentally change how fast AI models respond, especially for local, on-device applications where latency matters most.