Nvidia Drops Nemotron 3 Nano Omni, a Multimodal Beast

Nvidia's latest open AI model handles text, vision, and speech in one package with a hybrid architecture.

Nvidia just unleashed Nemotron 3 Nano Omni, an open multimodal AI model that rolls text, vision, and speech into a single unified system. It's built on a 30B-A3B hybrid Mixture of Experts architecture, meaning it packs 30 billion parameters in total but activates only about 3 billion of them for any given token at inference time. The result is the capacity of a larger model at roughly the per-token compute cost of a much smaller one.
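For readers curious how a model can hold 30 billion parameters yet use only about 3 billion per token, here is a minimal toy sketch of top-k expert routing, the general idea behind Mixture of Experts layers. It is not Nvidia's implementation: the four toy experts, the router scores, and the choice of k = 2 are all hypothetical, and the only figures taken from the article are the 30B total and 3B active counts.

```python
import math

# Figures from the model name: 30B total parameters, about 3B active per token.
TOTAL_PARAMS = 30e9
ACTIVE_PARAMS = 3e9


def active_fraction(total, active):
    """Share of the model's parameters that take part in one token's forward pass."""
    return active / total


def top_k_route(router_scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_scores[i] for i in chosen)
    exps = [math.exp(router_scores[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]


def moe_layer(x, experts, router_scores, k):
    """Run only the chosen experts and blend their outputs by router weight."""
    chosen, weights = top_k_route(router_scores, k)
    output = sum(w * experts[i](x) for i, w in zip(chosen, weights))
    return output, chosen


# Hypothetical toy setup: four "experts" that just scale their input.
experts = [lambda x, s=i + 1: x * s for i in range(4)]
scores = [0.1, 2.0, -1.0, 1.5]  # made-up router scores for one token

output, used = moe_layer(1.0, experts, scores, k=2)
print(f"active share of parameters: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.0%}")
print(f"experts run for this token: {used} out of {len(experts)}")
```

In this toy run, two of the four experts are skipped entirely for the token, which is where the savings come from: the unused experts still occupy memory, but they cost no compute.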

The model is designed for reasoning tasks across multiple modalities, which puts it squarely in the race to build AI that can see, read, and listen simultaneously.

The broader Nemotron 3 family has clearly found an audience. Nvidia says the lineup has racked up more than 50 million downloads over the past year. That's serious traction for an open model ecosystem.

By going open, Nvidia keeps feeding the developer community while reinforcing its grip on the AI infrastructure stack. The GPU giant isn't just selling picks and shovels anymore; it's handing out blueprints too.