Nemotron 3 Nano Omni
NVIDIA's 30B open multimodal model running vision + audio + text in a single stack. Tops 6 specialty leaderboards. April 2026.
Context window: 256K tokens
Max output: 16K tokens
Pricing: self-host (open weights, infra cost only)
Throughput: 158 tokens/sec
Released: 2026-04-11
Open Source — Licensed under NVIDIA Open Model License
About Nemotron 3 Nano Omni
Nemotron 3 Nano Omni is an open-source AI model by NVIDIA, released on April 11, 2026. It supports a context window of 256K tokens and can generate up to 16K output tokens.
Nemotron 3 Nano Omni is published as open weights (NVIDIA Open Model License) for self-hosting — there is no per-token API price. Cost depends on your inference infrastructure: GPU rental, throughput per GPU, and operating overhead. For commodity 24-48GB GPUs the effective cost typically lands in the $0.10-$0.50 per million output tokens range, well below the cheapest hosted alternatives.
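As a rough illustration of how GPU rental, throughput, and overhead combine into a per-token cost, here is a minimal Python sketch. The function name, the `utilization` and `overhead_factor` parameters, and the example inputs (an hourly rate of $0.40 and an aggregate batched throughput of 1,000 tokens/sec) are assumptions chosen for illustration, not measured figures for this model.

```python
def cost_per_million_tokens(
    gpu_hourly_usd: float,
    tokens_per_sec: float,
    utilization: float = 1.0,
    overhead_factor: float = 1.0,
) -> float:
    """Estimate self-hosting cost in USD per one million output tokens.

    gpu_hourly_usd  -- rental price of the GPU(s) per hour (assumed input)
    tokens_per_sec  -- aggregate output throughput across all concurrent requests
    utilization     -- fraction of each hour the GPU is actually serving (0 < u <= 1)
    overhead_factor -- multiplier for operating overhead (1.0 means none)
    """
    if gpu_hourly_usd < 0:
        raise ValueError("gpu_hourly_usd must be non-negative")
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if overhead_factor < 1:
        raise ValueError("overhead_factor must be at least 1.0")

    # Tokens actually produced in one billed hour.
    tokens_per_hour = tokens_per_sec * 3600 * utilization
    # Hourly spend including overhead, scaled to one million tokens.
    return gpu_hourly_usd * overhead_factor / tokens_per_hour * 1_000_000


# Hypothetical inputs: $0.40/hour GPU, 1,000 tokens/sec aggregate, 50% utilization.
example_cost = cost_per_million_tokens(0.40, 1000, utilization=0.5)
print(f"Estimated cost: ${example_cost:.2f} per million output tokens")
```

The estimate is most sensitive to throughput and utilization: a GPU that sits idle half the time doubles the effective cost, so batching and steady traffic matter as much as the rental price.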
Nemotron 3 Nano Omni is available as an open-source model under the NVIDIA Open Model License, so you can self-host it for predictable costs or use it through API providers such as Swfte Connect.
Using Nemotron 3 Nano Omni with Swfte
Access Nemotron 3 Nano Omni through Swfte Connect, our unified LLM gateway. Connect gives you a single API for 50+ models, with automatic routing, cost optimization, and fallback handling. You can also try Nemotron 3 Nano Omni in our AI Playground before integrating.