How much does Nemotron 3 Nano Omni cost?

Nemotron 3 Nano Omni is published as open weights under the NVIDIA Open Model License, so there is no per-token API price. You self-host the model and pay only infrastructure costs (GPU rental and operating overhead). On commodity 24-48 GB GPUs, the effective cost typically lands at $0.10-$0.50 per million output tokens.

What is the context window of Nemotron 3 Nano Omni?

Nemotron 3 Nano Omni supports a context window of 256K tokens (256,000 tokens) and can generate up to 16K output tokens per request.

What is Nemotron 3 Nano Omni best for?

Nemotron 3 Nano Omni is best suited to open multimodal workloads. It supports chat, vision, audio, and code generation. It is released as open source under the NVIDIA Open Model License.

Nemotron 3 Nano Omni — Pricing, Benchmarks & Specs

Nemotron 3 Nano Omni

NVIDIA | Open Source

NVIDIA's 30B open multimodal model, running vision, audio, and text in a single stack. Tops 6 specialty leaderboards. Released April 2026.

Context Window: 256K tokens

Max Output: 16K tokens

Input Price: self-host (open weights)

Output Price: self-host (infra cost only)

Speed: 158 tokens/sec

Released: Apr 2026 (2026-04-11)

Blended Cost: n/a (self-host)

Value Score: n/a (self-host)

Capabilities

Chat, Vision, Audio, Code Generation

Benchmarks

Quality Index

MMLU Pro: 81.8

HumanEval (Coding): 80.6

MATH: 75.4

Arena ELO: 1361

Open Source — Licensed under NVIDIA Open Model License

About Nemotron 3 Nano Omni

Nemotron 3 Nano Omni is an open-source AI model by NVIDIA, released on April 11, 2026. It supports a context window of 256K tokens and can generate up to 16K output tokens.
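As a rough illustration of how the two limits interact, the sketch below computes how many output tokens a request can still ask for given the size of its prompt. It assumes that prompt and output tokens share the same 256K window and that "16K" means 16,000 tokens; neither detail is confirmed on this page, and the helper name `max_new_tokens` is hypothetical.

```python
# Limits as listed on this page. The exact token counts behind "256K" and
# "16K" are assumptions (256,000 and 16,000).
CONTEXT_WINDOW = 256_000
MAX_OUTPUT = 16_000


def max_new_tokens(prompt_tokens: int) -> int:
    """Return the largest output budget that fits alongside the prompt.

    Assumes prompt and generated tokens count against the same window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never exceed the per-request output cap, never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (1_000, 250_000, 300_000):
        print(prompt, "->", max_new_tokens(prompt))
```

A short prompt leaves the full output cap available, while a prompt close to the window size reduces it and an oversized prompt leaves nothing.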

Nemotron 3 Nano Omni is published as open weights (NVIDIA Open Model License) for self-hosting — there is no per-token API price. Cost depends on your inference infrastructure: GPU rental, throughput per GPU, and operating overhead. For commodity 24-48GB GPUs the effective cost typically lands in the $0.10-$0.50 per million output tokens range, well below the cheapest hosted alternatives.
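The relationship between those three cost drivers can be sketched as simple arithmetic. The function below is a minimal estimate, not a pricing tool: the hourly GPU rate, the aggregate throughput, and the overhead fraction in the example are hypothetical placeholders, and real throughput depends heavily on batching and hardware.

```python
def cost_per_million_tokens(
    gpu_hourly_usd: float,
    tokens_per_second: float,
    overhead_fraction: float = 0.0,
) -> float:
    """Estimate self-hosting cost in USD per million output tokens.

    gpu_hourly_usd:    rental price of one GPU per hour
    tokens_per_second: aggregate output throughput of that GPU
    overhead_fraction: operating overhead as a fraction of GPU cost
    """
    if gpu_hourly_usd < 0 or overhead_fraction < 0:
        raise ValueError("costs must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    hourly_cost = gpu_hourly_usd * (1 + overhead_fraction)
    return hourly_cost / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs: $0.50/hour GPU, 1,000 tokens/sec aggregate
    # throughput with batching, 20% operating overhead.
    estimate = cost_per_million_tokens(0.50, 1000, 0.20)
    print(f"${estimate:.2f} per million output tokens")
```

Because cost scales inversely with throughput, batching many requests onto one GPU is usually the largest lever for lowering the per-token figure.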

Nemotron 3 Nano Omni is available as an open-source model under the NVIDIA Open Model License, meaning you can self-host it for predictable costs or use it through API providers like Swfte Connect.

Using Nemotron 3 Nano Omni with Swfte

Access Nemotron 3 Nano Omni through Swfte Connect, our unified LLM gateway. Connect gives you a single API for 50+ models, with automatic routing, cost optimization, and fallback handling. You can also try Nemotron 3 Nano Omni in our AI Playground before integrating.