Gemini 3.1 Pro Cost & Pricing (May 2026)
Per-1M-token rates for every Gemini 3.1 Pro tier — standard, cached, and batch. Cost-per-task estimates for typical workloads (including a 500-page book) and a like-for-like comparison vs Claude Opus 4.7, GPT-5.5, and DeepSeek V4 Pro.
Pricing tiers — every way to buy Gemini 3.1 Pro
| Tier | Input /1M | Output /1M | Notes |
|---|---|---|---|
| Standard (sync) | $3.50 | $10.50 | List price for the synchronous API. Same rate up to the full 2M-token context window — no long-context surcharge. |
| Cached input | $0.875 | $10.50 | 75% off list on cached input via Gemini context caching. Storage fee applies separately ($1 per 1M tokens per hour). |
| Batch (24h SLA) | $1.75 | $5.25 | 50% off list for asynchronous workloads. Stackable with caching. |
| Cached + Batch | $0.4375 | $5.25 | Stacked discount. The cheapest way to run Gemini 3.1 Pro for repeatable async work — useful for nightly enrichment and document pipelines. |
All prices in USD per 1M tokens. The standout fact: pricing is flat across the full 2M context window — no long-context surcharge as of May 2026.
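As a quick sanity check, every rate in the table follows from two discount rules applied to the list prices: caching takes 75% off input, batch takes 50% off both sides, and the two stack multiplicatively on input. A minimal sketch:

```python
# List prices from the table above, USD per 1M tokens.
LIST_INPUT = 3.50
LIST_OUTPUT = 10.50

cached_input = LIST_INPUT * (1 - 0.75)                      # $0.875
batch_input = LIST_INPUT * (1 - 0.50)                       # $1.75
batch_output = LIST_OUTPUT * (1 - 0.50)                     # $5.25
cached_batch_input = LIST_INPUT * (1 - 0.75) * (1 - 0.50)   # $0.4375

print(cached_input, batch_input, batch_output, cached_batch_input)
```

Note that the output rate is unaffected by caching, which is why the Cached tier still pays $10.50 per 1M output tokens.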
Cost per task — what you actually pay
| Task | In tok | Out tok | Standard | Cached | Batch |
|---|---|---|---|---|---|
| Short chat reply | 800 | 200 | $0.0049 | $0.0028 | $0.0024 |
| Long-doc summary (500-page book) | 800,000 | 3,000 | $2.8315 | $0.7315 | $1.4158 |
| Agentic loop (12 tool turns) | 45,000 | 6,000 | $0.2205 | $0.1024 | $0.1103 |
| RAG query (10-doc context) | 12,000 | 600 | $0.0483 | $0.0168 | $0.0242 |
Cost per single invocation. The 500-page book row shows the long-context advantage clearly — under $3 for a full book on standard tier, under $0.75 cached.
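The per-task figures above come from a one-line formula — tokens divided by one million, times the tier rate. A minimal estimator, defaulting to the standard-tier rates on this page:

```python
# Per-call cost estimator; defaults are the standard-tier rates above.
def cost_usd(in_tokens: int, out_tokens: int,
             in_rate: float = 3.50, out_rate: float = 10.50) -> float:
    """Cost of one invocation at per-1M-token rates."""
    return in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate

# 500-page book summary, matching the table row above:
book = cost_usd(800_000, 3_000)                        # ≈ $2.83 standard
book_cached = cost_usd(800_000, 3_000, in_rate=0.875)  # ≈ $0.73 cached input
```

Swap in the cached or batch rates from the tiers table to reproduce the other columns.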
Gemini 3.1 Pro vs nearest alternatives
| Model | In /1M | Out /1M | Context | Note |
|---|---|---|---|---|
| Gemini 3.1 Pro | $3.50 | $10.50 | 2M | This page. Text Arena leader at ~1500 Elo. GPQA Diamond 94.3%. |
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | 43% more on input, 138% more on output. Coding Arena #1 — better for engineering work. |
| GPT-5.5 | $5.00 | $30.00 | 1M | 43% more on input, 186% more on output. Stronger ecosystem (voice, vision). |
| DeepSeek V4 Pro | $1.74 | $3.48 | 1M | 50% cheaper input, 67% cheaper output. Apache 2.0 — also self-hostable. |
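To see what the rate differences mean for a real workload, the same formula can be run across all four models with the rates from this table (the dictionary keys below are informal labels for this sketch, not official API model IDs):

```python
# (input, output) USD per 1M tokens, from the comparison table above.
RATES = {
    "gemini-3.1-pro": (3.50, 10.50),
    "claude-opus-4.7": (5.00, 25.00),
    "gpt-5.5": (5.00, 30.00),
    "deepseek-v4-pro": (1.74, 3.48),
}

def task_cost(model: str, in_tok: int, out_tok: int) -> float:
    in_rate, out_rate = RATES[model]
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Same RAG query (12k in / 600 out) on each model:
for model in RATES:
    print(model, round(task_cost(model, 12_000, 600), 4))
```

On this query, Gemini 3.1 Pro lands between DeepSeek V4 Pro and the two pricier models, consistent with the percentages in the Note column.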
When Gemini 3.1 Pro is worth the price
- Long-context workloads. 2M tokens with flat pricing is unmatched. Book-length summaries, full repository analysis, multi-document synthesis.
- Scientific reasoning. GPQA Diamond 94.3% leads the field — useful for research, technical literature review, and graduate-level reasoning tasks.
- Multimodal pipelines. Native handling of image, audio, video, and text in one model with consistent pricing.
- Input-heavy workloads. When input dominates the bill, $3.50 per 1M is materially cheaper than competitors.
When to switch to a cheaper alternative
- DeepSeek V4 Pro ($1.74 / $3.48) — 50% cheaper input, 67% cheaper output. The Apache 2.0 license also enables self-hosting. Quality gap is small for most agentic and chat work.
- Gemini 2.5 Flash or DeepSeek V4 Flash — for classification, routing, and high-throughput simple tasks where Pro-tier quality is wasted.
- Claude Opus 4.7 — when coding and agentic quality matter more than per-token cost. Coding Arena #1.
Related
- AI Model Leaderboard — quality vs price across all providers
- Per Million Tokens True Cost — hidden adders pushing bills 1.5-3x above list
- Token Cost Calculator — interactive estimator
- Gemini 3.1 Pro deep-dive — full benchmarks and architecture
Teams running Gemini 3.1 Pro alongside other providers typically put Swfte Connect in front of the API, routing across these models behind a single OpenAI-compatible surface with prompt caching and per-route fallback.
Sources: official Google AI / Vertex AI pricing pages, retrieved 2026-05-06.