Updated May 6, 2026

Gemini 3.1 Pro Cost & Pricing (May 2026)

Per-1M-token rates for every Gemini 3.1 Pro tier — standard, cached, and batch. Cost-per-task estimates for typical workloads (including a 500-page book) and a like-for-like comparison vs Claude Opus 4.7, GPT-5.5, and DeepSeek V4 Pro.

$3.50 input / 1M · $10.50 output / 1M · 2M context (largest frontier context window)

Pricing tiers — every way to buy Gemini 3.1 Pro

| Tier | Input /1M | Output /1M | Notes |
|------|-----------|------------|-------|
| Standard (sync) | $3.5000 | $10.50 | List price for the synchronous API. Same rate up to the full 2M-token context window — no long-context surcharge. |
| Cached input | $0.8750 | $10.50 | 75% off list on cached input via Gemini context caching. Storage fee applies separately ($1 per 1M tokens per hour). |
| Batch (24h SLA) | $1.7500 | $5.25 | 50% off list for asynchronous workloads. Stackable with caching. |
| Cached + Batch | $0.4375 | $5.25 | Stacked discount. The cheapest way to run Gemini 3.1 Pro for repeatable async work — useful for nightly enrichment and document pipelines. |

All prices in USD per 1M tokens. The standout fact: pricing is flat across the full 2M context window — no long-context surcharge as of May 2026.
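The stacked rate in the last row follows from applying the two discounts multiplicatively to list price (caching discounts input only; batch discounts both sides). A minimal sketch, using the rates from the table above — the helper function is illustrative, not an official SDK call:

```python
# Rates in USD per 1M tokens, from the pricing table above.
LIST_IN, LIST_OUT = 3.50, 10.50
CACHE_DISCOUNT = 0.75   # 75% off cached input (input only)
BATCH_DISCOUNT = 0.50   # 50% off for async batch (input and output)

def rate(cached: bool = False, batch: bool = False) -> tuple[float, float]:
    """Effective (input, output) rate per 1M tokens; discounts stack multiplicatively."""
    in_rate = LIST_IN * ((1 - CACHE_DISCOUNT) if cached else 1.0)
    out_rate = LIST_OUT
    if batch:
        in_rate *= 1 - BATCH_DISCOUNT
        out_rate *= 1 - BATCH_DISCOUNT
    return in_rate, out_rate

print(rate())                         # (3.5, 10.5)    standard
print(rate(cached=True))              # (0.875, 10.5)  cached
print(rate(batch=True))               # (1.75, 5.25)   batch
print(rate(cached=True, batch=True))  # (0.4375, 5.25) cached + batch
```

Note that caching never touches the output rate; only batch does, which is why Cached and Standard share the same $10.50 output column.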

Cost per task — what you actually pay

| Task | In tok | Out tok | Standard | Cached | Batch |
|------|--------|---------|----------|--------|-------|
| Short chat reply | 800 | 200 | $0.0049 | $0.0028 | $0.0024 |
| Long-doc summary (500-page book) | 800,000 | 3,000 | $2.8315 | $0.7315 | $1.4158 |
| Agentic loop (12 tool turns) | 45,000 | 6,000 | $0.2205 | $0.1024 | $0.1103 |
| RAG query (10-doc context) | 12,000 | 600 | $0.0483 | $0.0168 | $0.0242 |

Cost per single invocation. The 500-page book row shows the long-context advantage clearly — under $3 for a full book on standard tier, under $0.75 cached.
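Each row is straight per-1M arithmetic. A quick check of the book row, using figures from the tables above (the helper is illustrative):

```python
def task_cost(tokens_in: int, tokens_out: int, in_rate: float, out_rate: float) -> float:
    """Cost in USD for one invocation, at per-1M-token rates."""
    return tokens_in / 1e6 * in_rate + tokens_out / 1e6 * out_rate

# 500-page book summary: 800k input tokens, 3k output tokens.
print(round(task_cost(800_000, 3_000, 3.50, 10.50), 4))   # 2.8315  (standard)
print(round(task_cost(800_000, 3_000, 0.875, 10.50), 4))  # 0.7315  (cached input)
```

The standard figure decomposes as $2.80 of input plus only $0.0315 of output, which is why caching the input (a 75% cut on the dominant term) drops the total by nearly three quarters.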

Gemini 3.1 Pro vs nearest alternatives

| Model | In /1M | Out /1M | Context | Note |
|-------|--------|---------|---------|------|
| Gemini 3.1 Pro | $3.50 | $10.50 | 2M | This page. Text Arena leader at ~1500 Elo. GPQA Diamond 94.3%. |
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | 43% more on input, 138% more on output. Coding Arena #1 — better for engineering work. |
| GPT-5.5 | $5.00 | $30.00 | 1M | 43% more on input, 186% more on output. Stronger ecosystem (voice, vision). |
| DeepSeek V4 Pro | $1.74 | $3.48 | 1M | 50% cheaper input, 67% cheaper output. Apache 2.0 — also self-hostable. |

When Gemini 3.1 Pro is worth the price

  • Long-context workloads. 2M tokens with flat pricing is unmatched. Book-length summaries, full repository analysis, multi-document synthesis.
  • Scientific reasoning. GPQA Diamond 94.3% leads the field — useful for research, technical literature review, and graduate-level reasoning tasks.
  • Multimodal pipelines. Native handling of image, audio, video, and text in one model with consistent pricing.
  • Input-heavy workloads. When input dominates the bill, $3.50 per 1M is materially cheaper than competitors.
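The input-heavy point can be made concrete by blending each model's per-1M rates from the comparison table at a given input share. The 95% figure below is an assumed workload mix, not a benchmark:

```python
# (input rate, output rate) in USD per 1M tokens, from the comparison table above.
MODELS = {
    "Gemini 3.1 Pro":  (3.50, 10.50),
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.5":         (5.00, 30.00),
    "DeepSeek V4 Pro": (1.74, 3.48),
}

def blended(in_rate: float, out_rate: float, input_share: float = 0.95) -> float:
    """USD per 1M total tokens when input_share of the traffic is input tokens."""
    return in_rate * input_share + out_rate * (1 - input_share)

for name, (i, o) in MODELS.items():
    print(f"{name}: ${blended(i, o):.2f} per 1M tokens at 95% input")
```

At a 95% input mix, Gemini 3.1 Pro lands at $3.85 blended versus $6.00 for Claude Opus 4.7 and $6.25 for GPT-5.5; only DeepSeek V4 Pro comes in lower.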

When to switch to a cheaper alternative

  • DeepSeek V4 Pro ($1.74 / $3.48) — 50% cheaper input, 67% cheaper output. Apache 2.0 also enables self-host. Quality gap is small for most agentic and chat work.
  • Gemini 2.5 Flash or DeepSeek V4 Flash — for classification, routing, and high-throughput simple tasks where Pro-tier quality is wasted.
  • Claude Opus 4.7 — when coding and agentic quality matter more than per-token cost. Coding Arena #1.
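The three bullets above can be condensed into a toy routing rule. The thresholds and model identifiers here are illustrative assumptions, not official model IDs:

```python
def pick_model(context_tokens: int, task: str) -> str:
    """Illustrative routing rule distilled from the guidance above.

    Model names and the 1M-token threshold are assumptions for the sketch.
    """
    if context_tokens > 1_000_000:
        return "gemini-3.1-pro"    # only listed model with a 2M window
    if task == "coding":
        return "claude-opus-4.7"   # Coding Arena #1 per the comparison table
    if task in ("classification", "routing"):
        return "gemini-2.5-flash"  # Pro-tier quality is wasted on simple tasks
    return "deepseek-v4-pro"       # cheapest per token for general work
```

A real router would also weigh latency, rate limits, and self-hosting constraints, but the cost logic reduces to roughly this.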

Related

Teams running Gemini 3.1 Pro alongside other providers typically front the API with Swfte Connect to route across these models behind one OpenAI-compatible surface with prompt caching and per-route fallback.

Sources: official Google AI / Vertex AI pricing pages, retrieved May 6, 2026.