Gemini 3.1 Pro Cost & Pricing (May 2026)
Per-1M-token rates for every Gemini 3.1 Pro tier — standard, cached, and batch. Cost-per-task estimates for typical workloads (including a 500-page book) and a like-for-like comparison vs Claude Opus 4.7, GPT-5.5, and DeepSeek V4 Pro.
Pricing tiers — every way to buy Gemini 3.1 Pro
| Tier | Input /1M | Output /1M | Notes |
|---|---|---|---|
| Standard (sync) | $3.50 | $10.50 | List price for the synchronous API. Same rate up to the full 2M-token context window — no long-context surcharge. |
| Cached input | $0.875 | $10.50 | 75% off list on cached input via Gemini context caching. Storage fee applies separately ($1 per 1M tokens per hour). |
| Batch (24h SLA) | $1.75 | $5.25 | 50% off list for asynchronous workloads. Stackable with caching. |
| Cached + Batch | $0.4375 | $5.25 | Stacked discount. The cheapest way to run Gemini 3.1 Pro for repeatable async work — useful for nightly enrichment and document pipelines. |
All prices in USD per 1M tokens. The standout fact: pricing is flat across the full 2M context window — no long-context surcharge as of May 2026.
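As a quick sanity check, every rate in the table follows from two discount rules applied to the list prices: caching takes 75% off input, batch takes 50% off both sides, and the two stack multiplicatively on input. A minimal sketch:

```python
# List prices from the table above, USD per 1M tokens.
LIST_INPUT = 3.50
LIST_OUTPUT = 10.50

cached_input = LIST_INPUT * (1 - 0.75)                      # $0.875
batch_input = LIST_INPUT * (1 - 0.50)                       # $1.75
batch_output = LIST_OUTPUT * (1 - 0.50)                     # $5.25
cached_batch_input = LIST_INPUT * (1 - 0.75) * (1 - 0.50)   # $0.4375

print(cached_input, batch_input, batch_output, cached_batch_input)
```

Note that the output rate is unaffected by caching, which is why the Cached tier still pays $10.50 per 1M output tokens.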
Cost per task — what you actually pay
| Task | In tok | Out tok | Standard | Cached | Batch |
|---|---|---|---|---|---|
| Short chat reply | 800 | 200 | $0.0049 | $0.0028 | $0.0024 |
| Long-doc summary (500-page book) | 800,000 | 3,000 | $2.8315 | $0.7315 | $1.4158 |
| Agentic loop (12 tool turns) | 45,000 | 6,000 | $0.2205 | $0.1024 | $0.1103 |
| RAG query (10-doc context) | 12,000 | 600 | $0.0483 | $0.0168 | $0.0242 |
Cost per single invocation. The 500-page book row shows the long-context advantage clearly — under $3 for a full book on standard tier, under $0.75 cached.
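The per-task figures above come from a one-line formula — tokens divided by one million, times the tier rate. A minimal estimator, defaulting to the standard-tier rates on this page:

```python
# Per-call cost estimator; defaults are the standard-tier rates above.
def cost_usd(in_tokens: int, out_tokens: int,
             in_rate: float = 3.50, out_rate: float = 10.50) -> float:
    """Cost of one invocation at per-1M-token rates."""
    return in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate

# 500-page book summary, matching the table row above:
book = cost_usd(800_000, 3_000)                        # ≈ $2.83 standard
book_cached = cost_usd(800_000, 3_000, in_rate=0.875)  # ≈ $0.73 cached input
```

Swap in the cached or batch rates from the tiers table to reproduce the other columns.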
Gemini 3.1 Pro vs nearest alternatives
| Model | In /1M | Out /1M | Context | Note |
|---|---|---|---|---|
| Gemini 3.1 Pro | $3.50 | $10.50 | 2M | This page. Text Arena leader at ~1500 Elo. GPQA Diamond 94.3%. |
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | 43% more on input, 138% more on output. Coding Arena #1 — better for engineering work. |
| GPT-5.5 | $5.00 | $30.00 | 1M | 43% more on input, 186% more on output. Stronger ecosystem (voice, vision). |
| DeepSeek V4 Pro | $1.74 | $3.48 | 1M | 50% cheaper input, 67% cheaper output. Apache 2.0 — also self-hostable. |
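To see what the rate differences mean for a real workload, the same formula can be run across all four models with the rates from this table (the dictionary keys below are informal labels for this sketch, not official API model IDs):

```python
# (input, output) USD per 1M tokens, from the comparison table above.
RATES = {
    "gemini-3.1-pro": (3.50, 10.50),
    "claude-opus-4.7": (5.00, 25.00),
    "gpt-5.5": (5.00, 30.00),
    "deepseek-v4-pro": (1.74, 3.48),
}

def task_cost(model: str, in_tok: int, out_tok: int) -> float:
    in_rate, out_rate = RATES[model]
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Same RAG query (12k in / 600 out) on each model:
for model in RATES:
    print(model, round(task_cost(model, 12_000, 600), 4))
```

On this query, Gemini 3.1 Pro lands between DeepSeek V4 Pro and the two pricier models, consistent with the percentages in the Note column.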
When Gemini 3.1 Pro is worth the price
- Long-context workloads. 2M tokens with flat pricing is unmatched. Book-length summaries, full repository analysis, multi-document synthesis.
- Scientific reasoning. GPQA Diamond 94.3% leads the field — useful for research, technical literature review, and graduate-level reasoning tasks.
- Multimodal pipelines. Native handling of image, audio, video, and text in one model with consistent pricing.
- Input-heavy workloads. When input dominates the bill, $3.50 per 1M is materially cheaper than competitors.
When to switch to a cheaper alternative
- DeepSeek V4 Pro ($1.74 / $3.48) — 50% cheaper input, 67% cheaper output. The Apache 2.0 license also enables self-hosting. Quality gap is small for most agentic and chat work.
- Gemini 2.5 Flash or DeepSeek V4 Flash — for classification, routing, and high-throughput simple tasks where Pro-tier quality is wasted.
- Claude Opus 4.7 — when coding and agentic quality matter more than per-token cost. Coding Arena #1.
Related
- AI Model Leaderboard — quality vs price across all providers
- Per Million Tokens True Cost — hidden adders pushing bills 1.5-3x above list
- Token Cost Calculator — interactive estimator
- Gemini 3.1 Pro deep-dive — full benchmarks and architecture
Teams running Gemini 3.1 Pro alongside other providers typically put Swfte Connect in front of the API, routing across these models behind a single OpenAI-compatible surface with prompt caching and per-route fallback.
Sources: official Google AI / Vertex AI pricing pages, retrieved 2026-05-06.