Question 1

How much does GPT-5.5 cost per 1M tokens in May 2026?

Accepted Answer

List price is $5 input and $30 output per 1M tokens. Batch tier is 50% off at $2.50/$15. Priority tier carries a +50% surcharge ($7.50/$45) for reserved low-latency capacity. The high-compute GPT-5.5 Pro variant is $30/$180 — 6x list.
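The tier prices above all follow from the list price and one multiplier each. A minimal sketch of that arithmetic, assuming only the figures quoted in this answer (the names `TIER_MULTIPLIER` and `tier_rates` are illustrative, not part of any OpenAI SDK):

```python
# List prices in USD per 1M tokens, as quoted in the answer above.
LIST_INPUT = 5.00
LIST_OUTPUT = 30.00

# Multipliers relative to list price, as quoted in the answer above.
TIER_MULTIPLIER = {
    "standard": 1.0,
    "batch": 0.5,     # 50% off list
    "priority": 1.5,  # +50% surcharge
    "pro": 6.0,       # GPT-5.5 Pro variant, 6x list
}


def tier_rates(tier: str) -> tuple[float, float]:
    """Return (input, output) USD per 1M tokens for the given tier."""
    multiplier = TIER_MULTIPLIER[tier]
    return (LIST_INPUT * multiplier, LIST_OUTPUT * multiplier)


for tier in TIER_MULTIPLIER:
    input_rate, output_rate = tier_rates(tier)
    print(f"{tier:<9} ${input_rate:.2f} in / ${output_rate:.2f} out per 1M tokens")
```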

Question 2

What is the GPT-5.5 priority tier and is it worth +50%?

Accepted Answer

Priority tier reserves capacity for latency-sensitive workloads such as voice, real-time agents, and interactive UX, where time to first token (TTFT) and tail latency matter. The +50% surcharge ($7.50 input / $45 output per 1M tokens) is hard to justify for asynchronous or batch-tolerant work, but it pays back quickly when a 2-second latency win drives user retention.

Question 3

Should I use GPT-5.5 or GPT-5.5 Pro?

Accepted Answer

Default to GPT-5.5. The Pro variant is 6x the price ($30/$180 vs $5/$30 per 1M tokens) for a mid-teens AAII uplift, mostly visible on the hardest pure-reasoning work. For coding, agents, RAG, and chat, base GPT-5.5 is the right choice. Reserve Pro for problems where a wrong answer costs more than $50 of compute.

Question 4

How does GPT-5.5 compare to Claude Opus 4.7 on cost?

Accepted Answer

Both models charge $5 per 1M input tokens, but GPT-5.5 charges $30 per 1M output tokens versus $25 for Opus 4.7. That is 20% more on output (equivalently, Opus is about 17% cheaper). Opus 4.7 leads the Coding Arena; GPT-5.5 has the stronger ecosystem (voice, vision pipelines, tool integrations). For raw price, Opus is slightly cheaper; for breadth of platform, GPT-5.5 wins.
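Because the input rates match, the real-world gap depends on how output-heavy a request is. The sketch below is illustrative only: it uses the per-1M rates quoted in this answer, and the helper names `cost` and `gpt_premium` are made up for the example.

```python
# (input, output) USD per 1M tokens, as quoted in the answer above.
GPT_55 = (5.00, 30.00)
OPUS_47 = (5.00, 25.00)


def cost(input_tokens: int, output_tokens: int, rates: tuple[float, float]) -> float:
    """USD cost of one request at the given (input, output) rates."""
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def gpt_premium(input_tokens: int, output_tokens: int) -> float:
    """Fractional extra cost of GPT-5.5 over Opus 4.7 for the same request."""
    return cost(input_tokens, output_tokens, GPT_55) / cost(input_tokens, output_tokens, OPUS_47) - 1


# An input-heavy RAG-style request (12,000 in / 600 out) vs. a pure-output request.
print(f"{gpt_premium(12_000, 600):.1%}")
print(f"{gpt_premium(0, 1_000):.1%}")
```

For the input-heavy request the premium works out to about 4%, and it approaches the full 20% only as the request becomes almost entirely output.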

Question 5

Does GPT-5.5 support prompt caching?

Accepted Answer

Yes. OpenAI offers automatic prompt caching for repeated prefixes longer than 1,024 tokens. Cache hits are billed at 50% of the input list price, a much smaller discount than Anthropic's 90% off. Because the discount applies only to cached input tokens and not to output, real-world bill reduction tends to be 15-30% rather than 40-60%.
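A rough way to see why the savings land in that range: only the cached share of the input is discounted, and output is billed in full. The sketch below assumes the rates quoted on this page and an illustrative 60% cached share; the function names are hypothetical, and it ignores the minimum prefix length and any cache-write charges a provider may apply.

```python
def request_cost(input_tokens, output_tokens, cached_fraction=0.0,
                 in_rate=5.00, out_rate=30.00, cache_discount=0.50):
    """USD cost of one request when a share of the input is a cache hit."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cached tokens are billed at (1 - discount) of the input rate.
    input_cost = fresh * in_rate + cached * in_rate * (1 - cache_discount)
    return (input_cost + output_tokens * out_rate) / 1_000_000


def bill_reduction(input_tokens, output_tokens, cached_fraction, cache_discount=0.50):
    """Fraction of the uncached bill saved by caching."""
    base = request_cost(input_tokens, output_tokens)
    with_cache = request_cost(input_tokens, output_tokens, cached_fraction,
                              cache_discount=cache_discount)
    return 1 - with_cache / base


# RAG-style request (12,000 in / 600 out) with an assumed 60% of input cached.
print(f"{bill_reduction(12_000, 600, 0.6):.1%}")                       # 50% discount
print(f"{bill_reduction(12_000, 600, 0.6, cache_discount=0.90):.1%}")  # 90% discount
```

Under these assumptions the 50% discount trims the bill by roughly 23%, while a 90% discount on the same traffic would trim it by roughly 42%.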

Tier	Input ($/1M tokens)	Output ($/1M tokens)	Notes
Standard (sync)	$5.00	$30.00	List price for the synchronous Responses / Chat Completions API.
Batch (24h SLA)	$2.50	$15.00	50% off list for asynchronous workloads via the OpenAI Batch API. Output cap at 32K tokens still applies.
Priority (low-latency)	$7.50	$45.00	+50% surcharge on list for latency-sensitive workloads with reserved capacity. Useful for voice and real-time UX where TTFT matters.
GPT-5.5 Pro (variant)	$30.00	$180.00	High-compute thinking variant. 6x the price for mid-teens AAII uplift on hardest reasoning. Most teams should ignore it for general use.

Task	Input tokens	Output tokens	Standard ($)	Batch ($)	Priority ($)
Short chat reply	800	200	$0.0100	$0.0050	$0.0150
Long-doc summary (50-page PDF)	80,000	1,500	$0.4450	$0.2225	$0.6675
Agentic loop (12 tool turns)	45,000	6,000	$0.4050	$0.2025	$0.6075
RAG query (10-doc context)	12,000	600	$0.0780	$0.0390	$0.1170
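The per-task figures in the table above can be reproduced from the tier rates with one formula: tokens times rate, divided by one million. A minimal sketch, assuming the rates and token counts listed on this page (`RATES`, `TASKS`, and `task_cost` are illustrative names):

```python
# (input, output) USD per 1M tokens for each tier, from the pricing table.
RATES = {
    "standard": (5.00, 30.00),
    "batch": (2.50, 15.00),
    "priority": (7.50, 45.00),
}

# (input tokens, output tokens) per task, from the cost-per-task table.
TASKS = {
    "Short chat reply": (800, 200),
    "Long-doc summary (50-page PDF)": (80_000, 1_500),
    "Agentic loop (12 tool turns)": (45_000, 6_000),
    "RAG query (10-doc context)": (12_000, 600),
}


def task_cost(input_tokens: int, output_tokens: int, tier: str = "standard") -> float:
    """USD cost of a single task on the given tier."""
    input_rate, output_rate = RATES[tier]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


for name, (tokens_in, tokens_out) in TASKS.items():
    row = "  ".join(f"{tier}=${task_cost(tokens_in, tokens_out, tier):.4f}" for tier in RATES)
    print(f"{name:<32} {row}")
```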

Model	Input ($/1M tokens)	Output ($/1M tokens)	Context (tokens)	Note
GPT-5.5	$5.00	$30.00	1M	This page. AAII 59, Arena 1481. Strong all-rounder.
Claude Opus 4.7	$5.00	$25.00	1M	Same input, 17% cheaper output. Coding Arena #1 — better default for engineering work.
Gemini 3.1 Pro	$3.50	$10.50	2M	30% cheaper input, 65% cheaper output. Better for long-context and science.
DeepSeek V4 Pro	$1.74	$3.48	1M	~9x cheaper output. Apache 2.0 — also self-hostable.
GPT-5.5 Pro	$30.00	$180.00	1M	6x the price for mid-teens AAII uplift. Reserve for hardest reasoning only.
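To compare these models on a specific workload rather than on headline rates, the same cost formula can be applied to each row. This is a sketch using only the prices in the table above; `MODELS` and `rank_by_cost` are hypothetical names, and it ignores caching, batch discounts, and differences in how many tokens each model uses for the same task.

```python
# (input, output) USD per 1M tokens, from the comparison table above.
MODELS = {
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "Gemini 3.1 Pro": (3.50, 10.50),
    "DeepSeek V4 Pro": (1.74, 3.48),
    "GPT-5.5 Pro": (30.00, 180.00),
}


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, USD cost) pairs for one request, cheapest first."""
    costs = {
        name: (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
        for name, (in_rate, out_rate) in MODELS.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Example: the agentic loop from the cost-per-task table (45,000 in / 6,000 out).
for name, usd in rank_by_cost(45_000, 6_000):
    print(f"{name:<16} ${usd:.4f}")
```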

GPT-5.5 Cost & Pricing (May 2026)

Pricing tiers — every way to buy GPT-5.5

Cost per task — what you actually pay

GPT-5.5 vs nearest alternatives

When GPT-5.5 is worth $30 per 1M output tokens

When to switch to a cheaper alternative