Updated May 6, 2026

GPT-4o Cost & Pricing (May 2026)

Per-1M-token rates for every GPT-4o tier — standard, cached, and batch — plus cost-per-task estimates (including a vision example), a comparison against GPT-5.5 and the new frontier, and an honest read on the sunset trajectory.

$2.50 input / 1M · $10.00 output / 1M · 128K context · Sunset trajectory — plan migration

Pricing tiers — every way to buy GPT-4o

Tier            | Input /1M | Output /1M | Notes
Standard (sync) | $2.50     | $10.00     | List price as of October 2024 — has held flat since. Vision input is included at the same per-token rate.
Cached input    | $1.25     | $10.00     | 50% off list on cached input via OpenAI's automatic prompt caching (prefixes over 1,024 tokens). Less aggressive than Anthropic's 90% cache-read discount.
Batch (24h SLA) | $1.25     | $5.00      | 50% off list for asynchronous workloads via the OpenAI Batch API.
Cached + Batch  | $0.625    | $5.00      | Stacked discounts. The cheapest way to run GPT-4o for repeatable async work — but check whether GPT-5.5 batch is a better choice given the quality gap.

All prices in USD per 1M tokens. Vision input is billed at the same per-token rate after image-to-token conversion (a 1024x1024 image is ~765 tokens).
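The image-to-token conversion above follows GPT-4o's published tiling rule: a flat 85 tokens at low detail; at high detail the image is fit within 2048x2048, its shortest side is scaled to 768px, and it is split into 512px tiles at 170 tokens each plus the 85-token base. A minimal sketch (the function name is illustrative):

```python
import math

def gpt4o_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate vision input tokens from image dimensions.

    low detail: flat 85 tokens.
    high detail: fit within 2048x2048, scale the shortest side to 768px,
    then count 512px tiles at 170 tokens each, plus an 85-token base.
    """
    if detail == "low":
        return 85
    # Fit within 2048 x 2048.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Scale the shortest side down to 768px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles

# A 1024x1024 image scales to 768x768 -> 4 tiles -> 85 + 4*170 = 765 tokens,
# matching the figure quoted above.
```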

Cost per task — what you actually pay

Task                           | In tok | Out tok | Standard | Cached  | Batch
Short chat reply               | 800    | 200     | $0.0040  | $0.0030 | $0.0020
Long-doc summary (50-page PDF) | 80,000 | 1,500   | $0.2150  | $0.1150 | $0.1075
Vision: describe 4 images      | 4,000  | 800     | $0.0180  | $0.0130 | $0.0090
RAG query (10-doc context)     | 12,000 | 600     | $0.0360  | $0.0210 | $0.0180

Costs are per single invocation. The vision row assumes ~1,000 tokens per image at the 1024px detail level — adjust upward for high-detail mode.
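The rows above are straight linear arithmetic against the tier table, so it is easy to recompute for your own token shapes. A minimal sketch using the rates from the tier table (the cached column assumes the entire input hits the cache, as the table does):

```python
# USD per 1M tokens (input, output), from the tier table above.
RATES = {
    "standard":     (2.50,  10.00),
    "cached":       (1.25,  10.00),
    "batch":        (1.25,   5.00),
    "cached_batch": (0.625,  5.00),
}

def task_cost(in_tok: int, out_tok: int, tier: str = "standard") -> float:
    """Per-invocation cost in USD for a given token shape and tier."""
    in_rate, out_rate = RATES[tier]
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Long-doc summary row (80,000 in / 1,500 out):
#   standard -> $0.2150, batch -> $0.1075, matching the table.
```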

GPT-4o vs nearest alternatives

Model           | In /1M | Out /1M | Context | Note
GPT-4o          | $2.50  | $10.00  | 128K    | This page. Vision included. Will eventually be sunset as GPT-5.5 matures.
GPT-5.5         | $5.00  | $30.00  | 1M      | 2x input, 3x output. Materially better quality across all benchmarks. The forward path.
Claude Opus 4.7 | $5.00  | $25.00  | 1M      | 2x input, 2.5x output. Coding Arena #1 — the better default for engineering.
Gemini 3.1 Pro  | $3.50  | $10.50  | 2M      | 40% more on input, ~5% more on output. 2M context, multimodal. The better successor for vision workloads.
GPT-4o mini     | $0.15  | $0.60   | 128K    | ~17x cheaper. Use for the cascade tier and routing. Same family, much lower quality.
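Because the price gaps differ on input vs output, the real migration premium depends on your token shape. A quick sketch comparing the long-doc summary workload across the table's rates:

```python
# USD per 1M tokens (input, output), from the comparison table above.
MODELS = {
    "gpt-4o":         (2.50, 10.00),
    "gpt-5.5":        (5.00, 30.00),
    "gemini-3.1-pro": (3.50, 10.50),
}

def model_cost(model: str, in_tok: int, out_tok: int) -> float:
    """Per-invocation cost in USD for a given model and token shape."""
    in_rate, out_rate = MODELS[model]
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Long-doc summary (80,000 in / 1,500 out):
for name in MODELS:
    print(f"{name}: ${model_cost(name, 80_000, 1_500):.4f}")
```

On this input-heavy shape, GPT-5.5 comes out at $0.4450 vs $0.2150 for GPT-4o — roughly a 2.1x premium, below the headline 3x output multiplier.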

When GPT-4o is still worth running

  • Stable existing production. Workloads with tested prompt scaffolds and known quality characteristics where migration risk outweighs the quality lift from GPT-5.5.
  • Mature vision pipelines. The GPT-4o vision stack is well-documented and predictable. Migrating to a new vision model is rarely free.
  • Mid-tier cost target. $2.50/$10 sits between the cheap tier (DeepSeek Flash, GPT-4o mini) and frontier ($5+). Useful when quality must beat the cheap tier but frontier pricing is unjustified.

When to switch to a cheaper or better alternative

  • GPT-5.5 ($5 / $30) — for any net-new build. Materially better quality, 1M context vs 128K, and the path forward as OpenAI shifts focus away from GPT-4o.
  • Gemini 3.1 Pro ($3.50 / $10.50) — better successor for vision workloads. 2M context, native multimodal, only 5% more expensive on output.
  • GPT-4o mini ($0.15 / $0.60) — for routing, classification, and cascade tier work where GPT-4o quality is wasted.
  • DeepSeek V4 Pro ($1.74 / $3.48) — when cost dominates and the workload is not vision-heavy.
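The cascade option above is worth quantifying: if GPT-4o mini answers most requests and escalates the rest to GPT-4o (paying for both attempts), the blended cost drops sharply. An illustrative sketch using the short-chat shape (800 in / 200 out) and an assumed escalation rate:

```python
# USD per 1M tokens (input, output).
MINI = (0.15, 0.60)   # GPT-4o mini
FULL = (2.50, 10.00)  # GPT-4o

def per_task(rates, in_tok=800, out_tok=200):
    """Per-invocation cost for one model at the short-chat token shape."""
    return in_tok / 1e6 * rates[0] + out_tok / 1e6 * rates[1]

def cascade_cost(escalation_rate: float) -> float:
    """Blended per-task cost: every request tries mini; a fraction
    escalates to GPT-4o and pays for both calls."""
    return per_task(MINI) + escalation_rate * per_task(FULL)

# At 20% escalation: 0.00024 + 0.2 * 0.004 = ~$0.00104 per task,
# roughly 4x cheaper than sending everything to GPT-4o.
```

The escalation rate is the variable to measure in your own traffic; the break-even point where the cascade stops paying off is around 94% escalation on this shape.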

Related

Teams running GPT-4o alongside other providers typically front the API with Swfte Connect to route across these models behind one OpenAI-compatible surface with prompt caching and per-route fallback — useful when planning a phased migration to GPT-5.5 or Gemini 3.1 Pro.

Sources: official OpenAI pricing page, May 6, 2026. Pricing reflects the October 2024 reduction, held flat through May 2026.