Updated May 6, 2026

GPT-4o Cost & Pricing (May 2026)

Per-1M-token rates for every GPT-4o tier — standard, cached, and batch — plus cost-per-task estimates (including a vision example), a comparison against GPT-5.5 and the new frontier, and an honest read on the sunset trajectory.

$2.50 input / 1M · $10.00 output / 1M · 128K context · Sunset trajectory — plan migration

Pricing tiers — every way to buy GPT-4o

Tier            | Input /1M | Output /1M | Notes
Standard (sync) | $2.50     | $10.00     | List price as of October 2024 — has held flat since. Vision input is included at the same per-token rate.
Cached input    | $1.25     | $10.00     | 50% off list on cached input via OpenAI's automatic prompt caching (prefixes over 1,024 tokens). Less aggressive than Anthropic's 90% cache-read discount.
Batch (24h SLA) | $1.25     | $5.00      | 50% off list for asynchronous workloads via the OpenAI Batch API.
Cached + Batch  | $0.625    | $5.00      | Stacked discounts. The cheapest way to run GPT-4o for repeatable async work — but check whether GPT-5.5 batch is a better choice given the quality gap.

All prices in USD per 1M tokens. Vision input is billed at the same per-token rate after image-to-token conversion (a 1024x1024 image is ~765 tokens).
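The image-to-token conversion above follows GPT-4o's published tiling rule: a flat 85 tokens at low detail; at high detail the image is fit within 2048x2048, its shortest side is scaled to 768px, and it is split into 512px tiles at 170 tokens each plus the 85-token base. A minimal sketch (the function name is illustrative):

```python
import math

def gpt4o_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate vision input tokens from image dimensions.

    low detail: flat 85 tokens.
    high detail: fit within 2048x2048, scale the shortest side to 768px,
    then count 512px tiles at 170 tokens each, plus an 85-token base.
    """
    if detail == "low":
        return 85
    # Fit within 2048 x 2048.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Scale the shortest side down to 768px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles

# A 1024x1024 image scales to 768x768 -> 4 tiles -> 85 + 4*170 = 765 tokens,
# matching the figure quoted above.
```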

Cost per task — what you actually pay

Task                           | In tok | Out tok | Standard | Cached  | Batch
Short chat reply               | 800    | 200     | $0.0040  | $0.0030 | $0.0020
Long-doc summary (50-page PDF) | 80,000 | 1,500   | $0.2150  | $0.1150 | $0.1075
Vision: describe 4 images      | 4,000  | 800     | $0.0180  | $0.0130 | $0.0090
RAG query (10-doc context)     | 12,000 | 600     | $0.0360  | $0.0210 | $0.0180

Costs are per single invocation. The vision row assumes ~1,000 tokens per image at the 1024px detail level — adjust upward for high-detail mode.
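The rows above are straight linear arithmetic against the tier table, so it is easy to recompute for your own token shapes. A minimal sketch using the rates from the tier table (the cached column assumes the entire input hits the cache, as the table does):

```python
# USD per 1M tokens (input, output), from the tier table above.
RATES = {
    "standard":     (2.50,  10.00),
    "cached":       (1.25,  10.00),
    "batch":        (1.25,   5.00),
    "cached_batch": (0.625,  5.00),
}

def task_cost(in_tok: int, out_tok: int, tier: str = "standard") -> float:
    """Per-invocation cost in USD for a given token shape and tier."""
    in_rate, out_rate = RATES[tier]
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Long-doc summary row (80,000 in / 1,500 out):
#   standard -> $0.2150, batch -> $0.1075, matching the table.
```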

GPT-4o vs nearest alternatives

Model           | In /1M | Out /1M | Context | Note
GPT-4o          | $2.50  | $10.00  | 128K    | This page. Vision included. Will eventually be sunset as GPT-5.5 matures.
GPT-5.5         | $5.00  | $30.00  | 1M      | 2x input, 3x output. Materially better quality across all benchmarks. The forward path.
Claude Opus 4.7 | $5.00  | $25.00  | 1M      | 2x input, 2.5x output. Coding Arena #1 — the better default for engineering.
Gemini 3.1 Pro  | $3.50  | $10.50  | 2M      | 40% more on input, ~5% more on output. 2M context, multimodal. The better successor for vision workloads.
GPT-4o mini     | $0.15  | $0.60   | 128K    | ~17x cheaper. Use for the cascade tier and routing. Same family, much lower quality.
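Because the price gaps differ on input vs output, the real migration premium depends on your token shape. A quick sketch comparing the long-doc summary workload across the table's rates:

```python
# USD per 1M tokens (input, output), from the comparison table above.
MODELS = {
    "gpt-4o":         (2.50, 10.00),
    "gpt-5.5":        (5.00, 30.00),
    "gemini-3.1-pro": (3.50, 10.50),
}

def model_cost(model: str, in_tok: int, out_tok: int) -> float:
    """Per-invocation cost in USD for a given model and token shape."""
    in_rate, out_rate = MODELS[model]
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Long-doc summary (80,000 in / 1,500 out):
for name in MODELS:
    print(f"{name}: ${model_cost(name, 80_000, 1_500):.4f}")
```

On this input-heavy shape, GPT-5.5 comes out at $0.4450 vs $0.2150 for GPT-4o — roughly a 2.1x premium, below the headline 3x output multiplier.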

When GPT-4o is still worth running

  • Stable existing production. Workloads with tested prompt scaffolds and known quality characteristics where migration risk outweighs the quality lift from GPT-5.5.
  • Mature vision pipelines. The GPT-4o vision stack is well-documented and predictable. Migrating to a new vision model is rarely free.
  • Mid-tier cost target. $2.50/$10 sits between the cheap tier (DeepSeek Flash, GPT-4o mini) and frontier ($5+). Useful when quality must beat the cheap tier but frontier pricing is unjustified.

When to switch to a cheaper or better alternative

  • GPT-5.5 ($5 / $30) — for any net-new build. Materially better quality, 1M context vs 128K, and the path forward as OpenAI shifts focus away from GPT-4o.
  • Gemini 3.1 Pro ($3.50 / $10.50) — better successor for vision workloads. 2M context, native multimodal, only 5% more expensive on output.
  • GPT-4o mini ($0.15 / $0.60) — for routing, classification, and cascade tier work where GPT-4o quality is wasted.
  • DeepSeek V4 Pro ($1.74 / $3.48) — when cost dominates and the workload is not vision-heavy.
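The cascade option above is worth quantifying: if GPT-4o mini answers most requests and escalates the rest to GPT-4o (paying for both attempts), the blended cost drops sharply. An illustrative sketch using the short-chat shape (800 in / 200 out) and an assumed escalation rate:

```python
# USD per 1M tokens (input, output).
MINI = (0.15, 0.60)   # GPT-4o mini
FULL = (2.50, 10.00)  # GPT-4o

def per_task(rates, in_tok=800, out_tok=200):
    """Per-invocation cost for one model at the short-chat token shape."""
    return in_tok / 1e6 * rates[0] + out_tok / 1e6 * rates[1]

def cascade_cost(escalation_rate: float) -> float:
    """Blended per-task cost: every request tries mini; a fraction
    escalates to GPT-4o and pays for both calls."""
    return per_task(MINI) + escalation_rate * per_task(FULL)

# At 20% escalation: 0.00024 + 0.2 * 0.004 = ~$0.00104 per task,
# roughly 4x cheaper than sending everything to GPT-4o.
```

The escalation rate is the variable to measure in your own traffic; the break-even point where the cascade stops paying off is around 94% escalation on this shape.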

Related

Teams running GPT-4o alongside other providers typically front the API with Swfte Connect to route across these models behind one OpenAI-compatible surface with prompt caching and per-route fallback — useful when planning a phased migration to GPT-5.5 or Gemini 3.1 Pro.

Sources: official OpenAI pricing page, May 6, 2026. Pricing reflects the October 2024 reduction, held flat through May 2026.