GPT-4o Cost & Pricing (May 2026)
Per-1M-token rates for every GPT-4o tier — standard, cached, and batch. Cost-per-task estimates including a vision example, a comparison against GPT-5.5 and other frontier models, and an honest read on the sunset trajectory.
Pricing tiers — every way to buy GPT-4o
| Tier | Input /1M | Output /1M | Notes |
|---|---|---|---|
| Standard (sync) | $2.500 | $10.00 | List price as of October 2024 — has held flat since. Vision input is included at the same per-token rate. |
| Cached input | $1.250 | $10.00 | 50% off list on cached input via OpenAI's automatic prompt caching (prompt prefixes of 1,024+ tokens). Less aggressive than Anthropic's 90% cached-read discount. |
| Batch (24h SLA) | $1.250 | $5.00 | 50% off list for asynchronous workloads via the OpenAI Batch API. |
| Cached + Batch | $0.625 | $5.00 | Stacked discount. Cheapest way to run GPT-4o for repeatable async work — but check whether GPT-5.5 batch is a better choice given the quality gap. |
All prices in USD per 1M tokens. Vision input is billed at the same per-token rate after image-to-token conversion (a 1024x1024 image is ~765 tokens).
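The tier table and the image-to-token conversion above can be sketched in a few lines. This is a minimal sketch, not an official API: the tier names, the input-only stacking rule, and the area-scaled image formula (anchored to the ~765-token figure for 1024x1024) are assumptions based on this page, simplifying OpenAI's actual tile-based vision accounting.

```python
# GPT-4o per-token rates (USD per 1M tokens), per the tier table above.
# "cached_batch" stacks the two 50% input discounts: 2.50 * 0.5 * 0.5 = 0.625.
RATES = {
    "standard":     {"in": 2.500, "out": 10.00},
    "cached":       {"in": 1.250, "out": 10.00},  # 50% off cached input only
    "batch":        {"in": 1.250, "out": 5.00},   # 50% off both, 24h SLA
    "cached_batch": {"in": 0.625, "out": 5.00},   # stacked discount
}

def image_tokens(width: int, height: int) -> int:
    """Rough vision token estimate: ~765 tokens for a 1024x1024 image,
    scaled linearly by pixel area. A simplification of the real
    tile-based conversion, good enough for budgeting."""
    return round(765 * (width * height) / (1024 * 1024))
```

Once an image is converted to tokens, it bills at the same input rate as text, so vision cost estimates reduce to ordinary token arithmetic.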
Cost per task — what you actually pay
| Task | In tok | Out tok | Standard | Cached | Batch |
|---|---|---|---|---|---|
| Short chat reply | 800 | 200 | $0.0040 | $0.0030 | $0.0020 |
| Long-doc summary (50-page PDF) | 80,000 | 1,500 | $0.2150 | $0.1150 | $0.1075 |
| Vision: describe 4 images | 4,000 | 800 | $0.0180 | $0.0130 | $0.0090 |
| RAG query (10-doc context) | 12,000 | 600 | $0.0360 | $0.0210 | $0.0180 |
Cost per single invocation. The vision row budgets ~1,000 input tokens per image (the ~765-token 1024x1024 conversion plus prompt overhead) — adjust upward for larger images or high-detail mode.
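The per-invocation figures above reduce to a one-line formula. A minimal sketch — rates are USD per 1M tokens, and the token counts are the assumptions from the table, not measured values:

```python
def task_cost(in_tok: int, out_tok: int, in_rate: float, out_rate: float) -> float:
    """USD cost for one invocation at per-1M-token rates."""
    return (in_tok * in_rate + out_tok * out_rate) / 1_000_000

# Reproduce the RAG row: 12,000 in / 600 out at standard rates ($2.50 / $10.00)
print(round(task_cost(12_000, 600, 2.50, 10.00), 4))  # 0.036
```

Swapping in the cached or batch rates from the tier table reproduces the other columns.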
GPT-4o vs nearest alternatives
| Model | In /1M | Out /1M | Context | Note |
|---|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | 128K | This page. Vision included. Will be sunset eventually as GPT-5.5 matures. |
| GPT-5.5 | $5.00 | $30.00 | 1M | 2x input, 3x output. Materially better quality across all benchmarks. The forward path. |
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | 2x input, 2.5x output. Coding Arena #1 — the better default for engineering. |
| Gemini 3.1 Pro | $3.50 | $10.50 | 2M | 40% more input, ~5% more output. 2M context, multimodal. Better successor for vision workloads. |
| GPT-4o mini | $0.15 | $0.60 | 128K | ~17x cheaper. Use for cascade tier and routing. Same family, much lower quality. |
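The table's per-token multipliers only matter at the level of total spend, and output-heavy workloads feel the GPT-5.5 premium harder than input-heavy ones. A quick sketch — the 100M-input / 10M-output monthly mix is illustrative, not from the source:

```python
def monthly_cost(in_millions: float, out_millions: float,
                 in_rate: float, out_rate: float) -> float:
    """USD per month for a given volume of input/output tokens (in millions)."""
    return in_millions * in_rate + out_millions * out_rate

# Hypothetical mix: 100M input tokens, 10M output tokens per month
gpt4o = monthly_cost(100, 10, 2.50, 10.00)
gpt55 = monthly_cost(100, 10, 5.00, 30.00)
print(gpt4o, gpt55, round(gpt55 / gpt4o, 2))  # 350.0 800.0 2.29
```

On this mix, GPT-5.5 runs roughly 2.3x the GPT-4o bill — the "2x input, 3x output" multipliers blend according to your actual token ratio.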
When GPT-4o is still worth running
- Stable existing production. Workloads with tested prompt scaffolds and known quality characteristics where migration risk outweighs the quality lift from GPT-5.5.
- Mature vision pipelines. The GPT-4o vision stack is well-documented and predictable. Migrating to a new vision model is rarely free.
- Mid-tier cost target. $2.50/$10 sits between the cheap tier (DeepSeek Flash, GPT-4o mini) and frontier ($5+). Useful when quality must beat the cheap tier but frontier pricing is unjustified.
When to switch to a cheaper or better alternative
- GPT-5.5 ($5 / $30) — for any net-new build. Materially better quality, 1M context vs 128K, and the path forward as OpenAI shifts focus away from GPT-4o.
- Gemini 3.1 Pro ($3.50 / $10.50) — better successor for vision workloads. 2M context, native multimodal, only 5% more expensive on output.
- GPT-4o mini ($0.15 / $0.60) — for routing, classification, and cascade tier work where GPT-4o quality is wasted.
- DeepSeek V4 Pro ($1.74 / $3.48) — when cost dominates and the workload is not vision-heavy.
Related
- AI Model Leaderboard — quality vs price across all providers
- Per Million Tokens True Cost — hidden adders pushing bills 1.5-3x above list
- Token Cost Calculator — interactive estimator
- GPT-4o deep-dive — full benchmarks and architecture
Teams running GPT-4o alongside other providers typically front the API with Swfte Connect to route across these models behind one OpenAI-compatible surface with prompt caching and per-route fallback — useful when planning a phased migration to GPT-5.5 or Gemini 3.1 Pro.
Sources: official OpenAI pricing page, accessed 2026-05-06. Pricing reflects the October 2024 reduction, held flat through May 2026.