AI Model Pricing Index
Compare API pricing across every major AI provider. Sortable table, historical trends, and an interactive cost calculator to estimate your monthly spend.
36
Models Tracked
13
Providers
$0.10
Cheapest Input
300x
Price Range
Full Pricing Table
| Model | Provider | Input / 1M | Output / 1M | Blended | Quality | Value | Context |
|---|---|---|---|---|---|---|---|
Gemma 4 27BOSS Self-hosted general purpose | Self-host | Self-host | Self-host | 75 | 0.0 | 128K | |
Open multimodal | Mistral AI | Self-host | Self-host | Self-host | 76 | 0.0 | 256K |
Cheap-and-fast cascade tier | DeepSeek | $0.14 | $0.28 | $0.21 | 80 | 381.0 | 1M |
Fastest + cheapest | $0.10 | $0.40 | $0.25 | 74 | 296.0 | 1M | |
Longest context | Meta | $0.15 | $0.40 | $0.28 | 71 | 258.2 | 10M |
Open-source coding | Alibaba Cloud | $0.15 | $0.45 | $0.30 | 74 | 246.7 | 131K |
High throughput | OpenAI | $0.15 | $0.60 | $0.38 | 72 | 192.0 | 128K |
Open-source value | Meta | $0.20 | $0.60 | $0.40 | 80 | 200.0 | 1M |
Budget reasoning | xAI | $0.30 | $0.50 | $0.40 | 78 | 195.0 | 131K |
Code generation | Mistral AI | $0.30 | $0.90 | $0.60 | 76 | 126.7 | 256K |
Qwen 2.5 72BOSS Open-source flagship | Alibaba Cloud | $0.30 | $0.90 | $0.60 | 80 | 133.3 | 131K |
DeepSeek V3OSS Best open-source value | DeepSeek | $0.27 | $1.10 | $0.69 | 86 | 125.5 | 128K |
DeepSeek R1OSS Cheap reasoning | DeepSeek | $0.55 | $2.19 | $1.37 | 91 | 66.4 | 128K |
Grok 4.3 New Agentic tasks & real-time info | xAI | $1.25 | $2.50 | $1.88 | 93 | 49.6 | 1M |
AWS ecosystem | Amazon | $0.80 | $3.20 | $2.00 | 70 | 35.0 | 300K |
Claude 3.5 Haiku-20% Speed & cost | Anthropic | $0.80 | $4.00 | $2.40 | 75 | 31.3 | 200K |
Frontier quality at low cost | Moonshot AI | $0.95 | $4.00 | $2.48 | 92 | 37.2 | 256K |
Open-source value leader | DeepSeek | $1.74 | $3.48 | $2.61 | 90 | 34.5 | 1M |
Reasoning & math | OpenAI | $1.10 | $4.40 | $2.75 | 88 | 32.0 | 200K |
Open-weight agentic & tool use | Z.ai (Zhipu AI) | $1.55 | $4.65 | $3.10 | 88 | 28.4 | 200K |
Multilingual & APAC | Alibaba Cloud | $1.40 | $5.60 | $3.50 | 86 | 24.6 | 256K |
Mistral Large 2-33% Multilingual | Mistral AI | $2.00 | $6.00 | $4.00 | 79 | 19.8 | 128K |
Long context | OpenAI | $2.00 | $8.00 | $5.00 | 89 | 17.8 | 1M |
Multimodal + value | $1.25 | $10.00 | $5.63 | 92 | 16.4 | 1M | |
General purpose | OpenAI | $2.50 | $10.00 | $6.25 | 85 | 13.6 | 128K |
Enterprise RAG | Cohere | $2.50 | $10.00 | $6.25 | 68 | 10.9 | 128K |
Science & long-context | $2.00 | $12.00 | $7.00 | 96 | 13.7 | 1M | |
Coding & balance | Anthropic | $3.00 | $15.00 | $9.00 | 88 | 9.8 | 200K |
Real-time info | xAI | $3.00 | $15.00 | $9.00 | 87 | 9.7 | 131K |
Search + citations | Perplexity | $3.00 | $15.00 | $9.00 | 78 | 8.7 | 200K |
Coding & balance | Anthropic | $3.00 | $15.00 | $9.00 | 90 | 10.0 | 1M |
Coding & agentic workflows | Anthropic | $5.00 | $25.00 | $15.00 | 96 | 6.4 | 1M |
GPT-5.5 New Frontier general purpose | OpenAI | $5.00 | $30.00 | $17.50 | 97 | 5.5 | 1M |
Hard reasoning | OpenAI | $10.00 | $40.00 | $25.00 | 94 | 3.8 | 200K |
Complex analysis | Anthropic | $15.00 | $75.00 | $45.00 | 91 | 2.0 | 200K |
GPT-5.5 Pro New Reasoning at any cost | OpenAI | $30.00 | $180.00 | $105.00 | 98 | 0.9 | 1M |
Estimate Your Monthly Cost
Monthly cost estimate
Enter your typical request shape. Costs below are projected over one month, based on current public list-price API rates.
Cheapest
DeepSeek V4 Flash
$15.40
per month at this volume
Best value (quality ≥ 80)
DeepSeek V4 Flash · Q 80
$15.40
per month at this volume
Most expensive
GPT-5.5 Pro
$6900.00
per month at this volume
Save 30-60% with Mixture-of-Routers
Most production traffic is mixed-difficulty. Send the easy 60% to a cheap model and the hard 10% to a frontier model — same quality, fraction of the cost.
Full breakdown by model
Sorted cheapest to most expensive
| Model | Cost / request | Input cost / mo | Output cost / mo | Total / mo |
|---|---|---|---|---|
DeepSeek V4 Flash $0.14 in / $0.28 out per 1M | $0.000154 | $7.00 | $8.40 | $15.40 |
Gemini 2.0 Flash $0.1 in / $0.4 out per 1M | $0.000170 | $5.00 | $12.00 | $17.00 |
Llama 4 Scout $0.15 in / $0.4 out per 1M | $0.000195 | $7.50 | $12.00 | $19.50 |
Qwen 2.5 Coder 32B $0.15 in / $0.45 out per 1M | $0.000210 | $7.50 | $13.50 | $21.00 |
GPT-4o Mini $0.15 in / $0.6 out per 1M | $0.000255 | $7.50 | $18.00 | $25.50 |
Llama 4 Maverick $0.2 in / $0.6 out per 1M | $0.000280 | $10.00 | $18.00 | $28.00 |
Grok 3 Mini $0.3 in / $0.5 out per 1M | $0.000300 | $15.00 | $15.00 | $30.00 |
Codestral $0.3 in / $0.9 out per 1M | $0.000420 | $15.00 | $27.00 | $42.00 |
Qwen 2.5 72B $0.3 in / $0.9 out per 1M | $0.000420 | $15.00 | $27.00 | $42.00 |
DeepSeek V3 $0.27 in / $1.1 out per 1M | $0.000465 | $13.50 | $33.00 | $46.50 |
DeepSeek R1 $0.55 in / $2.19 out per 1M | $0.000932 | $27.50 | $65.70 | $93.20 |
Amazon Nova Pro $0.8 in / $3.2 out per 1M | $0.001360 | $40.00 | $96.00 | $136.00 |
Grok 4.3 $1.25 in / $2.5 out per 1M | $0.001375 | $62.50 | $75.00 | $137.50 |
Claude 3.5 Haiku $0.8 in / $4 out per 1M | $0.001600 | $40.00 | $120.00 | $160.00 |
Kimi K2.6 $0.95 in / $4 out per 1M | $0.001675 | $47.50 | $120.00 | $167.50 |
o3 Mini $1.1 in / $4.4 out per 1M | $0.001870 | $55.00 | $132.00 | $187.00 |
DeepSeek V4 Pro $1.74 in / $3.48 out per 1M | $0.001914 | $87.00 | $104.40 | $191.40 |
GLM-5.1 $1.55 in / $4.65 out per 1M | $0.002170 | $77.50 | $139.50 | $217.00 |
Qwen 3.6 Plus $1.4 in / $5.6 out per 1M | $0.002380 | $70.00 | $168.00 | $238.00 |
Mistral Large 2 $2 in / $6 out per 1M | $0.002800 | $100.00 | $180.00 | $280.00 |
GPT-4.1 $2 in / $8 out per 1M | $0.003400 | $100.00 | $240.00 | $340.00 |
Gemini 2.5 Pro $1.25 in / $10 out per 1M | $0.003625 | $62.50 | $300.00 | $362.50 |
GPT-4o $2.5 in / $10 out per 1M | $0.004250 | $125.00 | $300.00 | $425.00 |
Command R+ $2.5 in / $10 out per 1M | $0.004250 | $125.00 | $300.00 | $425.00 |
Gemini 3.1 Pro $2 in / $12 out per 1M | $0.004600 | $100.00 | $360.00 | $460.00 |
Claude Sonnet 4 $3 in / $15 out per 1M | $0.006000 | $150.00 | $450.00 | $600.00 |
Grok 3 $3 in / $15 out per 1M | $0.006000 | $150.00 | $450.00 | $600.00 |
Sonar Pro $3 in / $15 out per 1M | $0.006000 | $150.00 | $450.00 | $600.00 |
Claude Sonnet 4.6 $3 in / $15 out per 1M | $0.006000 | $150.00 | $450.00 | $600.00 |
Claude Opus 4.7 $5 in / $25 out per 1M | $0.0100 | $250.00 | $750.00 | $1000.00 |
GPT-5.5 $5 in / $30 out per 1M | $0.0115 | $250.00 | $900.00 | $1150.00 |
o3 $10 in / $40 out per 1M | $0.0170 | $500.00 | $1200.00 | $1700.00 |
Claude Opus 4 $15 in / $75 out per 1M | $0.0300 | $750.00 | $2250.00 | $3000.00 |
GPT-5.5 Pro $30 in / $180 out per 1M | $0.0690 | $1500.00 | $5400.00 | $6900.00 |
Gemma 4 27B Self-hostOpen weights (Apache 2.0) — token cost is $0; infra cost depends on hardware | — | — | — | Self-host |
Nemotron 3 Nano Omni Self-hostOpen weights (NVIDIA Open Model License) — token cost is $0; infra cost depends on hardware | — | — | — | Self-host |
List-price estimate. Real bills typically run 1.3-1.7x higher after retries, system-prompt re-sends, and tool-call round-trips. See per-million-tokens true cost for the adders.
Recent Price Changes
Claude 3.5 Haiku
Feb 1, 2025
$0.8 / $4
Mistral Large 2
Nov 18, 2024
$2 / $6
GPT-4o
Oct 1, 2024
$2.5 / $10
Command R+
Aug 15, 2024
$2.5 / $10
Understanding AI API Pricing in 2026
AI model pricing has undergone a dramatic transformation. Since GPT-4 launched in March 2023 at $30 per million input tokens, prices have fallen by over 90% — driven by competition from Anthropic, Google, and open-source challengers like DeepSeek and Meta's Llama.
Today's pricing landscape spans a 150x range: from Google's Gemini 2.0 Flash at $0.10/1M input tokens to Claude Opus 4 at $15/1M tokens. The key insight is that price doesn't always correlate with quality — DeepSeek V3 delivers 86% quality at just $0.27/1M tokens, while some premium models charge 50x more for marginal quality gains.
How to Optimize AI API Costs
The most effective strategy is model routing: sending simple queries to cheap, fast models and complex queries to premium models. A gateway like Swfte Connect automates this, typically reducing costs by 30-60% without sacrificing quality.
Other strategies include: leveraging cached input pricing (offered by Google and DeepSeek), batching requests to reduce per-call overhead, and using open-source models for predictable workloads where you can self-host.
Pricing Trends to Watch
- Price compression continues: Expect another 50%+ reduction across flagship models by end of 2026
- Reasoning premium: Models with extended thinking (o3, R1) cost more due to higher compute per request
- Open-source pressure: Llama 4 and DeepSeek are forcing closed providers to cut prices faster
- Cached pricing: More providers offering discounted rates for repeated context