Updated May 21, 2026

AI Model Pricing Index

Compare API pricing across every major AI provider. Sortable table, historical trends, and an interactive cost calculator to estimate your monthly spend.

36

Models Tracked

13

Providers

$0.10

Cheapest Input

300x

Price Range

Full Pricing Table

36 models
ModelProviderInput / 1MOutput / 1MBlendedQualityValueContext

Self-hosted general purpose

GoogleSelf-hostSelf-hostSelf-host
75
0.0128K

Open multimodal

Mistral AISelf-hostSelf-hostSelf-host
76
0.0256K

Cheap-and-fast cascade tier

DeepSeek$0.14$0.28$0.21
80
381.01M

Fastest + cheapest

Google$0.10$0.40$0.25
74
296.01M

Longest context

Meta$0.15$0.40$0.28
71
258.210M

Open-source coding

Alibaba Cloud$0.15$0.45$0.30
74
246.7131K

High throughput

OpenAI$0.15$0.60$0.38
72
192.0128K

Open-source value

Meta$0.20$0.60$0.40
80
200.01M

Budget reasoning

xAI$0.30$0.50$0.40
78
195.0131K

Code generation

Mistral AI$0.30$0.90$0.60
76
126.7256K

Open-source flagship

Alibaba Cloud$0.30$0.90$0.60
80
133.3131K

Best open-source value

DeepSeek$0.27$1.10$0.69
86
125.5128K

Cheap reasoning

DeepSeek$0.55$2.19$1.37
91
66.4128K

Agentic tasks & real-time info

xAI$1.25$2.50$1.88
93
49.61M

AWS ecosystem

Amazon$0.80$3.20$2.00
70
35.0300K

Speed & cost

Anthropic$0.80$4.00$2.40
75
31.3200K

Frontier quality at low cost

Moonshot AI$0.95$4.00$2.48
92
37.2256K

Open-source value leader

DeepSeek$1.74$3.48$2.61
90
34.51M

Reasoning & math

OpenAI$1.10$4.40$2.75
88
32.0200K
GLM-5.1 NewOSS

Open-weight agentic & tool use

Z.ai (Zhipu AI)$1.55$4.65$3.10
88
28.4200K

Multilingual & APAC

Alibaba Cloud$1.40$5.60$3.50
86
24.6256K

Multilingual

Mistral AI$2.00$6.00$4.00
79
19.8128K

Long context

OpenAI$2.00$8.00$5.00
89
17.81M

Multimodal + value

Google$1.25$10.00$5.63
92
16.41M

General purpose

OpenAI$2.50$10.00$6.25
85
13.6128K

Enterprise RAG

Cohere$2.50$10.00$6.25
68
10.9128K

Science & long-context

Google$2.00$12.00$7.00
96
13.71M

Coding & balance

Anthropic$3.00$15.00$9.00
88
9.8200K

Real-time info

xAI$3.00$15.00$9.00
87
9.7131K

Search + citations

Perplexity$3.00$15.00$9.00
78
8.7200K

Coding & balance

Anthropic$3.00$15.00$9.00
90
10.01M

Coding & agentic workflows

Anthropic$5.00$25.00$15.00
96
6.41M

Frontier general purpose

OpenAI$5.00$30.00$17.50
97
5.51M

Hard reasoning

OpenAI$10.00$40.00$25.00
94
3.8200K

Complex analysis

Anthropic$15.00$75.00$45.00
91
2.0200K

Reasoning at any cost

OpenAI$30.00$180.00$105.00
98
0.91M
Blended = avg of input + output per 1M tokensQuality = composite benchmark score (0-100)Value = quality per dollar (higher is better)

Estimate Your Monthly Cost

Monthly cost estimate

Enter your typical request shape. Costs below are projected over one month, based on current public list-price API rates.

Per month: 100K requests · 50.0M input tokens · 30.0M output tokens. Excludes prompt caching, batch discounts, retries, and fees.

Cheapest

DeepSeek V4 Flash

$15.40

per month at this volume

Best value (quality ≥ 80)

DeepSeek V4 Flash · Q 80

$15.40

per month at this volume

Most expensive

GPT-5.5 Pro

$6900.00

per month at this volume

Save 30-60% with Mixture-of-Routers

Most production traffic is mixed-difficulty. Send the easy 60% to a cheap model and the hard 10% to a frontier model — same quality, fraction of the cost.

See the math

Full breakdown by model

Sorted cheapest to most expensive

ModelCost / requestInput cost / moOutput cost / moTotal / mo

DeepSeek V4 Flash

$0.14 in / $0.28 out per 1M

$0.000154$7.00$8.40$15.40

Gemini 2.0 Flash

$0.1 in / $0.4 out per 1M

$0.000170$5.00$12.00$17.00

Llama 4 Scout

$0.15 in / $0.4 out per 1M

$0.000195$7.50$12.00$19.50

Qwen 2.5 Coder 32B

$0.15 in / $0.45 out per 1M

$0.000210$7.50$13.50$21.00

GPT-4o Mini

$0.15 in / $0.6 out per 1M

$0.000255$7.50$18.00$25.50

Llama 4 Maverick

$0.2 in / $0.6 out per 1M

$0.000280$10.00$18.00$28.00

Grok 3 Mini

$0.3 in / $0.5 out per 1M

$0.000300$15.00$15.00$30.00

Codestral

$0.3 in / $0.9 out per 1M

$0.000420$15.00$27.00$42.00

Qwen 2.5 72B

$0.3 in / $0.9 out per 1M

$0.000420$15.00$27.00$42.00

DeepSeek V3

$0.27 in / $1.1 out per 1M

$0.000465$13.50$33.00$46.50

DeepSeek R1

$0.55 in / $2.19 out per 1M

$0.000932$27.50$65.70$93.20

Amazon Nova Pro

$0.8 in / $3.2 out per 1M

$0.001360$40.00$96.00$136.00

Grok 4.3

$1.25 in / $2.5 out per 1M

$0.001375$62.50$75.00$137.50

Claude 3.5 Haiku

$0.8 in / $4 out per 1M

$0.001600$40.00$120.00$160.00

Kimi K2.6

$0.95 in / $4 out per 1M

$0.001675$47.50$120.00$167.50

o3 Mini

$1.1 in / $4.4 out per 1M

$0.001870$55.00$132.00$187.00

DeepSeek V4 Pro

$1.74 in / $3.48 out per 1M

$0.001914$87.00$104.40$191.40

GLM-5.1

$1.55 in / $4.65 out per 1M

$0.002170$77.50$139.50$217.00

Qwen 3.6 Plus

$1.4 in / $5.6 out per 1M

$0.002380$70.00$168.00$238.00

Mistral Large 2

$2 in / $6 out per 1M

$0.002800$100.00$180.00$280.00

GPT-4.1

$2 in / $8 out per 1M

$0.003400$100.00$240.00$340.00

Gemini 2.5 Pro

$1.25 in / $10 out per 1M

$0.003625$62.50$300.00$362.50

GPT-4o

$2.5 in / $10 out per 1M

$0.004250$125.00$300.00$425.00

Command R+

$2.5 in / $10 out per 1M

$0.004250$125.00$300.00$425.00

Gemini 3.1 Pro

$2 in / $12 out per 1M

$0.004600$100.00$360.00$460.00

Claude Sonnet 4

$3 in / $15 out per 1M

$0.006000$150.00$450.00$600.00

Grok 3

$3 in / $15 out per 1M

$0.006000$150.00$450.00$600.00

Sonar Pro

$3 in / $15 out per 1M

$0.006000$150.00$450.00$600.00

Claude Sonnet 4.6

$3 in / $15 out per 1M

$0.006000$150.00$450.00$600.00

Claude Opus 4.7

$5 in / $25 out per 1M

$0.0100$250.00$750.00$1000.00

GPT-5.5

$5 in / $30 out per 1M

$0.0115$250.00$900.00$1150.00

o3

$10 in / $40 out per 1M

$0.0170$500.00$1200.00$1700.00

Claude Opus 4

$15 in / $75 out per 1M

$0.0300$750.00$2250.00$3000.00

GPT-5.5 Pro

$30 in / $180 out per 1M

$0.0690$1500.00$5400.00$6900.00

Gemma 4 27B

Self-host

Open weights (Apache 2.0) — token cost is $0; infra cost depends on hardware

Self-host

Nemotron 3 Nano Omni

Self-host

Open weights (NVIDIA Open Model License) — token cost is $0; infra cost depends on hardware

Self-host

List-price estimate. Real bills typically run 1.3-1.7x higher after retries, system-prompt re-sends, and tool-call round-trips. See per-million-tokens true cost for the adders.

Recent Price Changes

Claude 3.5 Haiku

Feb 1, 2025

$0.8 / $4

-20%

Mistral Large 2

Nov 18, 2024

$2 / $6

-33%

GPT-4o

Oct 1, 2024

$2.5 / $10

-37%

Command R+

Aug 15, 2024

$2.5 / $10

-31%

Understanding AI API Pricing in 2026

AI model pricing has undergone a dramatic transformation. Since GPT-4 launched in March 2023 at $30 per million input tokens, prices have fallen by over 90% — driven by competition from Anthropic, Google, and open-source challengers like DeepSeek and Meta's Llama.

Today's pricing landscape spans a 150x range: from Google's Gemini 2.0 Flash at $0.10/1M input tokens to Claude Opus 4 at $15/1M tokens. The key insight is that price doesn't always correlate with quality — DeepSeek V3 delivers 86% quality at just $0.27/1M tokens, while some premium models charge 50x more for marginal quality gains.

How to Optimize AI API Costs

The most effective strategy is model routing: sending simple queries to cheap, fast models and complex queries to premium models. A gateway like Swfte Connect automates this, typically reducing costs by 30-60% without sacrificing quality.

Other strategies include: leveraging cached input pricing (offered by Google and DeepSeek), batching requests to reduce per-call overhead, and using open-source models for predictable workloads where you can self-host.

Pricing Trends to Watch

  • Price compression continues: Expect another 50%+ reduction across flagship models by end of 2026
  • Reasoning premium: Models with extended thinking (o3, R1) cost more due to higher compute per request
  • Open-source pressure: Llama 4 and DeepSeek are forcing closed providers to cut prices faster
  • Cached pricing: More providers offering discounted rates for repeated context