Updated Apr 6, 2026

AI Model Pricing Index

Compare API pricing across every major AI provider. Sortable table, historical trends, and an interactive cost calculator to estimate your monthly spend.

Models tracked: 23
Providers: 11
Cheapest input: $0.10 / 1M tokens
Price range: 150x

Full Pricing Table

23 models
| Model | Provider | Input / 1M | Output / 1M | Blended | Quality | Value | Context |
|---|---|---|---|---|---|---|---|
| Gemini 2.0 Flash (fastest + cheapest) | Google | $0.10 | $0.40 | $0.25 | 74 | 296.0 | 1M |
| Llama 4 Scout (longest context) | Meta | $0.15 | $0.40 | $0.28 | 71 | 258.2 | 10M |
| Qwen 2.5 Coder 32B (open-source coding) | Alibaba Cloud | $0.15 | $0.45 | $0.30 | 74 | 246.7 | 131K |
| GPT-4o Mini (high throughput) | OpenAI | $0.15 | $0.60 | $0.38 | 72 | 192.0 | 128K |
| Llama 4 Maverick (open-source value) | Meta | $0.20 | $0.60 | $0.40 | 80 | 200.0 | 1M |
| Grok 3 Mini (budget reasoning) | xAI | $0.30 | $0.50 | $0.40 | 78 | 195.0 | 131K |
| Codestral (code generation) | Mistral AI | $0.30 | $0.90 | $0.60 | 76 | 126.7 | 256K |
| Qwen 2.5 72B (open-source flagship) | Alibaba Cloud | $0.30 | $0.90 | $0.60 | 80 | 133.3 | 131K |
| DeepSeek V3 (best open-source value) | DeepSeek | $0.27 | $1.10 | $0.69 | 86 | 125.5 | 128K |
| DeepSeek R1 (cheap reasoning) | DeepSeek | $0.55 | $2.19 | $1.37 | 91 | 66.4 | 128K |
| Amazon Nova Pro (AWS ecosystem) | Amazon | $0.80 | $3.20 | $2.00 | 70 | 35.0 | 300K |
| Claude 3.5 Haiku (speed & cost) | Anthropic | $0.80 | $4.00 | $2.40 | 75 | 31.3 | 200K |
| o3 Mini (reasoning & math) | OpenAI | $1.10 | $4.40 | $2.75 | 88 | 32.0 | 200K |
| Mistral Large 2 (multilingual) | Mistral AI | $2.00 | $6.00 | $4.00 | 79 | 19.8 | 128K |
| GPT-4.1 (long context) | OpenAI | $2.00 | $8.00 | $5.00 | 89 | 17.8 | 1M |
| Gemini 2.5 Pro (multimodal + value) | Google | $1.25 | $10.00 | $5.63 | 92 | 16.4 | 1M |
| GPT-4o (general purpose) | OpenAI | $2.50 | $10.00 | $6.25 | 85 | 13.6 | 128K |
| Command R+ (enterprise RAG) | Cohere | $2.50 | $10.00 | $6.25 | 68 | 10.9 | 128K |
| Claude Sonnet 4 (coding & balance) | Anthropic | $3.00 | $15.00 | $9.00 | 88 | 9.8 | 200K |
| Grok 3 (real-time info) | xAI | $3.00 | $15.00 | $9.00 | 87 | 9.7 | 131K |
| Sonar Pro (search + citations) | Perplexity | $3.00 | $15.00 | $9.00 | 78 | 8.7 | 200K |
| o3 (hard reasoning) | OpenAI | $10.00 | $40.00 | $25.00 | 96 | 3.8 | 200K |
| Claude Opus 4 (complex analysis) | Anthropic | $15.00 | $75.00 | $45.00 | 95 | 2.1 | 200K |

Blended = average of input and output price per 1M tokens. Quality = composite benchmark score (0-100). Value = quality per blended dollar (higher is better).
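The derived columns follow directly from the listed prices. A minimal sketch of how Blended and Value are computed, using three rows from the table above:

```python
# Recompute the table's derived columns from the raw prices.
# Blended = average of input and output $/1M; Value = quality / blended.
def derived(input_price, output_price, quality):
    """Return (blended price, value score) as defined in the footnotes."""
    blended = (input_price + output_price) / 2
    value = quality / blended
    return blended, value

# Three rows from the table above.
for name, inp, out, q in [
    ("Gemini 2.0 Flash", 0.10, 0.40, 74),
    ("DeepSeek V3", 0.27, 1.10, 86),
    ("Claude Opus 4", 15.00, 75.00, 95),
]:
    blended, value = derived(inp, out, q)
    print(f"{name}: blended ${blended:.3f}/1M, value {value:.1f}")
```

The output reproduces the table's Blended and Value columns to the displayed precision.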

Estimate Your Monthly Cost

Cost Calculator

  • Cheapest: Gemini 2.0 Flash, $17.00/mo
  • Best Value: Llama 4 Maverick (quality >= 80), $28.00/mo
  • Most Expensive: Claude Opus 4, $3.0K/mo

Save 30-60% with smart model routing

Swfte Connect automatically routes each request to the optimal model based on complexity, reducing costs without sacrificing quality.


All Models — Estimated Monthly Cost

  • Gemini 2.0 Flash: $17.00/mo
  • Llama 4 Scout: $19.50/mo
  • Qwen 2.5 Coder 32B: $21.00/mo
  • GPT-4o Mini: $25.50/mo
  • Llama 4 Maverick: $28.00/mo
  • Grok 3 Mini: $30.00/mo
  • Codestral: $42.00/mo
  • Qwen 2.5 72B: $42.00/mo
  • DeepSeek V3: $46.50/mo
  • DeepSeek R1: $93.20/mo
  • Amazon Nova Pro: $136.00/mo
  • Claude 3.5 Haiku: $160.00/mo
  • o3 Mini: $187.00/mo
  • Mistral Large 2: $280.00/mo
  • GPT-4.1: $340.00/mo
  • Gemini 2.5 Pro: $362.50/mo
  • GPT-4o: $425.00/mo
  • Command R+: $425.00/mo
  • Claude Sonnet 4: $600.00/mo
  • Grok 3: $600.00/mo
  • Sonar Pro: $600.00/mo
  • o3: $1.7K/mo
  • Claude Opus 4: $3.0K/mo
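The monthly figures above are consistent with one fixed default workload: working backward from any two rows gives 50M input and 30M output tokens per month. That split is an inference from the numbers, not a documented calculator setting. A sketch of the estimate:

```python
# Estimate monthly spend from per-token prices and a token budget.
# The 50M/30M split is inferred from the figures above, not a
# documented default of the calculator.
INPUT_TOKENS_M = 50   # millions of input tokens per month (assumed)
OUTPUT_TOKENS_M = 30  # millions of output tokens per month (assumed)

def monthly_cost(input_price, output_price,
                 in_m=INPUT_TOKENS_M, out_m=OUTPUT_TOKENS_M):
    """Dollars per month for prices quoted in $ per 1M tokens."""
    return input_price * in_m + output_price * out_m

# Prices from the pricing table above.
print(round(monthly_cost(0.10, 0.40), 2))    # Gemini 2.0 Flash
print(round(monthly_cost(0.55, 2.19), 2))    # DeepSeek R1
print(round(monthly_cost(15.00, 75.00), 2))  # Claude Opus 4
```

These evaluate to $17.00, $93.20, and $3,000 per month, matching the list above.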

Recent Price Changes

| Model | Date | New Price (input / output per 1M) | Change |
|---|---|---|---|
| Claude 3.5 Haiku | Feb 1, 2025 | $0.80 / $4.00 | -20% |
| Mistral Large 2 | Nov 18, 2024 | $2.00 / $6.00 | -33% |
| GPT-4o | Oct 1, 2024 | $2.50 / $10.00 | -37% |
| Command R+ | Aug 15, 2024 | $2.50 / $10.00 | -31% |
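The change figures appear to be the drop in blended price. For example, GPT-4o's widely reported prior pricing was $5 / $15 per 1M tokens (an assumption here, not stated on this page); the October 2024 cut to $2.50 / $10 works out as follows:

```python
# Percent change in blended price between two price points.
def blended(inp, out):
    return (inp + out) / 2

def pct_change(old_in, old_out, new_in, new_out):
    """Percent change in blended price (negative = price cut)."""
    old, new = blended(old_in, old_out), blended(new_in, new_out)
    return (new - old) / old * 100

# GPT-4o: assumed prior pricing $5/$15, cut to $2.50/$10 (table above).
print(pct_change(5.00, 15.00, 2.50, 10.00))  # -37.5
```

That is the -37% shown above, truncated to a whole percent.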

Understanding AI API Pricing in 2026

AI model pricing has undergone a dramatic transformation. Since GPT-4 launched in March 2023 at $30 per million input tokens, prices have fallen by over 90% — driven by competition from Anthropic, Google, and open-source challengers like DeepSeek and Meta's Llama.

Today's pricing landscape spans a 150x range on input price: from Google's Gemini 2.0 Flash at $0.10/1M input tokens to Claude Opus 4 at $15.00/1M. The key insight is that price doesn't always correlate with quality: DeepSeek V3 scores 86 on the composite quality benchmark at just $0.27/1M input tokens, while some premium models charge over 50x more for marginal quality gains.

How to Optimize AI API Costs

The most effective strategy is model routing: sending simple queries to cheap, fast models and complex queries to premium models. A gateway like Swfte Connect automates this, typically reducing costs by 30-60% without sacrificing quality.
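A toy illustration of the routing idea, not Swfte Connect's actual logic: classify each request with a crude complexity heuristic and pick a price tier accordingly. The model pair, keyword list, and length threshold below are invented for this sketch; production routers typically use trained classifiers.

```python
# Toy model router: cheap model for short/simple prompts, premium for
# complex ones. Heuristic and thresholds are illustrative only.
CHEAP = ("gemini-2.0-flash", 0.25)   # blended $/1M, from the table above
PREMIUM = ("claude-sonnet-4", 9.00)

REASONING_HINTS = ("prove", "derive", "step by step", "analyze")

def route(prompt: str) -> tuple[str, float]:
    """Return (model, blended price) for a request. Heuristic only."""
    looks_complex = (
        len(prompt) > 500
        or any(hint in prompt.lower() for hint in REASONING_HINTS)
    )
    return PREMIUM if looks_complex else CHEAP

model, _ = route("Translate 'hello' to French.")
print(model)  # gemini-2.0-flash
model, _ = route("Prove that the sum of two even numbers is even.")
print(model)  # claude-sonnet-4
```

With the blended prices above, every request the heuristic diverts to the cheap tier costs roughly 36x less, which is where the claimed 30-60% savings would come from on a mixed workload.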

Other strategies include: leveraging cached input pricing (offered by Google and DeepSeek), batching requests to reduce per-call overhead, and using open-source models for predictable workloads where you can self-host.
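A back-of-envelope for the cached-input strategy, using a hypothetical 75% cache discount and 60% hit rate; real discounts and eligibility rules vary by provider and are not listed on this page.

```python
# Effect of cached-input pricing on monthly input spend.
# Discount and hit rate are hypothetical, for illustration only.
def input_cost(tokens_m, price, cache_discount=0.0, hit_rate=0.0):
    """Input cost when hit_rate of tokens get the cached discount."""
    cached = tokens_m * hit_rate * price * (1 - cache_discount)
    fresh = tokens_m * (1 - hit_rate) * price
    return cached + fresh

# 50M input tokens at DeepSeek V3's $0.27/1M (price from table above):
base = input_cost(50, 0.27)
with_cache = input_cost(50, 0.27, cache_discount=0.75, hit_rate=0.6)
print(round(base, 2), round(with_cache, 2))
```

Under these assumed numbers, input spend falls from $13.50 to about $7.43 per month, a 45% reduction on the input side alone.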

Pricing Trends to Watch

  • Price compression continues: Expect another 50%+ reduction across flagship models by end of 2026
  • Reasoning premium: Models with extended thinking (o3, R1) cost more due to higher compute per request
  • Open-source pressure: Llama 4 and DeepSeek are forcing closed providers to cut prices faster
  • Cached pricing: More providers offering discounted rates for repeated context