AI Cost Calculator

Enter your usage and instantly compare monthly costs across 36 AI models. Find the cheapest option for your workload.

Monthly cost estimate

Enter your typical request shape. Costs below are projected over one month, based on current public list-price API rates.

Per month: 100K requests (500 input and 300 output tokens each) · 50.0M input tokens · 30.0M output tokens. Excludes prompt caching, batch discounts, retries, and fees.

Cheapest

DeepSeek V4 Flash

$15.40

per month at this volume

Best value (quality ≥ 80)

DeepSeek V4 Flash · Q 80

$15.40

per month at this volume

Most expensive

GPT-5.5 Pro

$6,900.00

per month at this volume

Save 30-60% with Mixture-of-Routers

Most production traffic is a mix of easy and hard requests. Routing the easy 60% to a cheap model and reserving a frontier model for the hardest 10% can keep quality comparable at a fraction of the cost.
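As a rough illustration of that routing arithmetic, the sketch below blends the full-volume monthly totals from the breakdown on this page. The split is partly an assumption: the text only names the easy 60% and the hard 10%, so the remaining 30% is assumed to stay on a mid-tier model (Claude Sonnet 4.6 here), which also serves as the all-traffic baseline. It further assumes every request has the same token shape regardless of difficulty, which real traffic rarely does.

```python
# Monthly totals at the full 100K-request volume, taken from the breakdown on this page.
FULL_VOLUME_COST = {
    "DeepSeek V4 Flash": 15.40,
    "Claude Sonnet 4.6": 600.00,
    "Claude Opus 4.7": 1000.00,
}

# Hypothetical traffic split. The 30% middle tier is an assumption, not a figure from the page.
ROUTING_SHARE = {
    "DeepSeek V4 Flash": 0.60,   # easy requests
    "Claude Sonnet 4.6": 0.30,   # assumed mid-difficulty requests
    "Claude Opus 4.7": 0.10,     # hard requests
}


def blended_cost(full_volume_cost, share):
    """Weight each model's full-volume monthly cost by its share of traffic."""
    if abs(sum(share.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(full_volume_cost[model] * fraction for model, fraction in share.items())


baseline = FULL_VOLUME_COST["Claude Sonnet 4.6"]  # everything on the mid-tier model
routed = blended_cost(FULL_VOLUME_COST, ROUTING_SHARE)
savings = 1 - routed / baseline
print(f"routed: ${routed:.2f}, savings vs. baseline: {savings:.1%}")
# prints "routed: $289.24, savings vs. baseline: 51.8%"
```

The result depends heavily on the chosen baseline and split; a different baseline model or a larger hard-request share would move the savings figure substantially.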

Full breakdown by model

Sorted cheapest to most expensive

| Model | Rate per 1M tokens (in / out) | Cost / request | Input cost / mo | Output cost / mo | Total / mo |
|---|---|---|---|---|---|
| DeepSeek V4 Flash | $0.14 / $0.28 | $0.000154 | $7.00 | $8.40 | $15.40 |
| Gemini 2.0 Flash | $0.10 / $0.40 | $0.000170 | $5.00 | $12.00 | $17.00 |
| Llama 4 Scout | $0.15 / $0.40 | $0.000195 | $7.50 | $12.00 | $19.50 |
| Qwen 2.5 Coder 32B | $0.15 / $0.45 | $0.000210 | $7.50 | $13.50 | $21.00 |
| GPT-4o Mini | $0.15 / $0.60 | $0.000255 | $7.50 | $18.00 | $25.50 |
| Llama 4 Maverick | $0.20 / $0.60 | $0.000280 | $10.00 | $18.00 | $28.00 |
| Grok 3 Mini | $0.30 / $0.50 | $0.000300 | $15.00 | $15.00 | $30.00 |
| Codestral | $0.30 / $0.90 | $0.000420 | $15.00 | $27.00 | $42.00 |
| Qwen 2.5 72B | $0.30 / $0.90 | $0.000420 | $15.00 | $27.00 | $42.00 |
| DeepSeek V3 | $0.27 / $1.10 | $0.000465 | $13.50 | $33.00 | $46.50 |
| DeepSeek R1 | $0.55 / $2.19 | $0.000932 | $27.50 | $65.70 | $93.20 |
| Amazon Nova Pro | $0.80 / $3.20 | $0.001360 | $40.00 | $96.00 | $136.00 |
| Grok 4.3 | $1.25 / $2.50 | $0.001375 | $62.50 | $75.00 | $137.50 |
| Claude 3.5 Haiku | $0.80 / $4.00 | $0.001600 | $40.00 | $120.00 | $160.00 |
| Kimi K2.6 | $0.95 / $4.00 | $0.001675 | $47.50 | $120.00 | $167.50 |
| o3 Mini | $1.10 / $4.40 | $0.001870 | $55.00 | $132.00 | $187.00 |
| DeepSeek V4 Pro | $1.74 / $3.48 | $0.001914 | $87.00 | $104.40 | $191.40 |
| GLM-5.1 | $1.55 / $4.65 | $0.002170 | $77.50 | $139.50 | $217.00 |
| Qwen 3.6 Plus | $1.40 / $5.60 | $0.002380 | $70.00 | $168.00 | $238.00 |
| Mistral Large 2 | $2.00 / $6.00 | $0.002800 | $100.00 | $180.00 | $280.00 |
| GPT-4.1 | $2.00 / $8.00 | $0.003400 | $100.00 | $240.00 | $340.00 |
| Gemini 2.5 Pro | $1.25 / $10.00 | $0.003625 | $62.50 | $300.00 | $362.50 |
| GPT-4o | $2.50 / $10.00 | $0.004250 | $125.00 | $300.00 | $425.00 |
| Command R+ | $2.50 / $10.00 | $0.004250 | $125.00 | $300.00 | $425.00 |
| Gemini 3.1 Pro | $2.00 / $12.00 | $0.004600 | $100.00 | $360.00 | $460.00 |
| Claude Sonnet 4 | $3.00 / $15.00 | $0.006000 | $150.00 | $450.00 | $600.00 |
| Grok 3 | $3.00 / $15.00 | $0.006000 | $150.00 | $450.00 | $600.00 |
| Sonar Pro | $3.00 / $15.00 | $0.006000 | $150.00 | $450.00 | $600.00 |
| Claude Sonnet 4.6 | $3.00 / $15.00 | $0.006000 | $150.00 | $450.00 | $600.00 |
| Claude Opus 4.7 | $5.00 / $25.00 | $0.0100 | $250.00 | $750.00 | $1,000.00 |
| GPT-5.5 | $5.00 / $30.00 | $0.0115 | $250.00 | $900.00 | $1,150.00 |
| o3 | $10.00 / $40.00 | $0.0170 | $500.00 | $1,200.00 | $1,700.00 |
| Claude Opus 4 | $15.00 / $75.00 | $0.0300 | $750.00 | $2,250.00 | $3,000.00 |
| GPT-5.5 Pro | $30.00 / $180.00 | $0.0690 | $1,500.00 | $5,400.00 | $6,900.00 |

Gemma 4 27B · Self-host

Open weights (Apache 2.0); token cost is $0; infra cost depends on hardware

Nemotron 3 Nano Omni · Self-host

Open weights (NVIDIA Open Model License); token cost is $0; infra cost depends on hardware

These are list-price estimates. Real bills typically run 1.3-1.7x the list-price figure once retries, system-prompt re-sends, and tool-call round-trips are counted. See per-million-tokens true cost for the adders.
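To see what that range means for a budget, the sketch below applies the 1.3-1.7x figures to a list-price total. It reads the range as a multiplier on the list-price estimate, which is an assumption about the wording; the function name is illustrative only.

```python
def real_bill_range(list_price_total, low_factor=1.3, high_factor=1.7):
    """Scale a list-price monthly total by the low and high overhead factors."""
    return list_price_total * low_factor, list_price_total * high_factor


# Example: a model whose list-price total is $600.00 per month.
low, high = real_bill_range(600.00)
print(f"${low:,.2f} to ${high:,.2f} per month")
# prints "$780.00 to $1,020.00 per month"
```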

How AI API Pricing Works

AI model providers charge by the token, the basic unit of text processing. One token is roughly 4 characters, or about ¾ of a word, in English text. Most providers price input tokens (your prompt) and output tokens (the model's response) separately, and output tokens typically cost 2-5x as much as input tokens.
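A minimal sketch of that pricing model, assuming separate per-million rates for input and output and nothing else (no caching, batching, or retries). The `Price` class and function names are illustrative, and the 4-characters-per-token estimate is only a rule of thumb; a provider's own tokenizer gives the real count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def estimate_tokens(text: str) -> int:
    """Rough token count from the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))


def monthly_cost(price: Price, requests: int, input_tokens: int, output_tokens: int) -> float:
    """List-price monthly cost for a fixed per-request token shape."""
    input_cost = requests * input_tokens / 1_000_000 * price.input_per_million
    output_cost = requests * output_tokens / 1_000_000 * price.output_per_million
    return input_cost + output_cost


# GPT-4o Mini rates from the breakdown on this page: $0.15 in / $0.60 out per 1M tokens.
gpt_4o_mini = Price(input_per_million=0.15, output_per_million=0.60)
total = monthly_cost(gpt_4o_mini, requests=100_000, input_tokens=500, output_tokens=300)
print(f"${total:.2f} per month")
# prints "$25.50 per month"
```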

Typical Usage Patterns

  • Chatbot (customer support): ~500 input tokens, ~300 output tokens per message, 50K-500K messages/month
  • Code generation: ~1,000 input tokens, ~500 output tokens per request, 10K-100K requests/month
  • Document analysis: ~2,000 input tokens, ~200 output tokens per document, 5K-50K documents/month
  • Content generation: ~300 input tokens, ~1,000 output tokens per piece, 1K-20K pieces/month
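The patterns above can be turned into rough monthly ranges. The sketch below prices each one at GPT-4o Mini's list rates from this page ($0.15 in / $0.60 out per 1M tokens); the choice of model is arbitrary, and the figures are list-price only.

```python
# USD per 1M tokens for GPT-4o Mini, from the breakdown on this page.
INPUT_PRICE, OUTPUT_PRICE = 0.15, 0.60

# (pattern, input tokens, output tokens, low monthly volume, high monthly volume)
PATTERNS = [
    ("Chatbot", 500, 300, 50_000, 500_000),
    ("Code generation", 1_000, 500, 10_000, 100_000),
    ("Document analysis", 2_000, 200, 5_000, 50_000),
    ("Content generation", 300, 1_000, 1_000, 20_000),
]


def cost_per_request(input_tokens, output_tokens):
    """List-price cost of one request in USD."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


def monthly_range(input_tokens, output_tokens, low_volume, high_volume):
    """Monthly cost at the low and high ends of the volume range."""
    unit = cost_per_request(input_tokens, output_tokens)
    return unit * low_volume, unit * high_volume


for name, tokens_in, tokens_out, low, high in PATTERNS:
    low_cost, high_cost = monthly_range(tokens_in, tokens_out, low, high)
    print(f"{name}: ${low_cost:,.2f} to ${high_cost:,.2f} per month")
```

Note how the output-heavy content-generation pattern costs more per request than document analysis despite using fewer tokens overall, because output tokens are priced higher.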

Cost Optimization Strategies

The most impactful strategy is intelligent model routing. Rather than sending every request to a premium model, analyze the complexity of each request and route simple ones to cheaper, faster models. Swfte Connect does this automatically, typically reducing API costs by 30-60%.
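For intuition only, here is a toy router. It is not how Swfte Connect or any production router works: the scoring heuristic, keyword list, thresholds, and tier assignments are all hypothetical, chosen just to show the shape of the idea (score the request, then pick a tier).

```python
# Hypothetical tier assignments using model names from the breakdown on this page.
CHEAP_MODEL = "DeepSeek V4 Flash"
MID_MODEL = "Claude Sonnet 4.6"
FRONTIER_MODEL = "Claude Opus 4.7"

# Hypothetical signals that a request needs more reasoning.
HARD_KEYWORDS = ("prove", "debug", "refactor", "step by step", "analyze")


def complexity_score(prompt: str) -> float:
    """Toy heuristic in [0, 1]: longer prompts and reasoning keywords score higher."""
    length_score = min(len(prompt) / 4000, 1.0)
    keyword_score = 0.5 if any(k in prompt.lower() for k in HARD_KEYWORDS) else 0.0
    return min(length_score + keyword_score, 1.0)


def route(prompt: str) -> str:
    """Pick a model tier from the complexity score (thresholds are arbitrary)."""
    score = complexity_score(prompt)
    if score < 0.3:
        return CHEAP_MODEL
    if score < 0.7:
        return MID_MODEL
    return FRONTIER_MODEL


print(route("What are your opening hours?"))
print(route("Debug this stack trace from our checkout service."))
print(route("Prove the invariant holds. " + "context " * 300))
```

A real router would typically use a trained classifier or a quality signal from the cheap model's answer rather than keywords, and would need evaluation against labelled traffic before any savings claim holds.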

Other strategies include using cached input pricing (offered by several providers, including Google and DeepSeek), trimming prompts to reduce token usage, batching API calls, and self-hosting open-weight models for predictable, high-volume workloads.
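To gauge what cached input pricing might be worth, the sketch below recomputes the monthly input cost when part of the input is served from cache. Both the cached share (40%) and the cache discount (90% off) are hypothetical placeholders; actual discounts and eligibility rules differ by provider and should be taken from each provider's pricing page.

```python
def input_cost_with_cache(input_tokens_millions, price_per_million,
                          cached_share, cached_discount):
    """Monthly input cost when a share of input tokens is billed at a discounted cached rate."""
    if not 0.0 <= cached_share <= 1.0 or not 0.0 <= cached_discount <= 1.0:
        raise ValueError("cached_share and cached_discount must be between 0 and 1")
    cached_tokens = input_tokens_millions * cached_share
    uncached_tokens = input_tokens_millions - cached_tokens
    cached_price = price_per_million * (1 - cached_discount)
    return uncached_tokens * price_per_million + cached_tokens * cached_price


# 50M input tokens per month at $3.00 per 1M (a rate that appears in the breakdown).
without_cache = input_cost_with_cache(50, 3.00, cached_share=0.0, cached_discount=0.0)
# Hypothetical: 40% of input tokens hit the cache at a 90% discount.
with_cache = input_cost_with_cache(50, 3.00, cached_share=0.4, cached_discount=0.9)
print(f"${without_cache:.2f} without caching, ${with_cache:.2f} with caching")
# prints "$150.00 without caching, $96.00 with caching"
```

Caching only affects the input side of the bill, so workloads dominated by output tokens see a smaller overall reduction.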

Comparing Providers

See our full pricing index for a comprehensive comparison of all providers, including historical pricing trends. Or check the model leaderboard to understand the quality vs. cost tradeoffs.