Updated May 6, 2026

DeepSeek V4 Pro Cost & Pricing (May 2026)

Per-1M-token rates for every DeepSeek V4 Pro tier — DeepSeek API, cached input, partner hosting, and Apache 2.0 self-host. Cost-per-task estimates and a like-for-like comparison vs Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro.

$1.74 input / 1M · $3.48 output / 1M · 1M context · Apache 2.0 — self-host viable

Pricing tiers — every way to buy or run V4 Pro

| Tier | Input /1M | Output /1M | Notes |
|---|---|---|---|
| Standard (DeepSeek API) | $1.74 | $3.48 | List price on the official DeepSeek API. Lowest frontier-tier rate by a wide margin. |
| Cached input (DeepSeek) | $0.174 | $3.48 | Context caching: 90% off list on input cache hits. Aggressive — matches Anthropic on the discount rate but at a far lower base. |
| Self-hosted (Apache 2.0) | $0.00 | $0.00 | Per-token cost is $0 — you pay GPU infra instead. 1.6T MoE / 49B active; runs on 8x H100 or 4x H200. Operating cost dominated by GPU-hours, not tokens. |
| Hosted via partners | $1.40 | $2.80 | Together AI, Fireworks, and others typically host at 15-25% below DeepSeek list. Quality and tool-use parity vary; benchmark before adopting. |

Self-host break-even on a reserved 8x H100 node ($20-25/hr) is roughly 5-6B tokens/month at DeepSeek API list. Above that threshold, self-host wins on cost; below, the API wins.
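The break-even arithmetic can be sketched as below. The monthly hours, utilization, and input/output token mix are this sketch's own assumptions (a 50/50 mix is used for illustration); shifting the mix toward cheaper input tokens pushes the break-even point higher.

```python
# Break-even sketch: reserved 8x H100 node vs DeepSeek API list price.
# Assumptions (illustrative, not vendor-quoted): ~730 node-hours/month,
# full utilization, and a 50/50 input/output token mix.

HOURS_PER_MONTH = 730

def blended_rate_per_m(input_rate, output_rate, output_share):
    """Blended $/1M tokens when `output_share` of tokens are output."""
    return input_rate * (1 - output_share) + output_rate * output_share

def breakeven_tokens_per_month(node_hourly_cost, blended_rate):
    """Tokens/month at which reserved-node cost equals API cost."""
    monthly_infra = node_hourly_cost * HOURS_PER_MONTH
    return monthly_infra / blended_rate * 1_000_000

# DeepSeek API list: $1.74 in / $3.48 out.
rate = blended_rate_per_m(1.74, 3.48, 0.50)   # $2.61 per 1M tokens
low  = breakeven_tokens_per_month(20, rate)   # ~5.6B tokens/month at $20/hr
high = breakeven_tokens_per_month(25, rate)   # ~7.0B tokens/month at $25/hr
```

An output-heavier mix lowers the break-even, which is how the 5-6B figure above is reached; the takeaway is the order of magnitude, not the exact token count.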

Cost per task — what you actually pay

| Task | In tok | Out tok | DeepSeek API | Cached | Partner |
|---|---|---|---|---|---|
| Short chat reply | 800 | 200 | $0.0021 | $0.0008 | $0.0017 |
| Long-doc summary (50-page PDF) | 80,000 | 1,500 | $0.1444 | $0.0191 | $0.1162 |
| Agentic loop (12 tool turns) | 45,000 | 6,000 | $0.0992 | $0.0287 | $0.0798 |
| RAG query (10-doc context) | 12,000 | 600 | $0.0230 | $0.0042 | $0.0185 |

Cost per single invocation. Self-hosted column is omitted because the per-task cost is essentially $0 — total cost is a function of GPU-hours, not tokens.
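The table figures follow directly from the per-1M rates in the tier table. A minimal helper, with tier names of this sketch's own choosing:

```python
# Per-task cost helper: reproduces the per-invocation figures above.
# Rates are $/1M tokens from the pricing-tier table.

RATES = {
    "api":     (1.74,  3.48),
    "cached":  (0.174, 3.48),   # 90% off list on input cache hits
    "partner": (1.40,  2.80),
}

def task_cost(in_tok, out_tok, tier="api"):
    """Dollar cost of a single invocation at the given tier."""
    in_rate, out_rate = RATES[tier]
    return (in_tok * in_rate + out_tok * out_rate) / 1_000_000

# Long-doc summary row: 80,000 in / 1,500 out
round(task_cost(80_000, 1_500, "api"), 4)      # 0.1444
round(task_cost(80_000, 1_500, "cached"), 4)   # 0.0191
round(task_cost(80_000, 1_500, "partner"), 4)  # 0.1162
```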

V4 Pro vs nearest alternatives

| Model | In /1M | Out /1M | Context | Note |
|---|---|---|---|---|
| DeepSeek V4 Pro | $1.74 | $3.48 | 1M | This page. Apache 2.0, 1.6T MoE / 49B active. Best price-per-quality of any frontier model. |
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | ~3x input, ~7x output. Coding Arena #1 — pays back on engineering work, not on chat. |
| GPT-5.5 | $5.00 | $30.00 | 1M | ~3x input, ~9x output. Stronger ecosystem and voice tooling. |
| Gemini 3.1 Pro | $3.50 | $10.50 | 2M | 2x input, 3x output. Better long-context and multimodal. |
| DeepSeek V4 Flash | $0.14 | $0.28 | 1M | Same family, ~12x cheaper. Apache 2.0. Use as cascade tier for trivial work. |
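The Flash-as-cascade-tier idea is simple arithmetic: route easy requests to Flash and escalate the rest to Pro. The traffic split and request shape below are illustrative assumptions, not measured numbers:

```python
# Cascade sketch: send trivial work to V4 Flash, escalate the rest to V4 Pro.
# The 70% Flash share and 800/200-token request shape are assumptions.

FLASH = (0.14, 0.28)   # $/1M input, output
PRO   = (1.74, 3.48)

def cost(tokens_in, tokens_out, rates):
    return (tokens_in * rates[0] + tokens_out * rates[1]) / 1_000_000

def blended_cost(tokens_in, tokens_out, flash_share):
    """Average per-request cost if `flash_share` of traffic stays on Flash."""
    return (flash_share * cost(tokens_in, tokens_out, FLASH)
            + (1 - flash_share) * cost(tokens_in, tokens_out, PRO))

full  = cost(800, 200, PRO)          # all-Pro baseline: $0.002088/request
mixed = blended_cost(800, 200, 0.70) # 70% Flash: $0.000744/request, ~64% less
```

The real routing decision (a classifier, a heuristic, or a confidence threshold) is the hard part; the pricing side is just this weighted average.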

When DeepSeek V4 Pro is the right choice

  • Cost-sensitive scale. Best price-per-quality of any frontier-tier model. At high volume, the bill delta vs closed frontier models is several multiples.
  • Sovereignty and self-host. Apache 2.0 weights make on-prem and air-gapped deployments viable. No vendor can deprecate or reprice the model out from under you.
  • Agentic loops at scale. Once tool-use reliability is acceptable for your domain, the per-loop cost advantage compounds across millions of invocations.
  • RAG, classification, extraction. Quality is indistinguishable from frontier closed models on these tasks in practice.

When to switch to a more expensive alternative

  • Claude Opus 4.7 — when coding quality and tool-use reliability on long agentic chains drive the bottom line. The Coding Arena gap is real.
  • Gemini 3.1 Pro — when you need 2M context or multimodal pipelines.
  • GPT-5.5 — when ecosystem (voice, Assistants API, vision tooling) outweighs the price delta.
  • Data-residency concerns. The official DeepSeek API processes data in PRC infrastructure. Use partner-hosted or self-hosted for regulated workloads.

Related

Teams running V4 Pro alongside other providers typically front the API with Swfte Connect to route across these models behind one OpenAI-compatible surface with prompt caching and per-route fallback.

Sources: official DeepSeek pricing page and partner pricing pages, as of May 6, 2026. Self-host break-even modeled on reserved 8x H100 cloud rates.