DeepSeek V4 Flash

DeepSeek · Fast · New · Open Source

284B MoE / 13B active. Apache 2.0, 1M context. Among the cheapest frontier-adjacent models — $0.14 input, $0.28 output per 1M tokens.

Context Window

1M

tokens

Max Output

16K

tokens

Input Price

$0.14

per 1M tokens

Output Price

$0.28

per 1M tokens

Speed

218

tokens/sec

Released

Apr 24, 2026

Blended Cost

$0.21

per 1M tokens

Value Score

371.4

quality per $

Capabilities

Chat · Function Calling · Code Generation

Benchmarks

Quality Index
78
MMLU Pro
80.4
HumanEval (Coding)
84.2
MATH
76.8
Arena ELO
1392

Open Source — Licensed under Apache 2.0

About DeepSeek V4 Flash

DeepSeek V4 Flash is a fast mixture-of-experts model by DeepSeek (284B total parameters, 13B active), released on April 24, 2026. It supports a context window of 1M tokens and can generate up to 16K output tokens.

At $0.14 per million input tokens and $0.28 per million output tokens, its blended cost of $0.21/1M tokens (the average of the input and output prices) makes it one of the most affordable models available. Its value score of 371.4 (quality index of 78 divided by blended cost) reflects that balance of quality and price.
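The figures above can be reproduced with a short sketch. The formulas are assumptions inferred from the numbers on this page: blended cost as a simple 1:1 average of input and output prices, and value score as quality index per dollar of blended cost.

```python
def blended_cost(input_price: float, output_price: float) -> float:
    """Per-1M-token price, assuming an equal input/output token mix."""
    return (input_price + output_price) / 2


def value_score(quality_index: float, blended: float) -> float:
    """Quality points per dollar of blended cost (assumed formula)."""
    return quality_index / blended


cost = blended_cost(0.14, 0.28)   # $0.21 per 1M tokens
score = value_score(78, cost)     # about 371.4
print(round(cost, 2), round(score, 1))
```

A different input/output mix (e.g. 3:1 for chat workloads) would shift the blended figure, so treat $0.21 as a convention rather than a guaranteed effective rate.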

DeepSeek V4 Flash is available as an open-source model under the Apache 2.0 license, meaning you can self-host it for predictable costs or use it through API providers like Swfte Connect.

Using DeepSeek V4 Flash with Swfte

Access DeepSeek V4 Flash through Swfte Connect, our unified LLM gateway. Connect gives you a single API for 50+ models, with automatic routing, cost optimization, and fallback handling. You can also try DeepSeek V4 Flash in our AI Playground before integrating.
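As a rough sketch of what a Connect call might look like: the endpoint URL, model identifier, and auth scheme below are all assumptions (most LLM gateways expose an OpenAI-compatible chat-completions API), not documented Swfte values.

```python
import json

# Placeholder endpoint; substitute the real Swfte Connect base URL.
BASE_URL = "https://api.example.com/v1/chat/completions"

payload = {
    "model": "deepseek-v4-flash",  # assumed model identifier
    "messages": [
        {"role": "user", "content": "Summarize MoE routing in one sentence."}
    ],
    "max_tokens": 512,  # well under the model's 16K output cap
}

body = json.dumps(payload)
# Send with any HTTP client, e.g.:
#   requests.post(BASE_URL, data=body,
#                 headers={"Authorization": f"Bearer {API_KEY}",
#                          "Content-Type": "application/json"})
print(json.loads(body)["model"])
```

The fallback handling and routing mentioned above would typically be configured gateway-side, so the request body stays the same regardless of which upstream provider serves it.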