DeepSeek V4 Flash
284B MoE / 13B active. Apache 2.0, 1M context. Among the cheapest frontier-adjacent models — $0.14 input, $0.28 output per 1M tokens.
Context window: 1M tokens
Max output: 16K tokens
Input price: $0.14 per 1M tokens
Output price: $0.28 per 1M tokens
Speed: 218 tokens/sec
Released: April 24, 2026
Blended price: $0.21 per 1M tokens
Value score: 371.4 (quality per $)
Open Source — Licensed under Apache 2.0
About DeepSeek V4 Flash
DeepSeek V4 Flash is a fast AI model by DeepSeek, released on April 24, 2026. It supports a context window of 1M tokens and can generate up to 16K output tokens.
At $0.14 per million input tokens and $0.28 per million output tokens, its blended cost of $0.21 per 1M tokens (the average of the input and output rates) makes it one of the most affordable models available. Its value score of 371.4 reflects how much benchmark quality it delivers per dollar spent.
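The arithmetic behind these figures is straightforward. A minimal sketch of estimating request cost at the published rates (the rates come from this page; the token counts below are made-up examples):

```python
# Published per-1M-token rates for DeepSeek V4 Flash (from this page).
INPUT_RATE = 0.14   # USD per 1M input tokens
OUTPUT_RATE = 0.28  # USD per 1M output tokens

# Blended rate as quoted here: the simple average of the two rates.
blended = (INPUT_RATE + OUTPUT_RATE) / 2
print(f"Blended: ${blended:.2f}/1M tokens")  # Blended: $0.21/1M tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request at these rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE + \
           (output_tokens / 1_000_000) * OUTPUT_RATE

# Example: a 100K-token prompt producing a 20K-token completion.
print(f"${request_cost(100_000, 20_000):.4f}")  # $0.0196
```

Note that providers sometimes weight blended cost by a 3:1 input-to-output ratio instead; the 1:1 average is what matches the $0.21 figure quoted here.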
DeepSeek V4 Flash is available as an open-source model under the Apache 2.0 license, meaning you can self-host it for predictable costs or use it through API providers like Swfte Connect.
Using DeepSeek V4 Flash with Swfte
Access DeepSeek V4 Flash through Swfte Connect, our unified LLM gateway. Connect gives you a single API for 50+ models, with automatic routing, cost optimization, and fallback handling. You can also try DeepSeek V4 Flash in our AI Playground before integrating.
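Connect's exact endpoint and model slug are not documented on this page, so the sketch below assumes an OpenAI-compatible chat-completions API; the base URL, model name, and header layout are all placeholder assumptions to check against the Connect docs.

```python
import json
import urllib.request

# Placeholder values -- the real base URL, model slug, and API key come
# from your Swfte Connect account; these are assumptions, not documented.
BASE_URL = "https://connect.swfte.example/v1"
MODEL = "deepseek-v4-flash"
API_KEY = "YOUR_API_KEY"

def build_request(prompt: str, max_tokens: int = 1024) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for the gateway."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # this model caps output at 16K tokens
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Summarize the Apache 2.0 license in one sentence.")
# urllib.request.urlopen(req) would send it; omitted here because the
# endpoint above is a placeholder.
print(req.full_url)
```

Because the model is Apache 2.0, the same request shape works against a self-hosted OpenAI-compatible server (e.g. vLLM) by pointing BASE_URL at your own deployment.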