What is the cheapest LLM for translation in 2026?

DeepSeek V4 Flash at $0.14 input / $0.28 output per 1M tokens is the cheapest token-priced LLM, costing roughly $42 per 100K translations on the EN-ES web-copy scenario. DeepL Pro on a per-character basis is in a similar ballpark but with very different quality characteristics on long-tail languages.

Is DeepL still cheaper than LLM-based translation?

For pure European-language pairs (EN-ES, EN-FR, EN-DE), DeepL Pro at $25 per 1M characters is competitive with the cheapest LLM tier. It is materially cheaper than GPT-5.5 or Claude Opus 4.7. For non-European, low-resource, or context-sensitive translation (legal, marketing, brand voice), an LLM with explicit prompting wins.

Which LLM has the best multilingual coverage?

Gemini 3.1 Pro leads on raw multilingual benchmark scores in 2026, especially for low-resource languages (Swahili, Bengali, Tagalog). Qwen 3.6 Plus is the strongest for CJK pairs. DeepSeek V4 Pro is a value pick for high-volume EN-ZH workloads.

When should I use an LLM instead of DeepL or Google Translate?

When the translation needs context: a brand-voice glossary, a legal or medical domain, a tone instruction (formal vs casual), or a translation that must reference earlier paragraphs. LLMs follow instructions; classical NMT systems do not.

How do I cut translation costs at scale?

Three levers: (1) cache the system prompt — your brand glossary is the same on every call, (2) use batch tier (50% off on most providers) for non-real-time content, (3) cascade — push templated marketing copy to DeepSeek V4 Flash, route only nuanced or legal copy to Gemini 3.1 Pro or Claude Opus 4.7.

Cost of Translation: AI Model Pricing Compared (June 2026)

Translation is the highest-volume LLM workload in production today. We price the canonical web-copy translation (EN-ES, ~1K in / 1.2K out) across every major LLM, and compare it to DeepL Pro on the same workload.

The reference scenario

Task: Translate 1,000 input tokens to 1,200 output tokens (typical EN→ES web copy)
Input tokens per call: 1,000
Output tokens per call: 1,200
Monthly volume: 100,000 translations (e-commerce / SaaS localization workload)
Total tokens / month: 220M

Output tokens exceed input tokens because Spanish averages ~20% longer than English at the token level.

Cost across 10 models, sorted cheapest first

Rank	Model	Per call	Per month	vs cheapest
1	DeepSeek V4 Flash	$0.000340	$34.00	—
2	Gemini 2.0 Flash	$0.000580	$58.00	1.7x
3	Claude 3.5 Haiku	$0.0056	$560	16.5x
4	DeepSeek V4 Pro	$0.0059	$592	17.4x
5	Qwen 3.6 Plus	$0.0081	$812	23.9x
6	Gemini 3.1 Pro	$0.0164	$1,640	48.2x
7	Claude Sonnet 4	$0.0210	$2,100	61.8x
8	Claude Opus 4.7	$0.0350	$3,500	102.9x
9	GPT-5.5	$0.0410	$4,100	120.6x
10	GPT-5.5 Pro	$0.2460	$24,600	723.5x

DeepL Pro reference on the same workload (per-character pricing, ~5K chars per call): ~$0.000125 per call / ~$12.50 per month. Not in the table because the pricing model is different (per-character, not per-token).

Monthly spend at 100K translations

DeepSeek V4 Flash      #................................... $34.00
Gemini 2.0 Flash       #................................... $58.00
Claude 3.5 Haiku       #................................... $560
DeepSeek V4 Pro        #................................... $592
Qwen 3.6 Plus          #................................... $812
Gemini 3.1 Pro         ##.................................. $1,640
Claude Sonnet 4        ###................................. $2,100
Claude Opus 4.7        #####............................... $3,500
GPT-5.5                ######.............................. $4,100
GPT-5.5 Pro            #################################### $24,600

Per-call cost

DeepSeek V4 Flash      #............................. $0.000340
Gemini 2.0 Flash       #............................. $0.000580
Claude 3.5 Haiku       #............................. $0.0056
DeepSeek V4 Pro        #............................. $0.0059
Qwen 3.6 Plus          #............................. $0.0081
Gemini 3.1 Pro         ##............................ $0.0164
Claude Sonnet 4        ###........................... $0.0210
Claude Opus 4.7        ####.......................... $0.0350
GPT-5.5                #####......................... $0.0410
GPT-5.5 Pro            ############################## $0.2460

Which model wins for translation?

For purely European languages (EN-ES, EN-FR, EN-DE, EN-IT): DeepL Pro is the value leader and quality leader. It is purpose-built for translation, has explicit glossary support, and the per-character pricing is competitive with the cheapest LLM tier. Most teams shipping European localization at scale use DeepL by default.

Recommended LLM pick: Gemini 3.1 Pro is our top LLM-based pick. The multilingual coverage is best-in-class, the long context allows you to send a brand-voice glossary alongside the source text, and the $3.50 / $10.50 per 1M tokens pricing is reasonable. Runner-up: DeepSeek V4 Pro, which is roughly 6x cheaper than Gemini 3.1 Pro and very strong for EN-ZH and CJK language pairs.

When to use a cheap model

Templated copy: product descriptions, FAQs, terms-of-service updates
High-volume bulk localization where human review is in the loop
Internal tooling translation (admin UIs, error messages)
Short strings (under 200 tokens each)
Languages well-represented in the model (EN-ES, EN-PT, EN-ZH)

When to use a frontier model

Marketing copy where brand voice matters
Legal / contractual / medical translation
Low-resource languages (Swahili, Bengali, Tagalog, Yoruba)
Translation with embedded context (subtitles, dialog, conversational tone)
Translation + transcreation (cultural adaptation, not just literal)

DeepL is not on the token table — but should be in your stack

DeepL prices per character, not per token. On a typical 5K-char EN-ES translation, DeepL Pro costs around $0.000125, putting it in the same ballpark as DeepSeek V4 Flash. The advantage of DeepL is purpose-built quality on European pairs and an explicit glossary product. The disadvantage is no instruction following — you cannot ask DeepL to "translate in a friendly tone, prefer Latin American Spanish, and keep technical terms in English." For that, an LLM is the only option.

Pricing data sourced from official provider pages and OpenRouter, May 2026-05-06. DeepL pricing reflects the Pro plan published rate. Effective production cost will be 1.5-2x higher after retries, system prompts, and priority-tier surcharges.