DeepSeek V4 Pro

DeepSeek · Flagship
Thinking · Tool Use · Structured Output

About this model

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

Performance Tier

Flagship

DeepSeek V4 Pro is the flagship model in DeepSeek's lineup: their most capable offering.

Best-in-class model from this provider. Highest performance across benchmarks, ideal for demanding tasks.

Pricing

This model is included in Elosia plans.

Affordable: low cost, suitable for sustained use and high-volume interactions.

Input (prompt): $0.435 per 1M tokens
Output (completion): $0.870 per 1M tokens
Cache read: $0.0036 per 1M tokens
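Given the per-1M-token rates above, the per-request cost is simple arithmetic. The sketch below is a minimal estimator, assuming a single call whose prompt is partly served from cache (cached tokens billed at the cache-read rate, the rest at the input rate); the rates are copied from the pricing table.

```python
# Hedged sketch: estimate one request's USD cost from the per-1M-token rates.
RATES = {                 # USD per 1M tokens, from the pricing table above
    "input": 0.435,
    "output": 0.870,
    "cache_read": 0.0036,
}

def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimated USD cost of a single request.

    Cached prompt tokens are billed at the cache-read rate; the remainder
    of the prompt is billed at the full input rate.
    """
    uncached = input_tokens - cached_tokens
    return (
        uncached * RATES["input"]
        + cached_tokens * RATES["cache_read"]
        + output_tokens * RATES["output"]
    ) / 1_000_000

# Example: 100K-token prompt (80K cached) producing a 4K-token completion.
print(f"${request_cost(100_000, 4_000, cached_tokens=80_000):.4f}")  # prints $0.0125
```

Note how heavily the cache-read discount dominates here: the 80K cached tokens cost $0.000288, versus $0.0348 had they been billed at the full input rate.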

Capabilities

Context Length: 1.0M tokens
Max Output Tokens: 384K
Tokenizer: DeepSeek
Input: text
Output: text
Release Date: April 24, 2026
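The structured-output capability listed above is typically exercised through an OpenAI-compatible chat-completions request. The sketch below builds such a request body; the model ID `deepseek-v4-pro` and the exact `response_format` shape are assumptions following the common OpenAI-compatible convention, not confirmed values for this provider.

```python
import json

def build_request(prompt: str, schema: dict) -> dict:
    """Build a chat-completions body requesting JSON-schema structured output.

    Assumptions (not from the model card): the model ID string and the
    json_schema response_format shape follow the OpenAI-compatible convention.
    """
    return {
        "model": "deepseek-v4-pro",           # hypothetical model ID
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {                   # constrain output to a JSON schema
            "type": "json_schema",
            "json_schema": {"name": "answer", "schema": schema},
        },
        "max_tokens": 8192,                    # well under the 384K output cap
    }

body = build_request(
    "Extract the release year.",
    {"type": "object", "properties": {"year": {"type": "integer"}}},
)
print(json.dumps(body, indent=2))
```

With a 1M-token context window, the practical limit on such a request is usually the prompt size rather than `max_tokens`.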

Benchmarks

General Intelligence
  • MMLU: 90.1%
  • MMLU-Pro: 87.5%
  • GPQA Diamond: 90.1%

Mathematics
  • MATH-500: not reported

Programming
  • HumanEval: 76.8%
  • SWE-bench Verified: 80.6%
  • LiveCodeBench: 93.5%

Reasoning
  • IFEval: not reported
  • Humanity's Last Exam: 37.7%

Agentic
  • SWE-bench Pro: 55.4%
  • Terminal-Bench 2.0: 67.9%

Recommended Use Cases

Coding · Mathematics · Analysis · Research

Strengths

  • Open-weight MIT — self-hostable frontier model with no vendor lock-in
  • Coding leader at its price point (LiveCodeBench 93.5, SWE-bench Verified 80.6)
  • MoE 1.6T total / 49B active params — frontier capability with efficient inference
  • 1M-token context with hybrid sparse attention (10% of v3.2 KV-cache footprint)
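The MoE figures in the list above imply that only a small fraction of the network fires per token, which is where the inference efficiency comes from. A quick check of the arithmetic, using the parameter counts from the strengths list:

```python
# Parameter counts from the strengths list above.
total_params = 1.6e12    # 1.6T total parameters
active_params = 49e9     # 49B activated per token

fraction = active_params / total_params
print(f"{fraction:.1%} of parameters active per token")  # prints 3.1% of parameters active per token
```

So per-token compute scales with the ~49B active parameters, not the 1.6T total, although all 1.6T must still fit in (possibly distributed) memory when self-hosting.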

Limitations

  • Trails frontier proprietary models on factual recall (SimpleQA-Verified 57.9 vs Gemini 75.6)
  • Humanity's Last Exam (37.7) below Claude Opus 4.7 and Gemini 3.1 Pro
  • Less extensive safety tuning than Claude/GPT

Data Use

This model may use your data for training.