DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts (MoE) model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference.
Performance Tier
Balanced
DeepSeek V4 Flash is a balanced model from DeepSeek: strong performance at a reasonable price.
Strong cost-performance ratio. Reliable for most professional use cases without premium pricing.
Pricing
This model is included in Elosia plans
Eco
Minimal cost. Ideal for very high volume or simple tasks.
Type
per 1M tokens
Input (prompt)
$0.140
Output (completion)
$0.280
Cache read
$0.0028
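As a rough illustration of the per-1M-token rates above, the sketch below estimates the cost of a single request. The helper name and the example token counts are hypothetical; the rates are the listed input, output, and cache-read prices.

```python
# Per-1M-token rates from the pricing table above (USD).
INPUT_RATE = 0.140
OUTPUT_RATE = 0.280
CACHE_READ_RATE = 0.0028

def estimate_cost(fresh_input_tokens: int,
                  cached_input_tokens: int,
                  output_tokens: int) -> float:
    """Estimate request cost in USD from token counts.

    Fresh prompt tokens are billed at the input rate, cache hits
    at the cache-read rate, and completion tokens at the output rate.
    """
    return (fresh_input_tokens / 1_000_000 * INPUT_RATE
            + cached_input_tokens / 1_000_000 * CACHE_READ_RATE
            + output_tokens / 1_000_000 * OUTPUT_RATE)

# Example: 100K fresh input, 100K cached input, 50K output.
print(f"${estimate_cost(100_000, 100_000, 50_000):.5f}")  # → $0.02828
```

Note how heavily cache reads are discounted: a fully cached prompt costs 2% of what the same prompt costs uncached.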
Capabilities
Context Length: 1.0M
Max Output Tokens: 384K
Tokenizer: DeepSeek
Input: text
Output: text
Release Date: April 24, 2026
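The context and output limits above can be validated before sending a request. A minimal sketch follows; the function name is hypothetical, and it assumes completion tokens count against the context window, which is typical but not stated on this page.

```python
# Limits from the capabilities table above.
MAX_CONTEXT_TOKENS = 1_000_000   # 1.0M context window
MAX_OUTPUT_TOKENS = 384_000      # 384K max output tokens

def fits_limits(prompt_tokens: int, max_output: int) -> bool:
    """Return True if a request stays within the model's limits.

    Assumes the completion budget counts against the context
    window alongside the prompt.
    """
    return (max_output <= MAX_OUTPUT_TOKENS
            and prompt_tokens + max_output <= MAX_CONTEXT_TOKENS)

print(fits_limits(900_000, 50_000))   # within both limits
print(fits_limits(100_000, 400_000))  # exceeds the 384K output cap
```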
Benchmarks
General Intelligence
MMLU
88.7%
MMLU-Pro
86.2%
GPQA Diamond
88.1%
Mathematics
MATH-500
Not reported
Programming
HumanEval
69.5%
SWE-bench Verified
79.0%
LiveCodeBench
91.6%
Agentic
Terminal-Bench 2.0
56.9%
Recommended Use Cases
Coding, Mathematics, Analysis, General Chat
Strengths
Outstanding cost-to-performance — ~3x cheaper than v4 Pro for near-frontier reasoning
MoE 284B total / 13B active — high throughput ideal for agents and coding assistants
1M-token context with the same hybrid sparse attention as v4 Pro
Open-weight MIT — deployable on-prem
Limitations
Lower factual knowledge density than v4 Pro for recall-heavy tasks
Max reasoning mode adds significant latency — not ideal for real-time UX
Smaller MoE — weaker performance on creative writing vs proprietary models