
DeepSeek v4 Flash

DeepSeek · Balanced
Thinking · Tool Use · Structured Output

About this model

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Performance Tier

Balanced

DeepSeek v4 Flash is a balanced model from DeepSeek: strong performance at a reasonable price.

Strong cost-performance ratio. Reliable for most professional use cases without premium pricing.

Pricing

This model is included in Elosia plans.

Eco tier: minimal cost. Ideal for very high-volume or simple tasks.

Type                  Price per 1M tokens
Input (prompt)        $0.140
Output (completion)   $0.280
Cache read            $0.0028
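The per-1M-token rates above can be turned into a per-request estimate with simple arithmetic. A minimal sketch, using only the rates from the pricing table; the helper name and the example token counts are illustrative, not part of any real billing API:

```python
# Rates copied from the pricing table above ($ per 1M tokens).
INPUT_PER_M = 0.140        # input (prompt) tokens
OUTPUT_PER_M = 0.280       # output (completion) tokens
CACHE_READ_PER_M = 0.0028  # cached prompt tokens read back

def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimated USD cost; cached_tokens are the portion of the prompt
    billed at the cheaper cache-read rate."""
    fresh = input_tokens - cached_tokens
    return (fresh * INPUT_PER_M
            + cached_tokens * CACHE_READ_PER_M
            + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 100K-token prompt (80K of it cached) with a 4K-token completion
print(f"${estimate_cost(100_000, 4_000, cached_tokens=80_000):.6f}")
```

Note how the cache-read rate (50x cheaper than fresh input) dominates the savings for long, mostly-repeated prompts, which is the pattern agentic loops produce.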

Capabilities

Context Length: 1.0M
Max Output Tokens: 384K
Tokenizer: DeepSeek
Input: text
Output: text
Release Date: April 24, 2026
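The two limits above interact: a request fails if the completion budget exceeds the output cap, or if prompt plus completion overflow the context window. A minimal pre-flight check, assuming decimal K (1M = 1,000,000 tokens; some providers use powers of two, so treat the constants as assumptions):

```python
# Limits from the capabilities table above, assuming decimal multipliers.
CONTEXT_LIMIT = 1_000_000  # 1.0M-token context window
MAX_OUTPUT = 384_000       # 384K-token output cap

def fits(prompt_tokens, requested_output_tokens):
    """True if the prompt and the requested completion both fit the limits."""
    if requested_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_LIMIT

print(fits(900_000, 50_000))   # 950K total, within the window
print(fits(700_000, 400_000))  # exceeds the output cap
```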

Benchmarks

General Intelligence
  • MMLU: 88.7%
  • MMLU-Pro: 86.2%
  • GPQA Diamond: 88.1%

Mathematics
  • MATH-500: not reported

Programming
  • HumanEval: 69.5%
  • SWE-bench Verified: 79%
  • LiveCodeBench: 91.6%

Agentic
  • Terminal-Bench 2.0: 56.9%

Recommended Use Cases

Coding · Mathematics · Analysis · General Chat

Strengths

  • Outstanding cost-to-performance — ~3x cheaper than v4 Pro for near-frontier reasoning
  • MoE 284B total / 13B active — high throughput ideal for agents and coding assistants
  • 1M-token context with the same hybrid sparse attention as v4 Pro
  • Open-weight MIT — deployable on-prem

Limitations

  • Lower factual knowledge density than v4 Pro for recall-heavy tasks
  • Max reasoning mode adds significant latency — not ideal for real-time UX
  • Smaller MoE — weaker performance on creative writing vs proprietary models

Resources

This model may use your data for training
