Qwen 3.5 Flash

Qwen, Balanced
Thinking, Tool Use, Vision, Structured Output

About this model

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

Performance Tier

Balanced

Qwen 3.5 Flash is a balanced model from Qwen: strong performance at a reasonable price.

Strong cost-performance ratio. Reliable for most professional use cases without premium pricing.

Pricing

This model is included in Elosia plans
Eco

Minimal cost. Ideal for very high volume or simple tasks.

Type                  Price per 1M tokens
Input (prompt)        $0.065
Output (completion)   $0.260
Cache write           $0.081
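
As a rough illustration of how these rates translate into per-request cost, here is a minimal sketch. The prices are copied from the table above. The function name `estimate_cost` and the assumption that cache-write tokens are billed in addition to input and output tokens are illustrative only, and actual billing may differ.

```python
# Prices in USD per 1M tokens, copied from the pricing table above.
PRICE_PER_M = {"input": 0.065, "output": 0.260, "cache_write": 0.081}


def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_write_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: the three token categories are billed independently
    and simply summed.
    """
    total = (
        input_tokens * PRICE_PER_M["input"]
        + output_tokens * PRICE_PER_M["output"]
        + cache_write_tokens * PRICE_PER_M["cache_write"]
    )
    return total / 1_000_000


# Example: a 200K-token prompt with a 4K-token completion.
cost = estimate_cost(input_tokens=200_000, output_tokens=4_000)
print(f"${cost:.4f}")
```

At these rates, output tokens cost four times as much as input tokens, so long completions dominate the bill sooner than long prompts do.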

Capabilities

Context Length: 1.0M
Max Output Tokens: 66K
Tokenizer: Qwen3
Input: text, image, video
Output: text
Release Date: February 25, 2026
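
The limits above can be turned into a simple pre-flight check before sending a request. This is a sketch under stated assumptions: the rounded figures "1.0M" and "66K" are taken as exactly 1,000,000 and 66,000 tokens, the completion is assumed to count against the context window, and `fits_limits` is a hypothetical helper, not part of any API.

```python
# Assumption: the rounded figures on this page are used as exact limits.
CONTEXT_LIMIT = 1_000_000  # "1.0M" context length
MAX_OUTPUT = 66_000        # "66K" max output tokens


def fits_limits(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within both limits (hypothetical helper).

    Assumption: prompt and completion tokens share the context window.
    """
    if max_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_LIMIT


print(fits_limits(900_000, 50_000))   # within both limits
print(fits_limits(980_000, 50_000))   # exceeds the context window
print(fits_limits(10_000, 100_000))   # exceeds the output cap
```

Token counts should come from the model's own tokenizer (Qwen3, per the list above), since counts from other tokenizers can differ noticeably.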

Benchmarks

General Intelligence
  • MMLU: Not reported
  • GPQA Diamond: 84.2%
Mathematics
  • MATH-500: Not reported
Programming
  • HumanEval: Not reported
  • SWE-bench Verified: 69.2%
Reasoning
  • IFEval: 91.9%
Multimodal
  • MMMU-Pro: 75.1%

Recommended Use Cases

General Chat, Coding, Analysis, Research, Translation

Strengths

  • Frontier-class performance with only 3B active parameters (MoE)
  • Extremely affordable ($0.10/M input) with near-flagship quality
  • 1M token context window via API
  • Native vision support (images, video, documents)
  • Supports 201 languages, including French

Limitations

  • Below Qwen 3.5 27B dense on most benchmarks
  • Tool use score (BFCL-V4: 67.3) below competitors
  • No published MMLU, HumanEval, or MATH-500 scores

Resources

This model may use your data for training
