
Qwen 3.5 Flash

Qwen · Balanced
Thinking · Tool Use · Vision · Structured Output

About this model

The Qwen 3.5 Flash models are native vision-language models built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts (MoE) design for higher inference efficiency. Compared to the Qwen 3 series, they deliver a clear step up on both pure-text and multimodal tasks, offering fast response times while balancing inference speed against overall quality.

Performance Tier

Balanced

Qwen 3.5 Flash is a balanced model from Qwen: strong performance at a reasonable price.

Strong cost-performance ratio. Reliable for most professional use cases without premium pricing.

Pricing

This model is included in Elosia plans
Price per 1M tokens:

  • Input (prompt): $0.065
  • Output (completion): $0.260
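The per-token rates above translate into a simple per-request cost formula. A minimal sketch, using the listed rates (the function name and the example token counts are illustrative, not part of any official SDK):

```python
# Listed Qwen 3.5 Flash rates on this plan, in USD per 1M tokens.
INPUT_PER_M = 0.065   # input (prompt) tokens
OUTPUT_PER_M = 0.260  # output (completion) tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: a 20k-token prompt producing a 2k-token completion.
# 20_000 * 0.065/1e6 + 2_000 * 0.260/1e6 = 0.0013 + 0.00052 = $0.00182
print(f"${estimate_cost(20_000, 2_000):.6f}")
```

Note that output tokens cost 4x input tokens here, so long completions dominate the bill even for large prompts.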

Capabilities

  • Context Length: 1.0M tokens
  • Max Output Tokens: 66K
  • Tokenizer: Qwen3
  • Input: text, image, video
  • Output: text
  • Release Date: February 25, 2026
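The two limits above interact: the requested completion counts against the shared context window as well as the 66K output cap. A small pre-flight check, assuming the listed limits are exact token counts (1.0M context, 66K max output):

```python
# Assumed limits taken from the capabilities listing above.
CONTEXT_LIMIT = 1_000_000  # total tokens (prompt + completion)
MAX_OUTPUT = 66_000        # completion tokens per request

def request_fits(prompt_tokens: int, requested_output: int) -> bool:
    """True if the prompt plus the requested completion fits both limits."""
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_LIMIT

print(request_fits(900_000, 50_000))   # fits: 950_000 <= 1_000_000
print(request_fits(980_000, 66_000))   # too large: 1_046_000 > 1_000_000
```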

Benchmarks

General Intelligence
  • MMLU: Not reported
  • GPQA Diamond: 84.2%

Mathematics
  • MATH-500: Not reported

Programming
  • HumanEval: Not reported
  • SWE-bench Verified: 69.2%

Reasoning
  • IFEval: 91.9%

Multimodal
  • MMMU-Pro: 75.1%

Recommended Use Cases

General Chat · Coding · Analysis · Research · Translation

Strengths

  • Frontier-class performance with only 3B active parameters (MoE)
  • Extremely affordable ($0.065 per 1M input tokens on this plan) with near-flagship quality
  • 1M token context window via API
  • Native vision support (images, video, documents)
  • 201 languages supported, including French

Limitations

  • Trails the dense Qwen 3.5 27B on most benchmarks
  • Tool-use score (BFCL-V4: 67.3) below competitors
  • No published MMLU, HumanEval, or MATH-500 scores

Resources

This model may use your data for training
