The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that combines linear attention with a sparse mixture-of-experts (MoE) design, improving inference efficiency. Compared with the Qwen3 series, they deliver a marked performance leap on both pure-text and multimodal tasks, offering fast response times while balancing inference speed against overall quality.
Performance Tier
Balanced
Qwen3.5 Flash is a balanced model from Qwen: strong performance at a reasonable price.
Strong cost-performance ratio. Reliable for most professional use cases without premium pricing.
Pricing
This model is included in Elosia plans
Input (prompt): $0.065 per 1M tokens
Output (completion): $0.260 per 1M tokens
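As a quick sanity check on these rates, the cost of a single request follows directly from the token counts. A minimal sketch (the request sizes below are hypothetical, not from the source):

```python
# Qwen3.5 Flash pricing, in USD per 1M tokens (from the table above)
INPUT_PRICE_PER_M = 0.065
OUTPUT_PRICE_PER_M = 0.260

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at these per-1M-token rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Hypothetical request: a 10K-token prompt with a 2K-token completion
cost = request_cost(10_000, 2_000)
print(f"${cost:.6f}")  # 0.00065 (input) + 0.00052 (output) = $0.001170
```

Note that output tokens cost 4x input tokens, so long completions dominate the bill.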
Capabilities
Context Length: 1.0M
Max Output Tokens: 66K
Tokenizer: Qwen3
Input: text, image, video
Output: text
Release Date: February 25, 2026
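The two limits above interact when sizing a request. A minimal validity check, under the assumption (not stated in the source) that prompt and completion share the context window:

```python
# Context limits for Qwen3.5 Flash (from the table above)
CONTEXT_LENGTH = 1_000_000   # 1.0M-token context window
MAX_OUTPUT_TOKENS = 66_000   # 66K-token completion cap

def fits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the output cap and the context window,
    assuming the prompt and completion share the window."""
    return (requested_output_tokens <= MAX_OUTPUT_TOKENS
            and prompt_tokens + requested_output_tokens <= CONTEXT_LENGTH)

print(fits(900_000, 50_000))   # within both limits
print(fits(900_000, 70_000))   # exceeds the 66K output cap
```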
Benchmarks
General Intelligence
MMLU: Not reported
GPQA Diamond: 84.2%
Mathematics
MATH-500: Not reported
Programming
HumanEval: Not reported
SWE-bench Verified: 69.2%
Reasoning
IFEval: 91.9%
Multimodal
MMMU-Pro: 75.1%
Recommended Use Cases
General Chat, Coding, Analysis, Research, Translation
Strengths
Frontier-class performance with only 3B active parameters (MoE)
Extremely affordable ($0.065/M input tokens) with near-flagship quality