The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...
Performance Tier
Balanced
Qwen 3.5 Flash is a balanced model from Qwen: strong performance at a reasonable price.
Strong cost-performance ratio. Reliable for most professional use cases without premium pricing.
Pricing
This model is included in Elosia plans
Eco
Minimal cost. Ideal for very high volume or simple tasks.
Price per 1M tokens
Input (prompt): $0.065
Output (completion): $0.260
Cache write: $0.081
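As a rough illustration of how the per-1M-token prices listed above translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost` and the token counts are hypothetical, and the sketch assumes cache-write tokens are billed on top of regular input tokens, which this page does not confirm.

```python
from decimal import Decimal

# USD prices per 1M tokens, copied from the pricing list on this page.
PRICE_PER_MILLION = {
    "input": Decimal("0.065"),
    "output": Decimal("0.260"),
    "cache_write": Decimal("0.081"),
}


def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_write_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cache-write tokens are charged in addition to input
    tokens. Check the provider's billing rules before relying on this.
    """
    total = (
        PRICE_PER_MILLION["input"] * input_tokens
        + PRICE_PER_MILLION["output"] * output_tokens
        + PRICE_PER_MILLION["cache_write"] * cache_write_tokens
    )
    # Prices are quoted per one million tokens.
    return total / Decimal(1_000_000)


# Example: a 200,000-token prompt with a 10,000-token completion.
cost = estimate_cost(200_000, 10_000)
```

For the example request, the input side comes to 200,000 × $0.065 / 1M = $0.0130 and the output side to 10,000 × $0.260 / 1M = $0.0026, for a total of $0.0156. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.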
Capabilities
Context Length: 1.0M
Max Output Tokens: 66K
Tokenizer: Qwen3
Input: text, image, video
Output: text
Release Date: February 25, 2026
Benchmarks
General Intelligence
MMLU: Not reported
GPQA Diamond: 84.2%
Mathematics
MATH-500: Not reported
Programming
HumanEval: Not reported
SWE-bench Verified: 69.2%
Reasoning
IFEval: 91.9%
Multimodal
MMMU-Pro: 75.1%
Recommended Use Cases
General Chat, Coding, Analysis, Research, Translation
Strengths
Frontier-class performance with only 3B active parameters (MoE)
Extremely affordable ($0.10/M input) with near-flagship quality