MiniMax M3

MiniMax, Flagship

Thinking, Tool Use, Vision, Structured Output

About this model

MiniMax-M3 is a multimodal foundation model from MiniMax. It accepts text, image, and video inputs, produces text output, offers a 1M-token context window, and is suited for long-horizon agentic work, coding,...

Performance Tier

Flagship

MiniMax M3 is a flagship model from MiniMax: the most capable in their lineup.

Best-in-class model from this provider. Highest performance across benchmarks, ideal for demanding tasks.

Pricing

This model is included in Elosia plans

Affordable

Low cost. Suitable for sustained use and high-volume interactions.

Type	Price per 1M tokens (USD)
Input (prompt)	$0.30
Output (completion)	$1.20
Cache read	$0.06
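
To make the per-million rates above concrete, here is a minimal Python sketch that estimates the cost of a single request. The function name `estimate_cost` and its parameters are hypothetical, chosen for illustration only. It assumes that cached prompt tokens are billed at the cache-read rate instead of the input rate, which this page does not spell out, and it applies the flat listed rates only: the higher tier for contexts above 512K mentioned under Limitations is not modeled because its rate is not given here.

```python
# Rates in USD per 1M tokens, copied from the pricing table above.
RATES_PER_MILLION = {
    "input": 0.30,
    "output": 1.20,
    "cache_read": 0.06,
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request at the flat listed rates.

    Assumption (not confirmed by the page): cached_tokens is the part of the
    prompt served from cache, billed at the cache-read rate rather than the
    input rate. Long-context tier pricing is ignored.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * RATES_PER_MILLION["input"]
        + cached_tokens * RATES_PER_MILLION["cache_read"]
        + output_tokens * RATES_PER_MILLION["output"]
    )
    return total / 1_000_000


# Example: a 200K-token prompt with 150K tokens read from cache and a 4K-token reply.
print(f"${estimate_cost(200_000, 4_000, cached_tokens=150_000):.4f}")  # prints $0.0288
```

In this example the output tokens account for only about a sixth of the total, so for long, mostly cached prompts the cache-read rate dominates the bill.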

Capabilities

Context Length: 524K

Max Output Tokens: 512K

Tokenizer: Other

Input: text, image, video

Output: text

Release Date: May 31, 2026

Benchmarks

General Intelligence

MMLU

Not reported

GPQA Diamond

Not reported

Mathematics

MATH-500

Not reported

Programming

HumanEval

Not reported

SWE-bench Verified

Not reported

Reasoning

IFEval

Not reported

Agentic

SWE-bench Pro

59%

Recommended Use Cases

Coding, Analysis, Research, Data Extraction

Strengths

MoE architecture (~428B total, 23B active) — strong open-weight agentic coding (SWE-bench Pro 59.0), ahead of GPT-5.5 and Gemini 3.1 Pro
1M-token context made efficient by MiniMax Sparse Attention (MSA) — ~9× prefill and ~15× decode speedup over the prior generation at 1M tokens
Native multimodal input including video — text, image and video from the first training step, plus computer-use workflows
Cost-efficient at $0.30/M input and $1.20/M output — a fraction of comparable frontier models for agentic coding
Open-weight with a full technical report (arXiv 2606.13392) — downloadable and self-hostable

Limitations

Headline benchmarks are self-reported by MiniMax; independent verification remains limited
SWE-bench Pro 59.0 trails the leading closed model (Claude Opus 4.8, ~69) on pure code modification
~428B MoE with a custom MSA operator makes self-hosting heavyweight; contexts above 512K bill at a higher tier
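
As a rough illustration of why self-hosting is heavyweight, the following Python sketch estimates the memory needed just to hold the weights, using the approximate 428B total parameter count quoted above. The weight formats listed are assumptions for illustration; this page does not say which formats the released weights use. The figures exclude the KV cache, activations, and runtime overhead, so real requirements would be higher.

```python
TOTAL_PARAMS = 428e9  # approximate total parameter count stated on this page

# Bytes per parameter for common weight formats. These options are assumed
# for illustration; the page does not state the released weight format.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


for fmt, size in BYTES_PER_PARAM.items():
    print(f"{fmt}: ~{weight_memory_gb(TOTAL_PARAMS, size):.0f} GB")
# prints:
# bf16: ~856 GB
# int8: ~428 GB
# int4: ~214 GB
```

Although only about 23B parameters are active per token, all experts of an MoE model generally need to be resident in memory, which is why the total count rather than the active count drives this estimate.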

Resources

Official Documentation, Hugging Face

This model may use your data for training

Similar Models

Minimax

Claude

Claude

DeepSeek