Grok 4.20 Beta

Grok · Flagship

Performance Tier: Flagship

Grok 4.20 Beta is a flagship model in xAI's Grok family and the most capable in that lineup.

Best-in-class model from this provider. Highest performance across benchmarks, ideal for demanding tasks.

Context Length: Not reported

Max Output Tokens: Not reported

General Intelligence
  MMLU: Not reported
  GPQA Diamond: 88.5%

Mathematics
  MATH-500: Not reported

Programming
  HumanEval: Not reported
  SWE-bench Verified: Not reported

Reasoning
  IFEval: Not reported

Coding, Analysis, Research, General Chat, Creative Writing

4-agent internal architecture reduces hallucinations by 65% (from ~12% to ~4.2%)
Industry-leading 2M token context window for massive document analysis
Native multimodal understanding (text, image, video) with real-time X data access
Top 5 on LMArena leaderboard — competitive with Claude Opus and GPT-5.4
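
The hallucination figure in the list above can be read two ways, so a quick arithmetic check may help. The sketch below assumes the "65%" is a relative reduction computed from the two quoted rates (roughly 12% before, 4.2% after). The helper name `relative_reduction` is hypothetical and used only for illustration; it is not part of any Grok or xAI API.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Rates quoted in the feature list, in percent (assumed to be comparable measurements).
before_rate = 12.0
after_rate = 4.2

reduction = relative_reduction(before_rate, after_rate)
print(f"relative reduction: {reduction:.0%}")
print(f"absolute drop: {before_rate - after_rate:.1f} percentage points")
```

Under that assumption the two quoted rates are consistent with the headline number: the drop is 7.8 percentage points in absolute terms, which is 65% of the original 12% rate.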

This model may use your data for training
