GPT-5.5

GPT, Flagship

Thinking, Tool Use, Vision, Structured Output

About this model

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...

Performance Tier

Flagship

GPT-5.5 is a flagship model from OpenAI's GPT family, the most capable in its lineup.

Best-in-class model from this provider. Highest performance across benchmarks, ideal for demanding tasks.

Pricing

This model is included in Elosia plans

Premium

Highest cost level. A long conversation can quickly consume your monthly cap.

Type	per 1M tokens
Input (prompt)	$5.00
Output (completion)	$30.00
Cache read	$0.50
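
To make the per-1M-token rates concrete, here is a minimal sketch of a per-request cost estimate. The function name `estimate_cost` and the billing rule are assumptions for illustration: it treats cached tokens as the part of the prompt billed at the cache-read rate instead of the input rate, which may not match how the provider actually meters usage.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens, taken from the pricing table.
RATES_PER_MILLION = {
    "input": Decimal("5.00"),
    "output": Decimal("30.00"),
    "cache_read": Decimal("0.50"),
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the portion of input_tokens served from
    cache, billed at the cache-read rate instead of the input rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * RATES_PER_MILLION["input"]
        + cached_tokens * RATES_PER_MILLION["cache_read"]
        + output_tokens * RATES_PER_MILLION["output"]
    )
    return total / Decimal(1_000_000)


# A 200,000-token prompt with 150,000 tokens cached and a 4,000-token reply.
print(estimate_cost(200_000, 4_000, cached_tokens=150_000))
```

Under these assumptions, output tokens cost six times as much as uncached input tokens, so long replies dominate the bill faster than long prompts do.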

Capabilities

Context Length	1.1M
Max Output Tokens	128K
Tokenizer	GPT
Input	file, image, text
Output	text
Release Date	April 24, 2026
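
As a rough illustration of how the context length and output cap interact, the sketch below computes how many output tokens a prompt of a given size leaves room for. It is not the provider's documented behavior: the helper `output_budget` is hypothetical, it assumes prompt and output tokens share a single context window, it reads "128K" as 128,000, and it uses the 1.05M figure quoted under Strengths on this page rather than the rounded 1.1M.

```python
# Assumption: the 1.05M-token figure from the Strengths list (the
# Capabilities entry rounds this to 1.1M).
CONTEXT_WINDOW = 1_050_000
# Assumption: "128K" is read as 128,000 tokens.
MAX_OUTPUT_TOKENS = 128_000


def output_budget(prompt_tokens: int) -> int:
    """Return the output tokens still available for a prompt (hypothetical helper).

    Assumes prompt and output tokens are counted against one shared window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The reply is capped by both the output limit and the space left.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


for size in (100_000, 1_000_000, 2_000_000):
    print(size, output_budget(size))
```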

Benchmarks

General Intelligence

MMLU	Not reported
GPQA Diamond	93.6%

Mathematics

MATH-500	Not reported

Programming

HumanEval	Not reported
SWE-bench Verified	Not reported

Reasoning

IFEval	Not reported
ARC-AGI-2	85%
Humanity's Last Exam	41.4%

Multimodal

MMMU-Pro	81.2%

Agentic

SWE-bench Pro	58.6%
Terminal-Bench 2.0	82.7%

Recommended Use Cases

Coding, Analysis, Research, General Chat, Data Extraction

Strengths

1.05M-token context window with leading long-context retrieval (MRCR 1M ≈ 74)
Frontier agentic capabilities — leader on Terminal-Bench 2.0 (82.7) and OSWorld-Verified (78.7)
Top abstract-reasoning score on ARC-AGI-2 (85.0), ahead of Opus 4.7 and Gemini 3.1 Pro
Natively unified multimodal architecture (text, image, audio, video)

Limitations

Premium pricing ($5/M input, $30/M output)
Trails Claude Opus 4.7 on SWE-bench Pro (58.6 vs 64.3) for in-codebase resolution
OpenAI dropped legacy benchmarks (MMLU, MATH-500, HumanEval) — direct comparison with older models is harder

Resources

Official Documentation

This model may use your data for training

Similar Models

Claude

DeepSeek
