OpenAI Unveils GPT-5 — Native Multimodal with 2M Token Context

OpenAI's latest flagship model features a 2 million token context window, real-time vision, and significantly improved reasoning across code, math, and science.

OpenAI Blog | May 17, 2026

OpenAI has officially launched GPT-5, its most capable model to date, introducing a 2 million token context window, native real-time video and audio understanding, and a step-change improvement in multi-step reasoning tasks. The model is available immediately via the API and on ChatGPT Pro plans.

On the MMLU benchmark, GPT-5 scores 94.2%, surpassing the previous best. More significantly, on real-world agentic evaluations (tasks that require planning, tool use, and multi-step execution), the model performs markedly more reliably and accurately than its predecessors.

Key capabilities

  • 2 million token context window — roughly 1,500 pages of text in a single prompt
  • Native real-time vision: can process live video streams, not just static images
  • Improved code generation: scores 87.4% on SWE-bench Verified
  • Better instruction following on complex, multi-constraint tasks
  • Reduced hallucination rate across factual and mathematical tasks

GPT-5 represents the clearest signal yet that scaling continues to yield meaningful capability improvements. The 2M context window alone unlocks entire categories of enterprise use cases that were not previously feasible.

Pricing starts at $15 per million input tokens and $60 per million output tokens at the standard tier, with a cached input rate of $3.75 per million tokens. OpenAI also announced a new Realtime API tier that supports audio and video interaction at under 300 ms latency.
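
To put the quoted rates in concrete terms, the following is a minimal sketch of a per-request cost estimate. It assumes the $3.75 cached input rate is also charged per million tokens and that cached tokens are billed in place of, not on top of, the standard input rate. The function name `estimate_cost` and the token counts in the usage note are illustrative, not part of any OpenAI tooling.

```python
# Standard-tier rates in USD per one million tokens, as quoted in the article.
INPUT_RATE = 15.00
OUTPUT_RATE = 60.00
# Assumption: the cached input rate uses the same per-million-token unit.
CACHED_INPUT_RATE = 3.75

TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    cached_input_tokens is the portion of input_tokens billed at the
    cached rate; the remainder is billed at the standard input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_input_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # A prompt that fills the 2M-token window, with a 4,000-token reply.
    print(f"uncached: ${estimate_cost(2_000_000, 4_000):.2f}")
    # The same request with 1.5M input tokens billed at the cached rate.
    print(f"cached:   ${estimate_cost(2_000_000, 4_000, 1_500_000):.2f}")
```

Under these assumptions, a prompt that fills the full 2M-token window with a 4,000-token reply comes to about $30.24 uncached, or about $13.37 when 1.5 million of the input tokens are billed at the cached rate.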