The definitive ranking of the top AI models in 2026. Our composite scoring system evaluates 343+ models across performance benchmarks, pricing, context window, capabilities, and recency. Rankings update hourly with live data.
Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...
Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode
Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on...
Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...
GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...
Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...
GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922K input, 128K...
GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for...
Our top picks across different use cases and requirements for 2026.
| # | Model | Provider | Score |
|---|---|---|---|
| 1 | Claude Fable 5 | Anthropic | 97 |
| 2 | Claude Opus 4.7 (Fast) | Anthropic | 95 |
| 3 | Claude Opus 4.7 | Anthropic | 95 |
| 4 | Claude Opus 4.8 (Fast) | Anthropic | 94 |
| 5 | Claude Opus 4.8 | Anthropic | 94 |
| 6 | GPT-5.5 | OpenAI | 92 |
| 7 | Gemini 3.1 Pro Preview Custom Tools | Google | 92 |
| 8 | Gemini 3.1 Pro Preview | Google | 92 |
| 9 | GPT-5.4 Pro | OpenAI | 92 |
| 10 | GPT-5.4 | OpenAI | 92 |
| 11 | GPT-5.5 Pro | OpenAI | 90 |
| 12 | GPT-5.2-Codex | OpenAI | 90 |
| 13 | GPT-5.2 Pro | OpenAI | 90 |
| 14 | GPT-5.2 | OpenAI | 90 |
| 15 | Claude Opus 4.6 (Fast) | Anthropic | 90 |
| 16 | Claude Opus 4.6 | Anthropic | 90 |
| 17 | Grok 4.20 | xAI | 88 |
| 18 | GPT-5.3-Codex | OpenAI | 88 |
| 19 | GPT-5 Pro | OpenAI | 88 |
| 20 | GPT-5 Codex | OpenAI | 88 |
| 21 | GPT-5 | OpenAI | 88 |
| 22 | Gemini 3 Flash Preview | Google | 88 |
| 23 | Grok 4.20 Multi-Agent | xAI | 87 |
| 24 | GPT-5.1-Codex-Max | OpenAI | 87 |
| 25 | GPT-5.1 | OpenAI | 87 |
| 26 | GPT-5.1-Codex | OpenAI | 87 |
| 27 | GPT-5.1-Codex-Mini | OpenAI | 87 |
| 28 | GPT-5.3 Chat | OpenAI | 87 |
| 29 | o3 Deep Research | OpenAI | 86 |
| 30 | o3 Pro | OpenAI | 86 |
119 models have been released in 2026 so far. Here are the latest arrivals.
| Model | Provider | Score |
|---|---|---|
| Fugu Ultra | sakana | — |
| Nano Banana 2 (Gemini 3.1 Flash Image) | Google | — |
| Nano Banana Pro (Gemini 3 Pro Image) | Google | — |
| North Mini Code (free) | Cohere | — |
| GLM 5.2 | Zhipu AI | — |
| Kimi K2.7 Code | Moonshot AI | — |
| Claude Fable Latest | ~anthropic | — |
| Claude Fable 5 | Anthropic | 97 |
| Nemotron 3.5 Content Safety (free) | NVIDIA | — |
| Nemotron 3 Ultra (free) | NVIDIA | — |
| Nemotron 3 Ultra | NVIDIA | — |
| Qwen3.7 Plus | Alibaba | — |
| MiniMax M3 | MiniMax | — |
| Step 3.7 Flash | StepFun | — |
| Claude Opus 4.8 (Fast) | Anthropic | 94 |
| Claude Opus 4.8 | Anthropic | 94 |
| Qwen3.7 Max | Alibaba | — |
| Grok Build 0.1 | xAI | — |
| Gemini 3.5 Flash | Google | — |
| Claude Opus 4.7 (Fast) | Anthropic | 95 |
Every model receives a score from 0 to 100, driven primarily by benchmark performance (90%) from MMLU, GPQA, HumanEval, SWE-bench, and 15+ standardized evaluations. Capabilities and context window serve as tiebreakers (10%).
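As a rough illustration of the 90/10 split described above, here is a minimal sketch in Python. Only the two weights come from this page. The `ModelStats` fields, the plain average over benchmark scores, and the equal weighting of the two tiebreakers are assumptions made for the example, not the leaderboard's actual formula.

```python
from dataclasses import dataclass

BENCHMARK_WEIGHT = 0.90  # stated on this page
TIEBREAK_WEIGHT = 0.10   # stated on this page


@dataclass
class ModelStats:
    """Hypothetical input record; field names are illustrative only."""
    name: str
    benchmark_scores: list[float]  # assumed already normalized to 0-100
    capability_fraction: float     # assumed: share of tracked capabilities, 0-1
    context_fraction: float        # assumed: context size relative to the largest, 0-1


def composite_score(m: ModelStats) -> float:
    """Blend benchmarks (90%) with capability/context tiebreakers (10%)."""
    if not m.benchmark_scores:
        raise ValueError("at least one benchmark score is required")
    # Assumption: benchmarks are combined with a simple unweighted mean.
    benchmark = sum(m.benchmark_scores) / len(m.benchmark_scores)
    # Assumption: the two tiebreakers count equally.
    tiebreak = 100 * (m.capability_fraction + m.context_fraction) / 2
    score = BENCHMARK_WEIGHT * benchmark + TIEBREAK_WEIGHT * tiebreak
    # Clamp to the 0-100 range used by the leaderboard.
    return max(0.0, min(100.0, score))


example = ModelStats("hypothetical-model", [90.0, 80.0], 1.0, 0.5)
print(round(composite_score(example), 1))
```

With these weights, a 10-point swing in the tiebreaker term moves the final score by only 1 point, which is why two models with similar benchmark results can end up separated by capabilities or context window alone.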
Rankings update hourly from live API data. We track pricing changes, new model releases, and capability updates across all major providers. No stale benchmarks or manual curation.
We evaluate 7 core capabilities: vision, function calling, streaming, JSON mode, reasoning, web search, and image output. Models that support more capabilities score higher on versatility.
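One simple way to turn the seven capabilities into a versatility number is to count how many a model supports. The sketch below assumes that approach; the identifier spellings and the `versatility` function are hypothetical, and the page does not say whether all capabilities are weighted equally.

```python
# The seven capabilities named on this page; the snake_case keys are illustrative.
CORE_CAPABILITIES = (
    "vision",
    "function_calling",
    "streaming",
    "json_mode",
    "reasoning",
    "web_search",
    "image_output",
)


def versatility(supported: set[str]) -> float:
    """Return the fraction of core capabilities a model supports (0.0 to 1.0)."""
    unknown = supported - set(CORE_CAPABILITIES)
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    # Assumption: every capability counts equally.
    return len(supported) / len(CORE_CAPABILITIES)


print(versatility({"vision", "streaming", "reasoning"}))
```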
Price is not the only factor. We balance cost against capability to surface the best value at every price point, from free open-source models to premium frontier models.
Which AI providers dominate the top 30 in 2026.
| Provider | In Top 30 |
|---|---|
| OpenAI | 18 |
| Anthropic | 7 |
| Google | 3 |
| xAI | 2 |
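The provider counts can be reproduced by tallying the provider of each entry in the top-30 table. A minimal sketch, with the provider list transcribed by hand from that table (the variable names are illustrative):

```python
from collections import Counter

# Provider of each top-30 entry, in rank order, transcribed from the table.
TOP_30_PROVIDERS = (
    ["Anthropic"] * 5      # ranks 1-5
    + ["OpenAI"]           # rank 6
    + ["Google"] * 2       # ranks 7-8
    + ["OpenAI"] * 6       # ranks 9-14
    + ["Anthropic"] * 2    # ranks 15-16
    + ["xAI"]              # rank 17
    + ["OpenAI"] * 4       # ranks 18-21
    + ["Google"]           # rank 22
    + ["xAI"]              # rank 23
    + ["OpenAI"] * 7       # ranks 24-30
)

provider_counts = Counter(TOP_30_PROVIDERS)

# Print providers from most to least represented.
for provider, count in provider_counts.most_common():
    print(f"{provider}: {count}")
```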
Dive deeper into specific categories, compare models head-to-head, or find the right model for your use case.
The best AI model depends on your use case. For coding, models with strong SWE-bench scores lead. For general reasoning, high Arena Elo models excel. For budget-friendly options, open-source models offer excellent performance at no cost. Our leaderboard ranks all 290+ models across multiple dimensions.
We use a composite scoring system that weighs benchmark performance (90%) from MMLU, GPQA, HumanEval, SWE-bench, and 15+ standardized evaluations, with capabilities and context window as tiebreakers (10%). Benchmark results therefore drive the ranking, while capabilities and context window separate models whose benchmark results are close.
Check our coding leaderboard for the latest rankings. Top coding models are evaluated on SWE-bench, HumanEval, and real-world coding tasks. The ranking updates hourly as new models are released and benchmarks are refreshed.