The top AI models for every use case, ranked by our composite scoring system. Covering 367+ models across 55+ providers. Data refreshed hourly from live benchmarks, pricing, and capabilities.
GPT-5.4 Pro by OpenAI - 1.1M context, $180.00/1M output
Models with reasoning capabilities, function calling, and top benchmark scores for code generation.
High-quality language models with streaming support, large context windows, and strong generation quality.
Models with dedicated reasoning capabilities for complex problem-solving and logical tasks.
The cheapest models that still deliver strong quality. Maximum performance per dollar.
Top-performing open-weight models you can self-host, fine-tune, and deploy without vendor lock-in.
Dedicated image generation models for creating visuals, art, and design assets from text prompts.
The highest-scoring models across all categories, ranked by composite score.
Our scoring system is benchmark-driven: benchmark performance contributes 90% of the score, drawn from Arena Elo ratings, MMLU, GPQA, HumanEval, SWE-bench, and 15+ standardized evaluations, with capabilities (5%) and context window (5%) serving as tiebreakers. Scores range from 0 to 100.
Benchmark scores are aggregated from multiple independent sources including head-to-head Arena evaluations, academic leaderboards, and curated official results. Models without benchmark data are capped at a score of 40, ensuring empirically evaluated models always rank higher.
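The weighting and the cap described above can be sketched in a few lines. This is an illustrative reconstruction, not the site's actual implementation: the function name and the assumption that each component is normalized to 0-100 are our own.

```python
def composite_score(benchmark, capabilities, context, has_benchmark_data=True):
    """Sketch of the described 90/5/5 weighting.

    Assumes each component is already normalized to 0-100;
    signature and normalization are hypothetical, not the site's code.
    """
    score = 0.90 * benchmark + 0.05 * capabilities + 0.05 * context
    if not has_benchmark_data:
        # Models without benchmark data are capped at 40, so empirically
        # evaluated models always rank higher.
        score = min(score, 40.0)
    return round(score, 1)

# A model with strong benchmarks dominates the composite:
composite_score(95, 80, 70)
```

Because benchmarks carry 90% of the weight, even a maximal capabilities and context score can only move a model by 10 points, which is why they act as tiebreakers rather than drivers.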
Data is aggregated from multiple live API sources, covering 367+ models from 55+ providers. Scores refresh hourly so rankings always reflect the latest model releases and pricing changes.
We rank 367+ models using a composite scoring system that weighs benchmarks (90%), capabilities (5%), and context window (5%). Each use case category - coding, writing, math, image generation - gets separate rankings so the top model varies by task. Currently GPT-5.4 Pro by OpenAI leads the overall rankings with a score of 92.
GPT-5.4 Pro by OpenAI leads our coding rankings with a score of 92. It excels at code generation thanks to its reasoning capabilities and 1.1M context window. Other strong coding models include GPT-5.4 and GPT-5.2 Pro.
Qwen3 235B A22B Instruct 2507 by Alibaba offers excellent value at just $0.100 per million output tokens while maintaining a quality score of 65. For truly free options, several models from providers like Google and Meta are available at zero cost.
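One simple way to reason about value claims like the one above is quality points per dollar of output tokens. This metric is our own illustration, not the site's formula; the figures plugged in come from the text.

```python
def value_per_dollar(quality_score, price_per_m_output):
    """Quality points per $1 of output tokens (illustrative metric only)."""
    return round(quality_score / price_per_m_output, 2)

# Figures from the text: Qwen3 235B at $0.100/1M output with quality 65,
# versus GPT-5.4 Pro at $180.00/1M output with a score of 92.
qwen_value = value_per_dollar(65, 0.100)
gpt_value = value_per_dollar(92, 180.00)
```

On this crude metric the cheaper model delivers orders of magnitude more quality per dollar, which is the trade-off the cost-efficiency category is built around.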
Open source models have closed the gap significantly. Gemma 4 31B (free) scores 81, competitive with many proprietary options. Models from DeepSeek, Meta (Llama), and Alibaba (Qwen) now rival GPT-4o and Claude on many benchmarks. The main advantages of open source are self-hosting flexibility, fine-tuning, no vendor lock-in, and often lower API costs. Proprietary models like GPT-4o and Claude still lead on some enterprise features and ecosystem integrations.
Dive deeper into rankings, compare models head-to-head, or filter by price, category, and capabilities.