300 models ranked for journalism and news. Scores include bonuses for web search (fact-checking), large context windows (source analysis), large output limits (long-form articles), streaming, and reasoning.
| # | Model | Provider | Score |
|---|---|---|---|
| 1 | GPT-5.4 Pro | OpenAI | 94 |
| 2 | GPT-5.4 | OpenAI | 94 |
| 3 | GPT-5.4 Mini | OpenAI | 93 |
| 4 | GPT-5.2 Pro | OpenAI | 93 |
| 5 | GPT-5.2 | OpenAI | 93 |
| 6 | Claude Opus 4.6 | Anthropic | 92 |
| 7 | GPT-5 Pro | OpenAI | 92 |
| 8 | o3 Deep Research | OpenAI | 92 |
| 9 | Claude Opus 4.5 | Anthropic | 90 |
| 10 | GPT-5 | OpenAI | 90 |
| 11 | Claude Sonnet 4.6 | Anthropic | 89 |
| 12 | Claude Sonnet 4.5 | Anthropic | 89 |
| 13 | o3 Pro | OpenAI | 88 |
| 14 | Grok 4.1 Fast | xAI | 87 |
| 15 | o3 | OpenAI | 86 |
| 16 | GPT-5.1 | OpenAI | 85 |
| 17 | GPT-5.4 Nano | OpenAI | 85 |
| 18 | GPT-5.3-Codex | OpenAI | 85 |
| 19 | GPT-5.2-Codex | OpenAI | 85 |
| 20 | GPT-5.1-Codex-Max | OpenAI | 85 |
| 21 | o4 Mini Deep Research | OpenAI | 85 |
| 22 | o4 Mini High | OpenAI | 85 |
| 23 | o4 Mini | OpenAI | 84 |
| 24 | Grok 4 Fast | xAI | 83 |
| 25 | GPT-5.3 Chat | OpenAI | 85 |
| 26 | GPT-5.1 Chat | OpenAI | 85 |
| 27 | Claude Haiku 4.5 | Anthropic | 83 |
| 28 | Grok 4.20 Beta | xAI | 86 |
| 29 | Grok 4 | xAI | 86 |
| 30 | Sonar Pro Search | Perplexity | 85 |
Web search models cross-reference claims against multiple sources in real-time. Reasoning models evaluate conflicting information and identify potential misinformation patterns.
Large context windows process entire court filings, financial reports, and government documents. Web search pulls current context to supplement stored knowledge.
Large output models produce long-form investigative pieces, feature articles, and multi-part series. Streaming delivers real-time drafts for deadline-driven newsrooms.
Models analyze datasets, spot trends, and generate data-driven story angles. JSON mode outputs structured findings that integrate with visualization tools and CMS platforms.
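As a rough illustration of how JSON-mode output might be checked before it reaches a visualization tool or CMS, the sketch below parses a model response and validates each finding. The field names (`headline`, `metric`, `value`, `source`) and the `parse_findings` helper are hypothetical; the page does not specify a schema.

```python
import json

# Hypothetical schema for one finding; not defined anywhere on this page.
REQUIRED_FIELDS = {
    "headline": str,
    "metric": str,
    "value": (int, float),
    "source": str,
}


def parse_findings(raw: str) -> list:
    """Parse a JSON array of findings and check each required field's type."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of findings")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"finding {index} is not a JSON object")
        for field, expected in REQUIRED_FIELDS.items():
            if not isinstance(item.get(field), expected):
                raise ValueError(f"finding {index}: bad or missing '{field}'")
    return data
```

Rejecting malformed output early keeps a bad model response from silently reaching a chart or a published page.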
The top-ranked models for journalism and news appear at the top of this page, ordered by our composite score, which is updated hourly. Rankings consider benchmarks, pricing, capabilities, and community adoption.
Several models listed on this page offer free tiers or are fully open-source. Look for models marked as Free in the pricing column.
We use a composite scoring system combining benchmark performance, capability matching, pricing, context window size, and community adoption. Scores are updated hourly.
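A composite score of this kind could be computed as a weighted sum of normalized signals. The sketch below is only an illustration: the weights, the `ModelSignals` fields, and the 0 to 1 normalization are assumptions, since the page does not publish its actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights (they sum to 1.0); the real weighting is not published.
WEIGHTS = {
    "benchmark": 0.40,
    "capability_match": 0.25,
    "pricing": 0.15,
    "context_window": 0.10,
    "adoption": 0.10,
}


@dataclass
class ModelSignals:
    name: str
    benchmark: float         # every signal is assumed normalized to 0..1
    capability_match: float
    pricing: float           # higher means better value, by assumption
    context_window: float
    adoption: float


def composite_score(model: ModelSignals) -> int:
    """Weighted sum of the signals, scaled to a 0..100 integer score."""
    total = sum(weight * getattr(model, field) for field, weight in WEIGHTS.items())
    return round(100 * total)


def rank(models):
    """Return the models ordered from highest to lowest composite score."""
    return sorted(models, key=composite_score, reverse=True)
```

Because `sorted` is stable, models with equal scores keep their input order, so a deterministic tie-break (for example by name) would need to be added explicitly.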
Rankings refresh every hour using data from benchmarks, API testing, and community metrics, so the scores shown reflect the most recent hourly update.