300 models ranked for government and public-sector use. Scores include bonuses for reasoning (policy analysis), large context windows (regulations), JSON mode (structured data), function calling, and open-source availability (data sovereignty).
| # | Model | Provider | Score |
|---|---|---|---|
| 1 | GPT-5.4 Pro | OpenAI | 94 |
| 2 | GPT-5.4 | OpenAI | 94 |
| 3 | GPT-5.4 Mini | OpenAI | 93 |
| 4 | GPT-5.2 Pro | OpenAI | 93 |
| 5 | GPT-5.2 | OpenAI | 93 |
| 6 | Claude Opus 4.6 | Anthropic | 92 |
| 7 | GPT-5 Pro | OpenAI | 92 |
| 8 | o3 Deep Research | OpenAI | 92 |
| 9 | Claude Opus 4.5 | Anthropic | 90 |
| 10 | GPT-5 | OpenAI | 90 |
| 11 | Gemini 3 Flash Preview | Google | 89 |
| 12 | Claude Sonnet 4.6 | Anthropic | 89 |
| 13 | Claude Sonnet 4.5 | Anthropic | 89 |
| 14 | o3 Pro | OpenAI | 88 |
| 15 | Qwen3.5-9B | Alibaba | 85 |
| 16 | Kimi K2.5 | Moonshot AI | 85 |
| 17 | Qwen3 VL 8B Thinking | Alibaba | 85 |
| 18 | Qwen3 VL 30B A3B Thinking | Alibaba | 85 |
| 19 | Grok 4.1 Fast | xAI | 87 |
| 20 | Nemotron 3 Super (free) | NVIDIA | 84 |
| 21 | Grok 4.20 Beta | xAI | 86 |
| 22 | Grok 4 | xAI | 86 |
| 23 | Gemini 3.1 Pro Preview | Google | 86 |
| 24 | o3 | OpenAI | 86 |
| 25 | MiniMax M2.5 (free) | MiniMax | 83 |
| 26 | GPT-5.1 | OpenAI | 85 |
| 27 | MiMo-V2-Omni | Xiaomi | 85 |
| 28 | MiMo-V2-Pro | Xiaomi | 85 |
| 29 | MiniMax M2.7 | MiniMax | 83 |
| 30 | GPT-5.4 Nano | OpenAI | 85 |
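The bonus-based scoring described at the top of this page could be sketched roughly as follows. This is only an illustration: the bonus values, the 128,000-token threshold for "large context", the `ModelProfile` fields, and the cap at 100 are all assumptions, since the page does not publish its actual formula.

```python
from dataclasses import dataclass

# Hypothetical bonus points per capability; the real values are not published.
BONUS_POINTS = {
    "reasoning": 3,
    "large_context": 2,
    "json_mode": 2,
    "function_calling": 2,
    "open_source": 3,
}
LARGE_CONTEXT_TOKENS = 128_000  # assumed cutoff for "large context"


@dataclass(frozen=True)
class ModelProfile:
    """Illustrative record of the capabilities the page says earn a bonus."""
    name: str
    base_score: int  # assumed baseline score on a 0-100 scale
    context_tokens: int = 0
    reasoning: bool = False
    json_mode: bool = False
    function_calling: bool = False
    open_source: bool = False


def sector_score(model: ModelProfile) -> int:
    """Add a bonus for each capability the model has, capped at 100."""
    earned = {
        "reasoning": model.reasoning,
        "large_context": model.context_tokens >= LARGE_CONTEXT_TOKENS,
        "json_mode": model.json_mode,
        "function_calling": model.function_calling,
        "open_source": model.open_source,
    }
    bonus = sum(BONUS_POINTS[key] for key, has_it in earned.items() if has_it)
    return min(100, model.base_score + bonus)
```

Under these assumed values, a model with a base score of 80, a 200,000-token context window, reasoning support, and open weights would receive 80 + 3 + 2 + 3 = 88.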
Analyze legislation, evaluate policy impacts, and draft regulatory documents. Reasoning models assess complex policy trade-offs and compliance implications.
Automate form processing, answer public inquiries, and streamline applications. JSON mode produces structured data for government information systems.
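As a rough illustration of why structured output matters for form processing, the sketch below parses a model's JSON response and checks it before it would be passed to a downstream system. The field names (`applicant_name`, `permit_type`, `fee_paid`) and the `parse_form_response` helper are hypothetical and not tied to any particular provider's API.

```python
import json

# Hypothetical schema for a permit application: field name -> expected type.
REQUIRED_FIELDS = {
    "applicant_name": str,
    "permit_type": str,
    "fee_paid": bool,
}


def parse_form_response(raw: str) -> dict:
    """Parse a model's JSON output and verify the required fields and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return data
```

Even when a model offers a JSON mode, validating the response on the receiving side is a reasonable precaution, because JSON mode typically guarantees syntax rather than the presence of specific fields.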
Extract data from permits, licenses, and legal documents. Large context windows let a model take in a full regulatory framework at once for comprehensive document analysis.
Open-source models enable on-premise deployment for sensitive government data. Self-hosted options help meet data residency requirements.
Based on our composite scoring updated hourly, the top-ranked models are shown at the top of this page. Rankings consider benchmarks, pricing, capabilities, and community adoption.
Yes, several models listed on this page offer free tiers or are fully open-source. Look for models marked "(free)" in the table above.
We use a composite scoring system combining benchmark performance, capability matching, pricing, context window size, and community adoption. Scores are updated hourly.
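A composite of this kind is often computed as a weighted average of normalized components. The sketch below shows one way that might look; the weights and the assumption that each component is pre-normalized to the range 0 to 1 are illustrative only, as the page does not disclose its actual weighting.

```python
# Hypothetical weights for the five factors named above; they sum to 1.0.
WEIGHTS = {
    "benchmarks": 0.40,
    "capability_match": 0.25,
    "pricing": 0.15,
    "context_window": 0.10,
    "community_adoption": 0.10,
}


def composite_score(components: dict[str, float]) -> float:
    """Combine components (each assumed normalized to 0..1) into a 0-100 score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")

    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"component {name!r} must be between 0 and 1")

    weighted = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return round(100 * weighted, 1)
```

With these assumed weights, a model that is perfect on benchmarks but scores zero everywhere else would land at 40.0, which shows how much a single factor can contribute.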
Rankings refresh every hour using data from benchmarks, API testing, and community metrics, so the figures shown reflect the most recent hourly update.