Warp is an AI-powered terminal with built-in command suggestions and natural language to shell translation. Fast, cheap models with strong instruction following work best.
Best Models for Warp
Top 50 by tool-optimized score
Scored by: benchmark performance (90%), drawn from MMLU, GPQA, HumanEval, SWE-bench, and 15+ other standardized evaluations, with capabilities and context as tiebreakers (10%).
| # | Model | Headline benchmark | Score | Output $/M tokens |
|---|---|---|---|---|
| 1 | DeepSeek V4 Pro | Arena Elo: 1463 | 89 | $0.870 |
| 2 | Grok 4.1 Fast | Arena Elo: 1467 | 89 | $0.500 |
| 3 | Gemma 4 31B | Arena Elo: 1451 | 88 | $0.380 |
| 4 | GLM 5.1 | Arena Elo: 1471 | 87 | $3.50 |
| 5 | Gemma 4 26B A4B | Arena Elo: 1438 | 87 | $0.330 |
| 6 | DeepSeek V4 Flash | Arena Elo: 1433 | 86 | $0.280 |
| 7 | MiMo-V2.5-Pro | Arena Elo: 1464 | 86 | $3.00 |
| 8 | Kimi K2.6 | Arena Elo: 1462 | 86 | $3.50 |
| 9 | Gemini 3.1 Pro Preview | Arena Elo: 1494 | 86 | $12.00 |
| 10 | DeepSeek V3.2 | Arena Elo: 1424 | 86 | $0.378 |
| 11 | DeepSeek V3.2 Exp | Arena Elo: 1423 | 86 | $0.410 |
| 12 | Llama 3.3 70B Instruct | HumanEval: 88.4% | 86 | $0.320 |
| 13 | GPT-4o-mini | HumanEval: 87.2% | 86 | $0.600 |
| 14 | Grok 4.3 | Arena Elo: 1455 | 85 | $2.50 |
| 15 | Hy3 preview | Arena Elo: 1418 | 85 | $0.260 |
| 16 | Gemma 4 31B (free) | | 85 | Free |
| 17 | Qwen3.6 Plus | Arena Elo: 1448 | 85 | $1.95 |
| 18 | MiMo-V2-Pro | Arena Elo: 1447 | 85 | $3.00 |
| 19 | Qwen3.5 397B A17B | Arena Elo: 1446 | 85 | $2.34 |
| 20 | GLM 5 | Arena Elo: 1457 | 85 | $1.92 |
| 21 | Qwen3 VL 235B A22B Instruct | Arena Elo: 1415 | 85 | $0.880 |
| 22 | DeepSeek V3.1 Terminus | Arena Elo: 1416 | 85 | $0.950 |
| 23 | Grok 4 Fast | Arena Elo: 1421 | 85 | $0.500 |
| 24 | DeepSeek V3.1 | Arena Elo: 1418 | 85 | $0.750 |
| 25 | Gemini 3.1 Flash Lite Preview | Arena Elo: 1438 | 84 | $1.50 |
| 26 | Qwen3.5-Flash | Arena Elo: 1398 | 84 | $0.260 |
| 27 | MiniMax M2.5 (free) | | 84 | Free |
| 28 | GLM 4.7 | Arena Elo: 1443 | 84 | $1.75 |
| 29 | GPT-5.2 Chat | Arena Elo: 1477 | 84 | $14.00 |
| 30 | DeepSeek V3 0324 | HumanEval: 84.5% | 84 | $0.770 |
| 31 | MiMo-V2.5 | Arena Elo: 1423 | 83 | $2.00 |
| 32 | Grok 4.20 | | 83 | $2.50 |
| 33 | Step 3.5 Flash | Arena Elo: 1393 | 83 | $0.300 |
| 34 | GPT-5.1-Codex-Mini | | 83 | $2.00 |
| 35 | GLM 4.6 | Arena Elo: 1426 | 83 | $1.90 |
| 36 | Claude 3.5 Haiku | HumanEval: 88.1% | 83 | $4.00 |
| 37 | Qwen3.6 Max Preview | Arena Elo: 1457 | 82 | $6.24 |
| 38 | Claude Opus 4.7 | Arena Elo: 1491 | 82 | $25.00 |
| 39 | Gemma 4 26B A4B (free) | | 82 | Free |
| 40 | Trinity Large Thinking | Arena Elo: 1380 | 82 | $0.850 |
| 41 | Qwen3.5-122B-A10B | Arena Elo: 1418 | 82 | $2.08 |
| 42 | Trinity Large Preview | Arena Elo: 1375 | 82 | $0.450 |
| 43 | GLM 4.6V | Arena Elo: 1378 | 82 | $0.900 |
| 44 | Gemini 2.5 Flash Lite Preview 09-2025 | | 82 | $0.400 |
| 45 | GLM 4.5 | Arena Elo: 1411 | 82 | $2.20 |
| 46 | Gemini 2.5 Flash Lite | | 82 | $0.400 |
| 47 | Llama 3.1 70B Instruct | HumanEval: 80.5% | 82 | $0.400 |
| 48 | Mistral Large | HumanEval: 92% | 82 | $6.00 |
| 49 | GPT-5.5 | Arena Elo: 1475 | 81 | $30.00 |
| 50 | MiniMax M2.7 | Arena Elo: 1407 | 81 | $1.20 |
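
Because the page favors fast, cheap models, one way to read the table is to keep every model within a few points of the top score and then sort by output price. The sketch below does that for the first six rows. The `margin` parameter and the `cheapest_within` helper are illustrative assumptions, not part of how the rankings are built.

```python
# First six rows of the table: (model, score, output price in $ per million tokens).
ROWS = [
    ("DeepSeek V4 Pro", 89, 0.870),
    ("Grok 4.1 Fast", 89, 0.500),
    ("Gemma 4 31B", 88, 0.380),
    ("GLM 5.1", 87, 3.50),
    ("Gemma 4 26B A4B", 87, 0.330),
    ("DeepSeek V4 Flash", 86, 0.280),
]


def cheapest_within(rows, margin):
    """Keep rows scoring within `margin` points of the best score, cheapest first.

    `margin` is a hypothetical tolerance chosen by the reader.
    """
    best = max(score for _, score, _ in rows)
    close = [row for row in rows if row[1] >= best - margin]
    # Sort by price; Python's sort is stable, so equal prices keep table order.
    return sorted(close, key=lambda row: row[2])


if __name__ == "__main__":
    for name, score, price in cheapest_within(ROWS, margin=2):
        print(f"{name}: score {score}, ${price:.3f}/M output tokens")
```

A wider margin trades score for price; a margin of 0 keeps only the models tied for first place.
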
Based on our analysis of coding benchmarks, capability matching, and pricing, DeepSeek V4 Pro currently ranks #1 for Warp. Rankings are rebuilt as benchmark, pricing, and provider data refresh.
We score models using benchmark performance (90%) from LMArena, HumanEval, SWE-bench, MMLU, and 15+ standardized evaluations. Capabilities and context serve as tiebreakers (10%). Only models with the capabilities Warp needs are included in the tool-specific rankings.
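
The exact formula is not published here, so the following is only a minimal sketch of how a 90/10 weighting could be combined, assuming both inputs are already normalized to a 0-100 scale and blended linearly. The `ModelSignals` type, its field names, and the sample models are hypothetical.

```python
from dataclasses import dataclass

BENCHMARK_WEIGHT = 0.90  # stated on the page: benchmark performance (90%)
TIEBREAK_WEIGHT = 0.10   # stated on the page: capabilities and context (10%)


@dataclass
class ModelSignals:
    name: str
    benchmark: float  # assumed: aggregated benchmark result, normalized to 0-100
    tiebreak: float   # assumed: capabilities/context signal, normalized to 0-100


def tool_score(m: ModelSignals) -> float:
    """Linear blend of the two signals (an assumption, not the published method)."""
    return BENCHMARK_WEIGHT * m.benchmark + TIEBREAK_WEIGHT * m.tiebreak


def rank(models):
    """Order models from highest to lowest blended score."""
    return sorted(models, key=tool_score, reverse=True)


if __name__ == "__main__":
    sample = [
        ModelSignals("model-b", benchmark=90.0, tiebreak=60.0),
        ModelSignals("model-a", benchmark=90.0, tiebreak=80.0),
    ]
    for m in rank(sample):
        print(m.name, round(tool_score(m), 1))
```

With equal benchmark results, the smaller weight only separates otherwise tied models, which matches the tiebreaker role described above.
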
We currently track 341 AI models compatible with Warp. This includes models from OpenAI, Anthropic, Google, DeepSeek, and other providers accessible via API.
Many open-source models are compatible with Warp through API providers like OpenRouter, Together AI, and Groq. Check our rankings to see which open-source models perform best.
Rankings refresh whenever the underlying benchmark, pricing, and catalog sources refresh. That means some signals update faster than others, and the page reflects the latest verified source data available.