Compare context windows, output capacity, and cost efficiency across 343+ models. Data sourced live from upstream provider APIs.
Note: This page shows capacity specs and pricing aggregated from upstream provider APIs. Real-world latency and tokens-per-second vary by load, prompt length, and provider infrastructure. For speed benchmarks, see Artificial Analysis or the model provider's own documentation.
Largest Context: 10M, Llama 4 Scout
Largest Output: 1.0M, MiniMax-01
Cheapest (non-free): $0.01 per 1M input tokens, Ling-2.6-flash
Best Output Ratio: 100%, GPT-3.5 Turbo (older v0613)
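
The Output Ratio figures on this page are consistent with dividing a model's maximum output by its context window and rounding to a whole percent. The sketch below illustrates that reading. The function name `output_ratio` and the exact token counts (for example, treating 128K as 128,000) are assumptions for illustration, not the page's actual computation.

```python
def output_ratio(max_output_tokens: int, context_window_tokens: int) -> int:
    """Return max output as a whole-number percentage of the context window."""
    if context_window_tokens <= 0:
        raise ValueError("context window must be positive")
    return round(100 * max_output_tokens / context_window_tokens)


# Token counts are assumed expansions of the rounded labels in the table.
print(output_ratio(128_000, 400_000))      # GPT-5: 128K of 400K
print(output_ratio(1_000_000, 1_000_000))  # MiniMax-01: 1.0M of 1.0M
print(output_ratio(16_000, 10_000_000))    # Llama 4 Scout: 16K of 10M
```

Under this reading, a very large context window paired with a small output cap rounds down to 0%, which would explain the 0% shown for Llama 4 Scout.
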
| Model | Provider | Context Window | Max Output | Output Ratio | Input $/1M | Output $/1M | Efficiency (derived) | Capabilities |
|---|---|---|---|---|---|---|---|---|
| Llama 4 Scout | Meta | 10M | 16K | 0% | $0.10 | $0.30 | 100 | |
| Grok 4.20 Multi-Agent | xAI | 2M | — | — | $1.25 | $2.50 | 47 | |
| Grok 4.20 | xAI | 2M | — | — | $1.25 | $2.50 | 47 | |
| OpenAI GPT Latest | ~openai | 1.1M | 128K | 12% | $5.00 | $30.00 | 27 | |
| GPT-5.5 Pro | OpenAI | 1.1M | 128K | 12% | $30.00 | $180.00 | 16 | |
| GPT-5.5 | OpenAI | 1.1M | 128K | 12% | $5.00 | $30.00 | 27 | |
| GPT-5.4 Pro | OpenAI | 1.1M | 128K | 12% | $30.00 | $180.00 | 16 | |
| GPT-5.4 | OpenAI | 1.1M | 128K | 12% | $2.50 | $15.00 | 35 | |
| Gemini 3.1 Pro Preview Custom Tools | Google | 1.0M | 66K | 6% | $2.00 | $12.00 | 38 | |
| GLM 5.2 | Zhipu AI | 1.0M | 33K | 3% | $0.95 | $3.00 | 50 | |
| MiniMax M3 | MiniMax | 1.0M | 512K | 49% | $0.30 | $1.20 | 71 | |
| Gemini 3.5 Flash | Google | 1.0M | 66K | 6% | $1.50 | $9.00 | 42 | |
| Gemini 3.1 Flash Lite | Google | 1.0M | 66K | 6% | $0.25 | $1.50 | 74 | |
| Google Gemini Pro Latest | Google | 1.0M | 66K | 6% | $2.00 | $12.00 | 38 | |
| Google Gemini Flash Latest | Google | 1.0M | 66K | 6% | $1.50 | $9.00 | 42 | |
| DeepSeek V4 Pro | DeepSeek | 1.0M | 384K | 37% | $0.43 | $0.87 | 64 | |
| DeepSeek V4 Flash | DeepSeek | 1.0M | 66K | 6% | $0.09 | $0.18 | 87 | |
| MiMo-V2.5-Pro | Xiaomi | 1.0M | 131K | 13% | $0.43 | $0.87 | 64 | |
| MiMo-V2.5 | Xiaomi | 1.0M | — | — | $0.10 | $0.28 | 86 | |
| Lyria 3 Pro Preview | Google | 1.0M | 66K | 6% | Free | Free | 98 | |
| Lyria 3 Clip Preview | Google | 1.0M | 66K | 6% | Free | Free | 98 | |
| Gemini 3.1 Flash Lite Preview | Google | 1.0M | 66K | 6% | $0.25 | $1.50 | 74 | |
| Gemini 3.1 Pro Preview | Google | 1.0M | 66K | 6% | $2.00 | $12.00 | 38 | |
| Gemini 3 Flash Preview | Google | 1.0M | 66K | 6% | $0.50 | $3.00 | 62 | |
| Gemini 2.5 Flash Lite Preview 09-2025 | Google | 1.0M | 66K | 6% | $0.10 | $0.40 | 86 | |
| Qwen3 Coder 480B A35B (free) | Alibaba | 1.0M | 262K | 25% | Free | Free | 98 | |
| Qwen3 Coder 480B A35B | Alibaba | 1.0M | 66K | 6% | $0.22 | $1.80 | 76 | |
| Gemini 2.5 Flash Lite | Google | 1.0M | 66K | 6% | $0.10 | $0.40 | 86 | |
| Gemini 2.5 Flash | Google | 1.0M | 66K | 6% | $0.30 | $2.50 | 71 | |
| Gemini 2.5 Pro | Google | 1.0M | 66K | 6% | $1.25 | $10.00 | 45 | |
| Gemini 2.5 Pro Preview 06-05 | Google | 1.0M | 66K | 6% | $1.25 | $10.00 | 45 | |
| Gemini 2.5 Pro Preview 05-06 | Google | 1.0M | 66K | 6% | $1.25 | $10.00 | 45 | |
| Llama 4 Maverick | Meta | 1.0M | 16K | 2% | $0.15 | $0.60 | 81 | |
| GPT-4.1 | OpenAI | 1.0M | — | — | $2.00 | $8.00 | 38 | |
| GPT-4.1 Mini | OpenAI | 1.0M | 33K | 3% | $0.40 | $1.60 | 66 | |
| GPT-4.1 Nano | OpenAI | 1.0M | 33K | 3% | $0.10 | $0.40 | 86 | |
| Palmyra X5 | Writer | 1.0M | 8K | 1% | $0.60 | $6.00 | 58 | |
| MiniMax-01 | MiniMax | 1.0M | 1.0M | 100% | $0.20 | $1.10 | 77 | |
| Fugu Ultra | sakana | 1M | 128K | 13% | $5.00 | $30.00 | 27 | |
| Claude Fable Latest | ~anthropic | 1M | 128K | 13% | $10.00 | $50.00 | 22 | |
| Claude Fable 5 | Anthropic | 1M | 128K | 13% | $10.00 | $50.00 | 22 | |
| Nemotron 3 Ultra (free) | NVIDIA | 1M | 66K | 7% | Free | Free | 98 | |
| Nemotron 3 Ultra | NVIDIA | 1M | 16K | 2% | $0.50 | $2.20 | 62 | |
| Qwen3.7 Plus | Alibaba | 1M | 66K | 7% | $0.32 | $1.28 | 70 | |
| Claude Opus 4.8 (Fast) | Anthropic | 1M | 128K | 13% | $10.00 | $50.00 | 22 | |
| Claude Opus 4.8 | Anthropic | 1M | 128K | 13% | $5.00 | $25.00 | 27 | |
| Qwen3.7 Max | Alibaba | 1M | 66K | 7% | $1.25 | $3.75 | 45 | |
| Claude Opus 4.7 (Fast) | Anthropic | 1M | 128K | 13% | $30.00 | $150.00 | 16 | |
| Grok 4.3 | xAI | 1M | — | — | $1.25 | $2.50 | 45 | |
| Anthropic Claude Sonnet Latest | ~anthropic | 1M | 128K | 13% | $3.00 | $15.00 | 33 | |
| Qwen3.5 Plus 2026-04-20 | Alibaba | 1M | 66K | 7% | $0.30 | $1.80 | 71 | |
| Qwen3.6 Flash | Alibaba | 1M | 66K | 7% | $0.19 | $1.13 | 78 | |
| Claude Opus Latest | ~anthropic | 1M | 128K | 13% | $5.00 | $25.00 | 27 | |
| Claude Opus 4.7 | Anthropic | 1M | 128K | 13% | $5.00 | $25.00 | 27 | |
| Claude Opus 4.6 (Fast) | Anthropic | 1M | 128K | 13% | $30.00 | $150.00 | 16 | |
| Qwen3.6 Plus | Alibaba | 1M | 66K | 7% | $0.33 | $1.95 | 69 | |
| Nemotron 3 Super (free) | NVIDIA | 1M | 262K | 26% | Free | Free | 98 | |
| Nemotron 3 Super | NVIDIA | 1M | — | — | $0.09 | $0.45 | 87 | |
| Qwen3.5-Flash | Alibaba | 1M | 66K | 7% | $0.07 | $0.26 | 89 | |
| Claude Sonnet 4.6 | Anthropic | 1M | 128K | 13% | $3.00 | $15.00 | 33 | |
| Qwen3.5 Plus 2026-02-15 | Alibaba | 1M | 66K | 7% | $0.26 | $1.56 | 73 | |
| Claude Opus 4.6 | Anthropic | 1M | 128K | 13% | $5.00 | $25.00 | 27 | |
| Nova 2 Lite | Amazon | 1M | 66K | 7% | $0.30 | $2.50 | 71 | |
| Nova Premier 1.0 | Amazon | 1M | 32K | 3% | $2.50 | $12.50 | 35 | |
| Claude Sonnet 4.5 | Anthropic | 1M | 64K | 6% | $3.00 | $15.00 | 33 | |
| Qwen3 Coder Plus | Alibaba | 1M | 66K | 7% | $0.65 | $3.25 | 57 | |
| Qwen3 Coder Flash | Alibaba | 1M | 66K | 7% | $0.20 | $0.97 | 78 | |
| Qwen Plus 0728 (thinking) | Alibaba | 1M | 33K | 3% | $0.26 | $0.78 | 73 | |
| Qwen Plus 0728 | Alibaba | 1M | 33K | 3% | $0.26 | $0.78 | 73 | |
| MiniMax M1 | MiniMax | 1M | 40K | 4% | $0.40 | $2.20 | 66 | |
| Claude Sonnet 4 | Anthropic | 1M | 64K | 6% | $3.00 | $15.00 | 33 | |
| Qwen-Plus | Alibaba | 1M | 33K | 3% | $0.26 | $0.78 | 73 | |
| GPT Chat Latest | OpenAI | 400K | 128K | 32% | $5.00 | $30.00 | 25 | |
| OpenAI GPT Mini Latest | ~openai | 400K | 128K | 32% | $0.75 | $4.50 | 50 | |
| GPT-5.4 Nano | OpenAI | 400K | 128K | 32% | $0.20 | $1.25 | 72 | |
| GPT-5.4 Mini | OpenAI | 400K | 128K | 32% | $0.75 | $4.50 | 50 | |
| GPT-5.3-Codex | OpenAI | 400K | 128K | 32% | $1.75 | $14.00 | 37 | |
| GPT-5.2-Codex | OpenAI | 400K | 128K | 32% | $1.75 | $14.00 | 37 | |
| GPT-5.2 Pro | OpenAI | 400K | 128K | 32% | $21.00 | $168.00 | 17 | |
| GPT-5.2 | OpenAI | 400K | 128K | 32% | $1.75 | $14.00 | 37 | |
| GPT-5.1-Codex-Max | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 42 | |
| GPT-5.1 | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 42 | |
| GPT-5.1-Codex | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 42 | |
| GPT-5.1-Codex-Mini | OpenAI | 400K | 100K | 25% | $0.25 | $2.00 | 69 | |
| GPT-5 Pro | OpenAI | 400K | 128K | 32% | $15.00 | $120.00 | 18 | |
| GPT-5 Codex | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 42 | |
| GPT-5 | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 42 | |
| GPT-5 Mini | OpenAI | 400K | 128K | 32% | $0.25 | $2.00 | 69 | |
| GPT-5 Nano | OpenAI | 400K | — | — | $0.05 | $0.40 | 85 | |
| Nova Lite 1.0 | Amazon | 300K | 5K | 2% | $0.06 | $0.24 | 82 | |
| Nova Pro 1.0 | Amazon | 300K | 5K | 2% | $0.80 | $3.20 | 48 | |
| Kimi K2.7 Code | Moonshot AI | 262K | 16K | 6% | $0.74 | $3.50 | 49 | |
| Ring-2.6-1T | inclusionai | 262K | 66K | 25% | $0.07 | $0.63 | 80 | |
| Mistral Medium 3.5 | Mistral AI | 262K | — | — | $1.50 | $7.50 | 38 | |
| Laguna XS.2 (free) | poolside | 262K | 33K | 13% | Free | Free | 88 | |
| Laguna XS.2 | poolside | 262K | 33K | 13% | $0.10 | $0.20 | 77 | |
| Laguna M.1 (free) | poolside | 262K | 33K | 13% | Free | Free | 88 | |
| Laguna M.1 | poolside | 262K | 33K | 13% | $0.20 | $0.40 | 70 | |
| MoonshotAI Kimi Latest | ~moonshotai | 262K | 262K | 100% | $0.66 | $3.41 | 51 | |
| Qwen3.6 35B A3B | Alibaba | 262K | 262K | 100% | $0.14 | $1.00 | 74 | |
| Qwen3.6 Max Preview | Alibaba | 262K | 66K | 25% | $1.04 | $6.24 | 43 | |
| Qwen3.6 27B | Alibaba | 262K | 262K | 100% | $0.29 | $3.17 | 64 | |
| Ling-2.6-1T | inclusionai | 262K | 33K | 13% | $0.07 | $0.63 | 80 | |
| Hy3 preview | Tencent | 262K | — | — | $0.06 | $0.21 | 81 | |
| Ling-2.6-flash | inclusionai | 262K | 33K | 13% | $0.01 | $0.03 | 87 | |
| Kimi K2.6 | Moonshot AI | 262K | 262K | 100% | $0.66 | $3.41 | 51 | |
| Gemma 4 26B A4B (free) | Google | 262K | 33K | 13% | Free | Free | 88 | |
| Gemma 4 26B A4B | Google | 262K | — | — | $0.06 | $0.33 | 81 | |
| Gemma 4 31B (free) | Google | 262K | 8K | 3% | Free | Free | 88 | |
| Gemma 4 31B | Google | 262K | 262K | 100% | $0.12 | $0.35 | 76 | |
| Trinity Large Thinking | arcee-ai | 262K | 80K | 31% | $0.25 | $0.80 | 67 | |
| Mistral Small 4 | Mistral AI | 262K | — | — | $0.15 | $0.60 | 73 | |
| GLM 5 Turbo | Zhipu AI | 262K | 131K | 50% | $1.20 | $4.00 | 41 | |
| Seed-2.0-Lite | ByteDance | 262K | 131K | 50% | $0.25 | $2.00 | 67 | |
| Qwen3.5-9B | Alibaba | 262K | 262K | 100% | $0.10 | $0.15 | 77 | |
| Seed-2.0-Mini | ByteDance | 262K | 131K | 50% | $0.10 | $0.40 | 77 | |
| Qwen3.5-35B-A3B | Alibaba | 262K | 262K | 100% | $0.14 | $1.00 | 74 | |
| Qwen3.5-27B | Alibaba | 262K | 66K | 25% | $0.20 | $1.56 | 70 | |
| Qwen3.5-122B-A10B | Alibaba | 262K | 262K | 100% | $0.26 | $2.08 | 66 | |
| Qwen3 Max Thinking | Alibaba | 262K | 33K | 13% | $0.78 | $3.90 | 48 | |
| Qwen3 Coder Next | Alibaba | 262K | 262K | 100% | $0.11 | $0.80 | 77 | |
| Step 3.5 Flash | StepFun | 262K | 16K | 6% | $0.09 | $0.30 | 78 | |
| Kimi K2.5 | Moonshot AI | 262K | — | — | $0.38 | $2.02 | 60 | |
| Seed 1.6 Flash | ByteDance | 262K | 33K | 13% | $0.07 | $0.30 | 80 | |
| Seed 1.6 | ByteDance | 262K | 33K | 13% | $0.25 | $2.00 | 67 | |
| Nemotron 3 Nano 30B A3B | NVIDIA | 262K | 228K | 87% | $0.05 | $0.20 | 82 | |
| Devstral 2 2512 | Mistral AI | 262K | — | — | $0.40 | $2.00 | 59 | |
| Ministral 3 14B 2512 | Mistral AI | 262K | — | — | $0.20 | $0.20 | 70 | |
| Ministral 3 8B 2512 | Mistral AI | 262K | — | — | $0.15 | $0.15 | 73 | |
| Mistral Large 3 2512 | Mistral AI | 262K | — | — | $0.50 | $1.50 | 56 | |
| Kimi K2 Thinking | Moonshot AI | 262K | 262K | 100% | $0.60 | $2.50 | 52 | |
| Qwen3 VL 32B Instruct | Alibaba | 262K | 33K | 13% | $0.10 | $0.42 | 77 | |
| Qwen3 VL 30B A3B Instruct | Alibaba | 262K | 33K | 13% | $0.13 | $0.52 | 75 | |
| Qwen3 VL 235B A22B Instruct | Alibaba | 262K | 16K | 6% | $0.20 | $0.88 | 70 | |
| Qwen3 Max | Alibaba | 262K | 33K | 13% | $0.78 | $3.90 | 48 | |
| Qwen3 Next 80B A3B Thinking | Alibaba | 262K | 33K | 13% | $0.10 | $0.78 | 78 | |
| Qwen3 Next 80B A3B Instruct (free) | Alibaba | 262K | — | — | Free | Free | 88 | |
| Qwen3 Next 80B A3B Instruct | Alibaba | 262K | 16K | 6% | $0.09 | $1.10 | 78 | |
| Kimi K2 0905 | Moonshot AI | 262K | 262K | 100% | $0.60 | $2.50 | 52 | |
| Qwen3 235B A22B Thinking 2507 | Alibaba | 262K | 262K | 100% | $0.10 | $0.10 | 77 | |
| Qwen3 235B A22B Instruct 2507 | Alibaba | 262K | 16K | 6% | $0.09 | $0.10 | 78 | |
| Falcon-H1-Arabic 34B Instruct | TII | 262K | 8K | 3% | Free | Free | 88 | |
| Falcon-H1-Arabic 7B Instruct | TII | 262K | 8K | 3% | Free | Free | 88 | |
| North Mini Code (free) | Cohere | 256K | 64K | 25% | Free | Free | 88 | |
| Step 3.7 Flash | StepFun | 256K | 256K | 100% | $0.20 | $1.15 | 70 | |
| Grok Build 0.1 | xAI | 256K | — | — | $1.00 | $2.00 | 44 | |
| Nemotron 3 Nano Omni (free) | NVIDIA | 256K | 66K | 26% | Free | Free | 88 | |
| KAT-Coder-Pro V2 | Kuaishou | 256K | 80K | 31% | $0.30 | $1.20 | 64 | |
| Qwen3.5 397B A17B | Alibaba | 256K | — | — | $0.39 | $2.45 | 60 | |
| Nemotron 3 Nano 30B A3B (free) | NVIDIA | 256K | — | — | Free | Free | 88 | |
| Qwen3 VL 8B Thinking | Alibaba | 256K | 33K | 13% | $0.12 | $1.36 | 76 | |
| Qwen3 VL 8B Instruct | Alibaba | 256K | 33K | 13% | $0.08 | $0.50 | 79 | |
| Jamba Large 1.7 | AI21 Labs | 256K | 4K | 2% | $2.00 | $8.00 | 34 | |
| Codestral 2508 | Mistral AI | 256K | — | — | $0.30 | $0.90 | 64 | |
| Command A | Cohere | 256K | 8K | 3% | $2.50 | $10.00 | 31 | |
| MiniMax M2.7 | MiniMax | 205K | 197K | 96% | $0.24 | $0.96 | 66 | |
| MiniMax M2.5 | MiniMax | 205K | 197K | 96% | $0.15 | $0.90 | 72 | |
| MiniMax M2.1 | MiniMax | 205K | 197K | 96% | $0.29 | $0.95 | 63 | |
| MiniMax M2 | MiniMax | 205K | 197K | 96% | $0.26 | $1.00 | 65 | |
| GLM 5.1 | Zhipu AI | 203K | 66K | 32% | $0.98 | $3.08 | 43 | |
| GLM 5V Turbo | Zhipu AI | 203K | 131K | 65% | $1.20 | $4.00 | 40 | |
| GLM 5 | Zhipu AI | 203K | — | — | $0.60 | $1.92 | 51 | |
| GLM 4.7 Flash | Zhipu AI | 203K | 16K | 8% | $0.06 | $0.40 | 80 | |
| GLM 4.7 | Zhipu AI | 203K | 131K | 65% | $0.40 | $1.75 | 58 | |
| GLM 4.6 | Zhipu AI | 203K | 131K | 65% | $0.43 | $1.74 | 57 | |
| Anthropic Claude Haiku Latest | ~anthropic | 200K | 64K | 32% | $1.00 | $5.00 | 43 | |
| Claude Opus 4.5 | Anthropic | 200K | 64K | 32% | $5.00 | $25.00 | 24 | |
| Sonar Pro Search | Perplexity | 200K | 8K | 4% | $3.00 | $15.00 | 29 | |
| Claude Haiku 4.5 | Anthropic | 200K | 64K | 32% | $1.00 | $5.00 | 43 | |
| o3 Deep Research | OpenAI | 200K | 100K | 50% | $10.00 | $40.00 | 19 | |
| o4 Mini Deep Research | OpenAI | 200K | 100K | 50% | $2.00 | $8.00 | 33 | |
| Claude Opus 4.1 | Anthropic | 200K | 32K | 16% | $15.00 | $75.00 | 17 | |
| o3 Pro | OpenAI | 200K | 100K | 50% | $20.00 | $80.00 | 16 | |
| Claude Opus 4 | Anthropic | 200K | 32K | 16% | $15.00 | $75.00 | 17 | |
| o4 Mini High | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 42 | |
| o3 | OpenAI | 200K | 100K | 50% | $2.00 | $8.00 | 33 | |
| o4 Mini | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 42 | |
| o1-pro | OpenAI | 200K | 100K | 50% | $150.00 | $600.00 | 10 | |
| Sonar Pro | Perplexity | 200K | 8K | 4% | $3.00 | $15.00 | 29 | |
| o3 Mini High | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 42 | |
| o3 Mini | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 42 | |
| o1 | OpenAI | 200K | 100K | 50% | $15.00 | $60.00 | 17 | |
| Claude 3 Haiku | Anthropic | 200K | 4K | 2% | $0.25 | $1.25 | 65 | |
| Composer 2 | Cursor | 200K | 66K | 33% | $0.50 | $2.50 | 54 | |
| Composer 2 Fast | Cursor | 200K | 66K | 33% | $1.50 | $7.50 | 37 | |
| DeepSeek V3.2 Exp | DeepSeek | 164K | 66K | 40% | $0.27 | $0.41 | 63 | |
| DeepSeek V3.1 Terminus | DeepSeek | 164K | 33K | 20% | $0.27 | $0.95 | 63 | |
| DeepSeek V3.1 | DeepSeek | 164K | 33K | 20% | $0.21 | $0.79 | 66 | |
| R1 0528 | DeepSeek | 164K | 33K | 20% | $0.50 | $2.15 | 53 | |
| Llama Guard 4 12B | Meta | 164K | 16K | 10% | $0.18 | $0.18 | 68 | |
| DeepSeek V3 0324 | DeepSeek | 164K | 16K | 10% | $0.20 | $0.77 | 67 | |
| R1 | DeepSeek | 164K | 16K | 10% | $0.70 | $2.50 | 48 | |
| Qwen3 Coder 30B A3B Instruct | Alibaba | 160K | 33K | 20% | $0.07 | $0.27 | 77 | |
| Qwen3 14B | Alibaba | 132K | 41K | 31% | $0.10 | $0.24 | 73 | |
| Granite 4.1 8B | IBM | 131K | 131K | 100% | $0.05 | $0.10 | 78 | |
| Aion-2.0 | aion-labs | 131K | 33K | 25% | $0.80 | $1.60 | 45 | |
| GLM 4.6V | Zhipu AI | 131K | 33K | 25% | $0.30 | $0.90 | 60 | |
| Ministral 3 3B 2512 | Mistral AI | 131K | — | — | $0.10 | $0.10 | 73 | |
| Trinity Mini | arcee-ai | 131K | 131K | 100% | $0.04 | $0.15 | 78 | |
| DeepSeek V3.2 | DeepSeek | 131K | 64K | 49% | $0.23 | $0.34 | 64 | |
| gpt-oss-safeguard-20b | OpenAI | 131K | 66K | 50% | $0.07 | $0.30 | 75 | |
| Phi 4 Mini Instruct | Microsoft | 131K | 128K | 98% | $0.08 | $0.35 | 75 | |
| Llama 3.3 Nemotron Super 49B V1.5 | NVIDIA | 131K | 16K | 13% | $0.40 | $0.40 | 56 | |
| Qwen3 VL 30B A3B Thinking | Alibaba | 131K | 33K | 25% | $0.13 | $1.56 | 71 | |
| Qwen3 VL 235B A22B Thinking | Alibaba | 131K | 33K | 25% | $0.26 | $2.60 | 62 | |
| Qwen3 30B A3B Thinking 2507 | Alibaba | 131K | 131K | 100% | $0.08 | $0.40 | 75 | |
| Mistral Medium 3.1 | Mistral AI | 131K | — | — | $0.40 | $2.00 | 56 | |
| gpt-oss-120b (free) | OpenAI | 131K | 131K | 100% | Free | Free | 83 | |
| gpt-oss-120b | OpenAI | 131K | — | — | $0.04 | $0.18 | 79 | |
| gpt-oss-20b (free) | OpenAI | 131K | 33K | 25% | Free | Free | 83 | |
| gpt-oss-20b | OpenAI | 131K | — | — | $0.03 | $0.14 | 80 | |
| Qwen3 30B A3B Instruct 2507 | Alibaba | 131K | 32K | 24% | $0.05 | $0.19 | 78 | |
| GLM 4.5 | Zhipu AI | 131K | 98K | 75% | $0.60 | $2.20 | 50 | |
| GLM 4.5 Air | Zhipu AI | 131K | 98K | 75% | $0.13 | $0.85 | 71 | |
| Kimi K2 0711 | Moonshot AI | 131K | 33K | 25% | $0.57 | $2.30 | 50 | |
| Hunyuan A13B Instruct | Tencent | 131K | 131K | 100% | $0.14 | $0.57 | 70 | |
| ERNIE 4.5 VL 424B A47B | Baidu | 131K | 16K | 12% | $0.42 | $1.25 | 55 | |
| Mistral Medium 3 | Mistral AI | 131K | — | — | $0.40 | $2.00 | 56 | |
| Virtuoso Large | arcee-ai | 131K | 64K | 49% | $0.75 | $1.20 | 46 | |
| Qwen3 30B A3B | Alibaba | 131K | 16K | 13% | $0.12 | $0.50 | 71 | |
| Qwen3 8B | Alibaba | 131K | 8K | 6% | $0.05 | $0.40 | 78 | |
| Qwen3 32B | Alibaba | 131K | 16K | 13% | $0.08 | $0.28 | 75 | |
| Qwen3 235B A22B | Alibaba | 131K | 8K | 6% | $0.45 | $1.82 | 54 | |
| Gemma 3 4B | Google | 131K | 16K | 13% | $0.05 | $0.10 | 78 | |
| Gemma 3 12B | Google | 131K | 16K | 13% | $0.05 | $0.15 | 78 | |
| Gemma 3 27B | Google | 131K | 16K | 13% | $0.08 | $0.16 | 75 | |
| Aion-1.0 | aion-labs | 131K | 33K | 25% | $4.00 | $8.00 | 25 | |
| Aion-1.0-Mini | aion-labs | 131K | 33K | 25% | $0.70 | $1.40 | 47 | |
| Qwen2.5 VL 72B Instruct | Alibaba | 131K | 128K | 98% | $0.80 | $1.00 | 45 | |
| DeepSeek V3 | DeepSeek | 131K | 16K | 12% | $0.20 | $0.80 | 66 | |
| Llama 3.3 70B Instruct (free) | Meta | 131K | — | — | Free | Free | 83 | |
| Llama 3.3 70B Instruct | Meta | 131K | 16K | 13% | $0.10 | $0.32 | 73 | |
| Mistral Large 2407 | Mistral AI | 131K | — | — | $2.00 | $6.00 | 32 | |
| Qwen2.5 7B Instruct | Alibaba | 131K | 33K | 25% | $0.04 | $0.10 | 79 | |
| Llama 3.2 3B Instruct (free) | Meta | 131K | — | — | Free | Free | 83 | |
| Llama 3.2 3B Instruct | Meta | 131K | 80K | 61% | $0.05 | $0.34 | 78 | |
| Llama 3.2 1B Instruct | Meta | 131K | 60K | 46% | $0.03 | $0.20 | 80 | |
| Llama 3.2 11B Vision Instruct | Meta | 131K | 16K | 13% | $0.34 | $0.34 | 58 | |
| Qwen2.5 72B Instruct | Alibaba | 131K | 16K | 13% | $0.36 | $0.40 | 58 | |
| Llama 3.1 8B Instruct | Meta | 131K | 16K | 13% | $0.02 | $0.03 | 81 | |
| Llama 3.1 70B Instruct | Meta | 131K | 16K | 13% | $0.40 | $0.40 | 56 | |
| Mistral Nemo | Mistral AI | 131K | — | — | $0.02 | $0.03 | 81 | |
| Falcon-H1-Arabic 3B Instruct | TII | 131K | 8K | 6% | Free | Free | 83 | |
| Granite 4.0 Micro | IBM | 131K | 131K | 100% | $0.02 | $0.11 | 81 | |
| Nemotron 3.5 Content Safety (free) | NVIDIA | 128K | 8K | 6% | Free | Free | 83 | |
| Mercury 2 | Inception | 128K | 50K | 39% | $0.25 | $0.75 | 63 | |
| GPT-5.3 Chat | OpenAI | 128K | 16K | 13% | $1.75 | $14.00 | 34 | |
| LFM2-24B-A2B | Liquid AI | 128K | — | — | $0.03 | $0.12 | 80 | |
| Solar Pro 3 | Upstage | 128K | — | — | $0.15 | $0.60 | 69 | |
| GPT Audio | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 30 | |
| GPT Audio Mini | OpenAI | 128K | 16K | 13% | $0.60 | $2.40 | 49 | |
| GPT-5.2 Chat | OpenAI | 128K | 16K | 13% | $1.75 | $14.00 | 34 | |
| Cogito v2.1 671B | deepcogito | 128K | — | — | $1.25 | $1.25 | 38 | |
| GPT-5.1 Chat | OpenAI | 128K | 32K | 25% | $1.25 | $10.00 | 38 | |
| Nemotron Nano 12B 2 VL (free) | NVIDIA | 128K | 128K | 100% | Free | Free | 83 | |
| Nemotron Nano 9B V2 (free) | NVIDIA | 128K | — | — | Free | Free | 83 | |
| GPT-5 Chat | OpenAI | 128K | 16K | 13% | $1.25 | $10.00 | 38 | |
| UI-TARS 7B | ByteDance | 128K | 2K | 2% | $0.10 | $0.20 | 73 | |
| Mistral Small 3.2 24B | Mistral AI | 128K | 16K | 13% | $0.07 | $0.20 | 75 | |
| Mistral Small 3.1 24B | Mistral AI | 128K | 128K | 100% | $0.35 | $0.55 | 58 | |
| GPT-4o-mini Search Preview | OpenAI | 128K | 16K | 13% | $0.15 | $0.60 | 69 | |
| GPT-4o Search Preview | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 30 | |
| Sonar Reasoning Pro | Perplexity | 128K | — | — | $2.00 | $8.00 | 32 | |
| Sonar Deep Research | Perplexity | 128K | — | — | $2.00 | $8.00 | 32 | |
| R1 Distill Llama 70B | DeepSeek | 128K | 8K | 6% | $0.80 | $0.80 | 45 | |
| Command R7B (12-2024) | Cohere | 128K | 4K | 3% | $0.04 | $0.15 | 79 | |
| Nova Micro 1.0 | Amazon | 128K | 5K | 4% | $0.04 | $0.14 | 79 | |
| GPT-4o (2024-11-20) | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 30 | |
| Qwen2.5 Coder 32B Instruct | Alibaba | 128K | 33K | 26% | $0.66 | $1.00 | 48 | |
| Command R+ (08-2024) | Cohere | 128K | 4K | 3% | $2.50 | $10.00 | 30 | |
| Command R (08-2024) | Cohere | 128K | 4K | 3% | $0.15 | $0.60 | 69 | |
| GPT-4o (2024-08-06) | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 30 | |
| GPT-4o-mini (2024-07-18) | OpenAI | 128K | 16K | 13% | $0.15 | $0.60 | 69 | |
| GPT-4o-mini | OpenAI | 128K | 16K | 13% | $0.15 | $0.60 | 69 | |
| GPT-4o (2024-05-13) | OpenAI | 128K | 4K | 3% | $5.00 | $15.00 | 23 | |
| GPT-4o | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 30 | |
| GPT-4 Turbo | OpenAI | 128K | 4K | 3% | $10.00 | $30.00 | 19 | |
| Mistral Large | Mistral AI | 128K | — | — | $2.00 | $6.00 | 32 | |
| GPT-4 Turbo Preview | OpenAI | 128K | 4K | 3% | $10.00 | $30.00 | 19 | |
| Sonar | Perplexity | 127K | — | — | $1.00 | $1.00 | 41 | |
| MiniMax M2-her | MiniMax | 66K | 2K | 3% | $0.30 | $1.20 | 57 | |
| Olmo 3 32B Think | Allen AI | 66K | 66K | 100% | $0.15 | $0.50 | 65 | |
| GLM 4.5V | Zhipu AI | 66K | 16K | 25% | $0.60 | $1.80 | 47 | |
| Reka Flash 3 | rekaai | 66K | 66K | 100% | $0.10 | $0.20 | 69 | |
| Mixtral 8x22B Instruct | Mistral AI | 66K | — | — | $2.00 | $6.00 | 30 | |
| WizardLM-2 8x22B | Microsoft | 66K | 8K | 12% | $0.62 | $0.62 | 46 | |
| Perceptron Mk1 | perceptron | 33K | 8K | 25% | $0.15 | $1.50 | 61 | |
| LFM2.5-1.2B-Thinking (free) | Liquid AI | 33K | — | — | Free | Free | 73 | |
| LFM2.5-1.2B-Instruct (free) | Liquid AI | 33K | — | — | Free | Free | 73 | |
| Gemma 3n 4B | Google | 33K | — | — | $0.06 | $0.12 | 68 | |
| Coder Large | arcee-ai | 33K | — | — | $0.50 | $0.80 | 46 | |
| Saba | Mistral AI | 33K | — | — | $0.20 | $0.60 | 58 | |
| Mistral Small 3 | Mistral AI | 33K | 16K | 50% | $0.05 | $0.08 | 69 | |
| Falcon Arabic 7B Instruct | TII | 33K | 8K | 25% | Free | Free | 73 | |
| Falcon3 10B Instruct | TII | 33K | 8K | 25% | Free | Free | 73 | |
| Falcon3 7B Instruct | TII | 33K | 8K | 25% | Free | Free | 73 | |
| Falcon Mamba 7B Instruct | TII | 33K | 8K | 25% | Free | Free | 73 | |
| Voxtral Small 24B 2507 | Mistral AI | 32K | — | — | $0.10 | $0.30 | 64 | |
| GPT-3.5 Turbo 16k | OpenAI | 16K | 4K | 25% | $3.00 | $4.00 | 23 | |
| GPT-3.5 Turbo | OpenAI | 16K | 4K | 25% | $0.50 | $1.50 | 43 | |
| Reka Edge | rekaai | 16K | 16K | 100% | $0.10 | $0.10 | 60 | |
| Phi 4 | Microsoft | 16K | 16K | 100% | $0.07 | $0.14 | 62 | |
| Gemma 2 27B | Google | 8K | 2K | 25% | $0.65 | $0.65 | 37 | |
| Llama 3 8B Instruct | Meta | 8K | — | — | $0.14 | $0.14 | 53 | |
| GPT-4 | OpenAI | 8K | 4K | 50% | $30.00 | $60.00 | 11 | |
| Inflection 3 Productivity | Inflection | 8K | 1K | 13% | $2.50 | $10.00 | 23 | |
| Inflection 3 Pi | Inflection | 8K | 1K | 13% | $2.50 | $10.00 | 23 | |
| ALLaM 7B Instruct (preview) | HUMAIN | 4K | 4K | 100% | Free | Free | 59 | |
| ALLaM 1 13B Instruct | HUMAIN | 4K | 4K | 100% | $1.80 | $1.80 | 24 | |
| ALLaM 2 7B Instruct | HUMAIN | 4K | 4K | 100% | Free | Free | 59 | |
| ALLaM 34B | HUMAIN | 4K | 4K | 100% | Free | Free | 59 | |
| GPT-3.5 Turbo (older v0613) | OpenAI | 4K | 4K | 100% | $1.00 | $2.00 | 29 | |
| GPT-3.5 Turbo Instruct | OpenAI | 4K | 4K | 100% | $1.50 | $2.00 | 25 | |
| SWE-1.5 | Windsurf | — | — | — | Free | Free | 0 | |
| autofixer-01 | Vercel | — | — | — | Free | Free | 0 | |
| Mellum | JetBrains | — | — | — | Free | Free | 0 | |
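
Because the price columns are quoted per 1M tokens, the cost of a single request can be estimated by scaling each price by the token count. This is a minimal sketch, assuming input and output tokens are billed linearly at the listed rates; real bills may differ because of caching discounts, tiered pricing, or reasoning-token charges. The function `request_cost` and the request sizes are hypothetical.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    return (input_tokens * input_price_per_1m
            + output_tokens * output_price_per_1m) / 1_000_000


# Hypothetical request: 20,000 input tokens and 2,000 output tokens.
# Prices are taken from the table rows for GPT-5.4 and DeepSeek V4 Flash.
gpt_5_4 = request_cost(20_000, 2_000, 2.50, 15.00)
deepseek_v4_flash = request_cost(20_000, 2_000, 0.09, 0.18)
print(f"GPT-5.4: ${gpt_5_4:.4f}")
print(f"DeepSeek V4 Flash: ${deepseek_v4_flash:.4f}")
```

Keeping input and output prices separate matters because output tokens cost several times more than input tokens for many models in the table.
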

| Model | Provider | Input $/1M | Output $/1M | Capabilities |
|---|---|---|---|---|
| GPT-5 Image | OpenAI | $10.00 | $10.00 | |
| GPT-5.4 Image 2 | OpenAI | $8.00 | $15.00 | |
| GPT-5 Image Mini | OpenAI | $2.50 | $2.00 | |
| Nano Banana Pro (Gemini 3 Pro Image) | Google | $2.00 | $12.00 | |
| Nano Banana Pro (Gemini 3 Pro Image Preview) | Google | $2.00 | $12.00 | |
| Nano Banana 2 (Gemini 3.1 Flash Image) | Google | $0.50 | $3.00 | |
| Nano Banana 2 (Gemini 3.1 Flash Image Preview) | Google | $0.50 | $3.00 | |
| Nano Banana (Gemini 2.5 Flash Image) | Google | $0.30 | $2.50 | |
| Midjourney v6.1 | Midjourney | Free | Free | |
| DALL-E 3 | OpenAI | Free | $40000.00 | |
| Stable Diffusion 3.5 | Stability AI | Free | $35000.00 | |
| FLUX.1 Pro | Black Forest Labs | Free | $50000.00 | |
| Ideogram 2.0 | Ideogram | Free | $80000.00 | |
| Recraft V3 | Recraft | Free | $40000.00 | |
| Imagen 3 | Google | Free | $40000.00 | |
| Adobe Firefly 3 | Adobe | Free | Free | |
| Leonardo Phoenix | Leonardo AI | Free | Free | |

AI speed is measured by time-to-first-token (TTFT) and tokens-per-second (TPS). TTFT measures how quickly the model starts responding. TPS measures how fast it generates output once it has started. TTFT dominates in interactive use such as chat, while TPS dominates for long generations.
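
As a rough illustration of how the two metrics are derived, the sketch below computes TTFT and TPS from the times at which streamed tokens arrive. The function `speed_metrics` and the sample timestamps are hypothetical, and TPS is measured here over the generation phase only (after the first token), which is one common convention among several.

```python
from typing import Sequence, Tuple


def speed_metrics(request_time: float,
                  token_times: Sequence[float]) -> Tuple[float, float]:
    """Return (ttft_seconds, tokens_per_second) for one streamed response.

    request_time: when the request was sent, in seconds.
    token_times: arrival time of each output token, in ascending order.
    """
    if not token_times:
        raise ValueError("no tokens received")
    ttft = token_times[0] - request_time
    generation_time = token_times[-1] - token_times[0]
    # TPS counts the tokens produced after the first one, over the time they took.
    if generation_time > 0:
        tps = (len(token_times) - 1) / generation_time
    else:
        tps = 0.0
    return ttft, tps


# Made-up timestamps: request at t=0.0, first token at 0.5 s,
# then one token every 0.25 s.
ttft, tps = speed_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(f"TTFT: {ttft:.2f} s, TPS: {tps:.1f}")
```

Under this convention, total response time is approximately TTFT plus the number of remaining tokens divided by TPS.
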
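Speed varies by provider and by model. Groq-hosted Llama models are among the fastest for inference, and among the major providers, Gemini Flash and GPT-4o Mini are consistently fast. Reasoning models such as o3 and R1 are slower by design: they generate intermediate reasoning tokens before the final answer, trading speed for accuracy.
Smaller, faster models may sacrifice some quality. However, provider-side optimizations can speed up a model with little or no quality loss: speculative decoding is designed to preserve the model's output distribution, while quantization may cost a small amount of accuracy. As a result, the same model can run at different speeds on different providers.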