126 AI models offer API pricing under $1 per million output tokens. These ultra-budget options deliver surprising quality: many include vision, reasoning, and function calling capabilities.
| # | Model | Provider | Score | Input $/1M | Output $/1M |
|---|---|---|---|---|---|
| 1 | Gemma 4 31B | Google | 81 | $0.130 | $0.380 |
| 2 | Gemini 2.5 Flash Lite Preview 09-2025 | Google | 79 | $0.100 | $0.400 |
| 3 | Gemini 2.5 Flash Lite | Google | 79 | $0.100 | $0.400 |
| 4 | Grok 4.1 Fast | xAI | 78 | $0.200 | $0.500 |
| 5 | Gemma 2 27B | Google | 77 | $0.650 | $0.650 |
| 6 | DeepSeek V4 Pro | DeepSeek | 76 | $0.435 | $0.870 |
| 7 | Gemma 4 26B A4B | Google | 73 | $0.060 | $0.330 |
| 8 | Grok 4 Fast | xAI | 73 | $0.200 | $0.500 |
| 9 | Gemini 2.0 Flash | Google | 72 | $0.100 | $0.400 |
| 10 | DeepSeek V4 Flash | DeepSeek | 72 | $0.140 | $0.280 |
| 11 | DeepSeek V3 0324 | DeepSeek | 72 | $0.200 | $0.770 |
| 12 | GLM 4.5 Air | Zhipu AI | 71 | $0.130 | $0.850 |
| 13 | DeepSeek V3.2 | DeepSeek | 70 | $0.252 | $0.378 |
| 14 | DeepSeek V3.2 Exp | DeepSeek | 70 | $0.270 | $0.410 |
| 15 | MiniMax M2.1 | MiniMax | 70 | $0.290 | $0.950 |
| 16 | DeepSeek V3 | DeepSeek | 70 | $0.320 | $0.890 |
| 17 | Qwen3 VL 235B A22B Instruct | Alibaba | 69 | $0.200 | $0.880 |
| 18 | DeepSeek V3.1 Terminus | DeepSeek | 69 | $0.270 | $0.950 |
| 19 | GPT-4o-mini | OpenAI | 69 | $0.150 | $0.600 |
| 20 | DeepSeek V3.1 | DeepSeek | 69 | $0.150 | $0.750 |
| 21 | Hy3 preview | Tencent | 69 | $0.066 | $0.260 |
| 22 | Qwen3.5-Flash | Alibaba | 69 | $0.065 | $0.260 |
| 23 | Llama 4 Maverick | Meta | 67 | $0.150 | $0.600 |
| 24 | Step 3.5 Flash | StepFun | 67 | $0.100 | $0.300 |
| 25 | Llama 3.3 70B Instruct | Meta | 67 | $0.100 | $0.320 |
| 26 | Qwen3.5-9B | Alibaba | 67 | $0.040 | $0.150 |
| 27 | Llama 3.1 70B Instruct | Meta | 65 | $0.400 | $0.400 |
| 28 | Trinity Large Thinking | arcee-ai | 65 | $0.220 | $0.850 |
| 29 | GLM 4.6V | Zhipu AI | 65 | $0.300 | $0.900 |
| 30 | Qwen3 235B A22B Instruct 2507 | Alibaba | 65 | $0.071 | $0.100 |
| 31 | Qwen3 30B A3B Thinking 2507 | Alibaba | 64 | $0.080 | $0.400 |
| 32 | GLM 4.7 Flash | Zhipu AI | 64 | $0.060 | $0.400 |
| 33 | Qwen3 Next 80B A3B Thinking | Alibaba | 64 | $0.098 | $0.780 |
| 34 | Qwen3 30B A3B | Alibaba | 64 | $0.090 | $0.450 |
| 35 | Trinity Large Preview | arcee-ai | 64 | $0.150 | $0.450 |
| 36 | Grok 3 Mini Beta | xAI | 63 | $0.300 | $0.500 |
| 37 | Mercury 2 | Inception | 61 | $0.250 | $0.750 |
| 38 | GPT-4o-mini Search Preview | OpenAI | 61 | $0.150 | $0.600 |
| 39 | Llama 3.3 Nemotron Super 49B V1.5 | NVIDIA | 61 | $0.100 | $0.400 |
| 40 | Qwen3 8B | Alibaba | 61 | $0.050 | $0.400 |
At $0.50/1M output tokens, generating a 2,000-word blog post (~2,500 tokens) costs about $0.00125 - roughly 1/10th of a penny. You could generate 800 blog posts for $1. For chatbots, even heavy usage stays under a few dollars per month.
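The arithmetic above is easy to wrap in a small helper for estimating your own workloads. This is just an illustrative sketch; the token count for a "blog post" is the same rough estimate used above:

```python
def generation_cost(output_tokens: int, price_per_million: float) -> float:
    """Cost in dollars for a given number of output tokens."""
    return output_tokens * price_per_million / 1_000_000

# A ~2,000-word blog post is roughly 2,500 output tokens.
blog_post_tokens = 2_500

# At $0.50 per 1M output tokens:
cost = generation_cost(blog_post_tokens, 0.50)
print(f"${cost:.5f} per post")            # → $0.00125 per post
print(f"{round(1 / cost)} posts per $1")  # → 800 posts per $1
```

Swap in any input/output price pair from the table to compare models; input-token costs add to this but are usually small for generation-heavy workloads.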
Sub-$1 models have improved dramatically. Many score above 70 on our composite index and include advanced features like vision and reasoning. The quality gap between budget and premium models continues to shrink with each generation.
Budget models are ideal for: high-volume batch processing, simple classification tasks, draft generation with human review, prototype development, and any use case where cost per request matters more than peak quality.
Premium models justify their cost for: complex reasoning tasks, customer-facing applications requiring top accuracy, specialized code generation, and scenarios where errors have high downstream costs.
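This budget-versus-premium split naturally suggests a routing layer: send cheap, high-volume work to a budget model and reserve the premium model for high-stakes requests. A minimal sketch, where the model IDs and task categories are placeholders you would replace with your own:

```python
# Hypothetical router: model IDs and task categories are placeholders.
BUDGET_MODEL = "budget-model-id"
PREMIUM_MODEL = "premium-model-id"

# Task types from the guidance above that tolerate budget-model quality.
BUDGET_TASKS = {"classification", "batch", "draft", "prototype"}

def pick_model(task_type: str, customer_facing: bool = False) -> str:
    """Route a request to the cheapest model that fits the stakes."""
    if customer_facing or task_type not in BUDGET_TASKS:
        return PREMIUM_MODEL
    return BUDGET_MODEL

print(pick_model("classification"))               # budget-model-id
print(pick_model("complex-reasoning"))            # premium-model-id
print(pick_model("draft", customer_facing=True))  # premium-model-id
```

In production such routers often add a fallback: retry a failed or low-confidence budget-model response on the premium model, so most requests stay cheap while hard cases still get top quality.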
Many budget models include vision (image understanding), function calling (tool use), JSON mode for structured output, and streaming. Some even offer reasoning capabilities. The main trade-off compared to premium models is usually in complex multi-step reasoning, nuanced writing quality, and handling edge cases.
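To show what these features look like in a request, here is a sketch of an OpenAI-compatible Chat Completions payload combining JSON mode, a declared tool, and streaming. The model name and the `lookup_order` tool are hypothetical; the field names follow the widely copied Chat Completions schema, though individual budget providers may support only a subset:

```python
import json

# Sketch of an OpenAI-compatible request. Model ID and tool are placeholders.
request = {
    "model": "budget-model-id",
    "messages": [
        {"role": "user", "content": "Classify this ticket and extract fields."}
    ],
    # JSON mode: constrains the model to emit a valid JSON object.
    "response_format": {"type": "json_object"},
    # Function calling: declare a tool the model may choose to invoke.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "lookup_order",
                "description": "Fetch order details by ID.",
                "parameters": {
                    "type": "object",
                    "properties": {"order_id": {"type": "string"}},
                    "required": ["order_id"],
                },
            },
        }
    ],
    "stream": True,  # stream tokens back as they are generated
}

print(json.dumps(request, indent=2))
```

Check each provider's documentation before relying on a feature: pricing pages rarely spell out which of these capabilities a given budget model actually supports.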