Five major Chinese AI labs - Alibaba (Qwen), DeepSeek, Zhipu AI (GLM), MiniMax, and ByteDance (Seed) - ship 88+ models that compete head-to-head with US frontier models. Most are open-weight. This page ranks the flagship models from each lab, then lists top multilingual frontier models that support Chinese.
Top models from five major Chinese AI labs. Ranked by published benchmarks, architectural innovation, and availability. 10 flagships listed.
Qwen3.6 Plus is Alibaba's latest frontier model, with a 1M-token context window. It tops OmniDocBench v1.5 (91.2) and Terminal-Bench 2.0 (61.6%, beating Claude Opus 4.5), and combines vision, reasoning, and agentic capabilities. API-only access.
Qwen3.5 397B A17B is the largest open-weight Qwen model, at 397B total parameters with 17B activated per token (MoE). Apache 2.0 licensed. Hybrid linear attention + sparse MoE architecture with 262K context. Strong multimodal and reasoning capabilities.
DeepSeek V3.2 achieves Arena Elo 1424 (highest among all tested models at release). 671B total / 37B active MoE with Multi-head Latent Attention (MLA). AIME 96% (beats GPT-5-High), IMO gold medal level. Fully open-weight.
DeepSeek R1 uses reinforcement learning for explicit chain-of-thought reasoning. Scored 96% on the Chinese National Medical Licensing Exam (vs GPT-o1 Pro at 75%). AIME 2025 87.5%, LiveCodeBench 84.4%. Fully open-weight with distilled variants at 7B, 14B, 32B, and 70B.
GLM 5.1 is a 744B MoE model (40B active) that tops SWE-Bench Pro at 58.4% (beating GPT-5.4 and Claude Opus 4.6). Trained entirely on Huawei Ascend 910B chips, demonstrating Chinese hardware independence. MMLU-Pro 89%. API-only.
GLM 4.5 is an open-weight 355B MoE model (32B active) with 131K context. MATH 500 at 98.2% (ties Claude Opus 4.6), AIME24 91%. Industry-best tool-use scores on tau-Bench and BFCL v3. 2.5-8x faster inference than comparable models. Apache 2.0 licensed.
MiniMax M2.7 introduces "self-evolution" - automated ML research that handles 30-50% of the RL workflow. SWE-Pro 56.2%, Terminal-Bench 2.0 57%. 204K context. Achieved 66.6% medal rate across 22 ML competitions. API-only.
MiniMax M2.5 is an open-weight model with Arena Elo 1403 and SWE-bench Verified 75.8%. 196K context. 37% faster end-to-end runtime than M2-her. Available on Hugging Face with a free tier variant.
Seed 2.0 Lite is ByteDance's multimodal model with 262K context, adaptive deep thinking, and sparse MoE architecture. Strong on document understanding (OmniDocBench, DUDE, MMLongBench) and video comprehension (TVBench, TempCompass). Surpasses human-level on EgoTempo. API-only.
Qwen3 Coder 480B A35B is the largest Qwen Coder model, at 480B total / 35B active parameters (MoE). Purpose-built for software engineering with 262K context. Part of Alibaba's code-specific model line alongside Coder Flash (1M context) and Coder Plus.
Top frontier models whose multilingual training mix supports Chinese. Ranked by composite score. Capped at 2 models per provider family.
Over 90% of top Chinese models use Mixture-of-Experts, activating only 5-10% of total parameters per token. DeepSeek pioneered Multi-head Latent Attention (MLA) for KV cache compression, now adopted across the industry.
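The "5-10% activated" figure can be checked directly from the parameter counts quoted on this page; a quick sketch (the total/active figures are the ones cited in this article):

```python
# Activation ratios for the MoE models cited on this page:
# (total params in billions, active params in billions)
models = {
    "DeepSeek V3.2": (671, 37),
    "Qwen3.5 397B A17B": (397, 17),
    "GLM 5.1": (744, 40),
}

fractions = {name: active / total for name, (total, active) in models.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of parameters active per token")
```

Qwen3.5's ratio actually lands slightly under 5%, which is part of why its per-token pricing undercuts similarly sized dense models.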
Chinese labs release more open-weight frontier models than any other region. DeepSeek V3.2, Qwen3 series (Apache 2.0), GLM-4.5/5, and MiniMax M2.5 are all freely downloadable. This enables local deployment without API dependency.
MoE architecture plus aggressive pricing makes Chinese models 3-10x cheaper than US equivalents at comparable quality. DeepSeek V3.2 input costs $0.26/M vs Claude Opus at $15/M. Qwen models start at $0.03/M input.
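To make the gap concrete, here is the input-side cost of a single large prompt at the two rates quoted above (a hypothetical 100K-token request, used purely for illustration):

```python
def token_cost(tokens, usd_per_million_tokens):
    """Cost in USD for a given token count at a per-million-token rate."""
    return tokens / 1e6 * usd_per_million_tokens

# A 100K-token prompt at the input rates quoted above
deepseek_in = token_cost(100_000, 0.26)  # DeepSeek V3.2 -> $0.026
opus_in = token_cost(100_000, 15.00)     # Claude Opus   -> $1.50
print(f"~{opus_in / deepseek_in:.0f}x cheaper on input")  # ~58x
```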
All Chinese-origin models with published pricing, sorted by input cost (lowest first). Updated hourly.
| Model | Provider | In $/M | Out $/M |
|---|---|---|---|
| Qwen2.5 Coder 7B Instruct | Alibaba | $0.030 | $0.090 |
| Qwen-Turbo | Alibaba | $0.033 | $0.130 |
| Qwen2.5 7B Instruct | Alibaba | $0.040 | $0.100 |
| Qwen3.5-9B | Alibaba | $0.050 | $0.150 |
| Qwen3 8B | Alibaba | $0.050 | $0.400 |
| GLM 4.7 Flash | Zhipu AI | $0.060 | $0.400 |
| Qwen3 14B | Alibaba | $0.060 | $0.240 |
| Qwen3.5-Flash | Alibaba | $0.065 | $0.260 |
| Qwen3 Coder 30B A3B Instruct | Alibaba | $0.070 | $0.270 |
| Qwen3 235B A22B Instruct 2507 | Alibaba | $0.071 | $0.100 |
| Seed 1.6 Flash | ByteDance | $0.075 | $0.300 |
| Qwen3 VL 8B Instruct | Alibaba | $0.080 | $0.500 |
| Qwen3 30B A3B Thinking 2507 | Alibaba | $0.080 | $0.400 |
| Qwen3 30B A3B | Alibaba | $0.080 | $0.280 |
| Qwen3 32B | Alibaba | $0.080 | $0.240 |
| Tongyi DeepResearch 30B A3B | Alibaba | $0.090 | $0.450 |
| Qwen3 Next 80B A3B Instruct | Alibaba | $0.090 | $1.10 |
| Qwen3 30B A3B Instruct 2507 | Alibaba | $0.090 | $0.300 |
| Qwen3 Next 80B A3B Thinking | Alibaba | $0.098 | $0.780 |
| Seed-2.0-Mini | ByteDance | $0.100 | $0.400 |
| GLM 4 32B | Zhipu AI | $0.100 | $0.100 |
| UI-TARS 7B | ByteDance | $0.100 | $0.200 |
| Qwen3 VL 32B Instruct | Alibaba | $0.104 | $0.416 |
| Qwen3 VL 8B Thinking | Alibaba | $0.117 | $1.36 |
| MiniMax M2.5 | MiniMax | $0.118 | $0.990 |
| Qwen3 Coder Next | Alibaba | $0.120 | $0.750 |
| Qwen2.5 72B Instruct | Alibaba | $0.120 | $0.390 |
| Qwen3 VL 30B A3B Thinking | Alibaba | $0.130 | $1.56 |
| Qwen3 VL 30B A3B Instruct | Alibaba | $0.130 | $0.520 |
| GLM 4.5 Air | Zhipu AI | $0.130 | $0.850 |
| Qwen VL Plus | Alibaba | $0.137 | $0.410 |
| Qwen3 235B A22B Thinking 2507 | Alibaba | $0.150 | $1.50 |
| DeepSeek V3.1 | DeepSeek | $0.150 | $0.750 |
| QwQ 32B | Alibaba | $0.150 | $0.580 |
| Qwen3.5-35B-A3B | Alibaba | $0.163 | $1.30 |
| Qwen3.5-27B | Alibaba | $0.195 | $1.56 |
| Qwen3 Coder Flash | Alibaba | $0.195 | $0.975 |
| Qwen3 VL 235B A22B Instruct | Alibaba | $0.200 | $0.880 |
| Qwen2.5 VL 32B Instruct | Alibaba | $0.200 | $0.600 |
| DeepSeek V3 0324 | DeepSeek | $0.200 | $0.770 |
| MiniMax-01 | MiniMax | $0.200 | $1.10 |
| DeepSeek V3.1 Terminus | DeepSeek | $0.210 | $0.790 |
| Qwen3 Coder 480B A35B | Alibaba | $0.220 | $1.00 |
| Seed-2.0-Lite | ByteDance | $0.250 | $2.00 |
| Seed 1.6 | ByteDance | $0.250 | $2.00 |
| MiniMax M2 | MiniMax | $0.255 | $1.00 |
| Qwen3.5-122B-A10B | Alibaba | $0.260 | $2.08 |
| Qwen3.5 Plus 2026-02-15 | Alibaba | $0.260 | $1.56 |
| DeepSeek V3.2 | DeepSeek | $0.260 | $0.380 |
| Qwen3 VL 235B A22B Thinking | Alibaba | $0.260 | $2.60 |
| Qwen Plus 0728 (thinking) | Alibaba | $0.260 | $0.780 |
| Qwen Plus 0728 | Alibaba | $0.260 | $0.780 |
| Qwen-Plus | Alibaba | $0.260 | $0.780 |
| DeepSeek V3.2 Exp | DeepSeek | $0.270 | $0.410 |
| MiniMax M2.1 | MiniMax | $0.290 | $0.950 |
| R1 Distill Qwen 32B | DeepSeek | $0.290 | $0.290 |
| MiniMax M2.7 | MiniMax | $0.300 | $1.20 |
| MiniMax M2-her | MiniMax | $0.300 | $1.20 |
| GLM 4.6V | Zhipu AI | $0.300 | $0.900 |
| DeepSeek V3 | DeepSeek | $0.320 | $0.890 |
| Qwen3.6 Plus | Alibaba | $0.325 | $1.95 |
| Qwen3.5 397B A17B | Alibaba | $0.390 | $2.34 |
| GLM 4.7 | Zhipu AI | $0.390 | $1.75 |
| GLM 4.6 | Zhipu AI | $0.390 | $1.90 |
| DeepSeek V3.2 Speciale | DeepSeek | $0.400 | $1.20 |
| MiniMax M1 | MiniMax | $0.400 | $2.20 |
| R1 0528 | DeepSeek | $0.450 | $2.15 |
| Qwen3 235B A22B | Alibaba | $0.455 | $1.82 |
| Qwen VL Max | Alibaba | $0.520 | $2.08 |
| GLM 4.5V | Zhipu AI | $0.600 | $1.80 |
| GLM 4.5 | Zhipu AI | $0.600 | $2.20 |
| Qwen3 Coder Plus | Alibaba | $0.650 | $3.25 |
| Qwen2.5 Coder 32B Instruct | Alibaba | $0.660 | $1.00 |
| R1 Distill Llama 70B | DeepSeek | $0.700 | $0.800 |
| R1 | DeepSeek | $0.700 | $2.50 |
| GLM 5 | Zhipu AI | $0.720 | $2.30 |
| Qwen3 Max Thinking | Alibaba | $0.780 | $3.90 |
| Qwen3 Max | Alibaba | $0.780 | $3.90 |
| Qwen2.5 VL 72B Instruct | Alibaba | $0.800 | $0.800 |
| GLM 5.1 | Zhipu AI | $0.950 | $3.15 |
| Qwen-Max | Alibaba | $1.04 | $4.16 |
| GLM 5V Turbo | Zhipu AI | $1.20 | $4.00 |
| GLM 5 Turbo | Zhipu AI | $1.20 | $4.00 |
It depends on the use case. For raw benchmark performance, DeepSeek V3.2 holds the highest Arena Elo (1424) among all tested models and is fully open-weight. For 1M-context workloads, Qwen3.6 Plus leads (MMLU-Pro 88.5%). For reasoning tasks, DeepSeek R1 scored 96% on the Chinese National Medical Licensing Exam. For software engineering, GLM 5.1 tops SWE-Bench Pro at 58.4%. For agentic self-improvement, MiniMax M2.7 automates 30-50% of ML research workflows. All five labs have competitive flagships.
Most are. DeepSeek releases all models (V3, V3.1, V3.2, R1) as open-weight under permissive licenses. Alibaba's Qwen3 series is Apache 2.0 licensed, including the 397B flagship. Zhipu AI open-sources GLM-4.5, GLM-4.6, GLM-4.7, and GLM-5 on Hugging Face. MiniMax offers M2.5 (free variant available) and MiniMax-01 as open-weight. The main closed-source exceptions are Qwen3.6 Plus, GLM-5.1, MiniMax M2.7, and ByteDance Seed.
Chinese-origin models consistently outperform multilingual frontier models on Chinese-specific benchmarks like C-Eval, CMMLU, and SuperCLUE because they were trained with significantly more Chinese-language data. DeepSeek R1 scored 96% on the Chinese medical licensing exam versus 75% for GPT-o1 Pro. However, on English reasoning benchmarks like MMLU and GPQA, the gap has narrowed - DeepSeek V3.2 and Qwen3.6 Plus now match or exceed GPT-5 on several tasks.
Mixture-of-Experts (MoE) activates only a fraction of total parameters per token, which dramatically reduces inference cost. DeepSeek V3.2 has 671B total parameters but only activates 37B per token. Qwen3.5-397B activates 17B. GLM-5.1 has 744B total with 40B active. This architectural choice lets Chinese labs build models that match or exceed the performance of dense US models while running at a fraction of the compute cost. DeepSeek pioneered Multi-head Latent Attention (MLA) which further compresses the KV cache.
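A back-of-the-envelope comparison shows why this matters for serving cost. Per-token inference FLOPs scale roughly with active (not total) parameters; the ~2-FLOPs-per-active-parameter rule of thumb below is a common approximation, not a figure from this page:

```python
def flops_per_token(active_params_billion):
    # Rough rule of thumb: ~2 FLOPs per active parameter per generated token
    return 2 * active_params_billion * 1e9

dense_equivalent = flops_per_token(671)  # hypothetical dense model of V3.2's total size
deepseek_v32 = flops_per_token(37)       # DeepSeek V3.2 activates 37B per token
print(f"compute ratio: {dense_equivalent / deepseek_v32:.1f}x")  # ~18x
```

In other words, at equal total parameter count, the MoE model does roughly one-eighteenth the arithmetic per token, which is where most of the pricing headroom comes from.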
Qwen3.6 Plus and Qwen3 Coder Flash both support 1M tokens (roughly 750,000 words). MiniMax-01 supports 1M+ tokens. Most DeepSeek models support 163K tokens. Zhipu AI GLM models range from 131K to 202K tokens. ByteDance Seed models support 262K tokens. For comparison, Claude supports up to 200K and GPT-5 supports 128K-1M depending on variant.
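The token-to-word conversion above uses the common ~0.75 words-per-token heuristic for English text. A small helper (the ratio is a rough assumption and varies by tokenizer and language; Chinese text tokenizes quite differently):

```python
def words_to_tokens(words, words_per_token=0.75):
    """Rough token-count estimate for English text."""
    return round(words / words_per_token)

print(words_to_tokens(750_000))  # 750K words -> 1,000,000 tokens
```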
Yes. Most Chinese labs publish open-weight models on Hugging Face and GitHub that you can run with standard tools (vLLM, llama.cpp, Ollama, transformers). DeepSeek R1 Distill Qwen 32B and Qwen3-32B are popular choices for local deployment. Smaller variants like Qwen3-8B, GLM-4.5 Air (12B active), and DeepSeek R1 Distill 7B run on consumer GPUs. Note that MoE reduces compute per token, not weight memory: a model like Qwen3.5-397B still needs all 397B parameters resident in VRAM (or offloaded to CPU RAM), even though only 17B are used for any given token.
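As a rough sizing guide for local deployment, weight memory scales with total parameter count and quantization level. A sketch (weights only; real usage also needs headroom for the KV cache and activations):

```python
def weight_memory_gb(params_billion, bits_per_param):
    """Approximate GB needed just to hold the model weights."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

print(f"Qwen3-32B @ BF16:   {weight_memory_gb(32, 16):.0f} GB")  # 64 GB
print(f"Qwen3-32B @ 4-bit:  {weight_memory_gb(32, 4):.0f} GB")   # 16 GB
print(f"7B distill @ 4-bit: {weight_memory_gb(7, 4):.1f} GB")    # 3.5 GB
```

This is why the 4-bit quantized 7B-32B variants are the practical choices for consumer GPUs, while the 100B+ flagships remain multi-GPU or CPU-offload territory.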