| 1 | Grok 4.20 Beta (fallback) | xAI | 99.3 | -- | -- | -- | 99.3 |
| 2 | Gemini 3.1 Pro Preview (fallback) | Google | 98.7 | -- | -- | -- | 98.7 |
| 3 | GPT-5.2 Chat (fallback) | OpenAI | 96.8 | -- | -- | -- | 96.8 |
| 4 | Grok 4.1 Fast (fallback) | xAI | 95.5 | -- | -- | -- | 95.5 |
| 5 | GPT-5.1 (fallback) | OpenAI | 92.7 | -- | -- | -- | 92.7 |
| 6 | Gemini 3 Flash Preview | Google | 92 | -- | 92 | -- | -- |
| 7 | Mistral Large | Mistral AI | 92 | -- | 92 | -- | -- |
| 8 | Qwen3.5 397B A17B (fallback) | Alibaba | 91.7 | -- | -- | -- | 91.7 |
| 9 | Claude Opus 4.1 (fallback) | Anthropic | 91.5 | -- | -- | -- | 91.5 |
| 10 | Grok 3 | xAI | 90.5 | -- | 90.5 | -- | -- |
| 11 | GPT-4o | OpenAI | 90.2 | -- | 90.2 | -- | -- |
| 12 | Claude Haiku 4.5 | Anthropic | 89.8 | -- | 89.8 | -- | -- |
| 13 | Gemini 3.1 Flash Lite Preview (fallback) | Google | 89.5 | -- | -- | -- | 89.5 |
| 14 | Llama 4 Maverick | Meta | 89.5 | -- | 89.5 | -- | -- |
| 15 | Gemini 2.0 Flash | Google | 89.4 | -- | 89.4 | -- | -- |
| 16 | Llama 3.3 70B Instruct | Meta | 88.4 | -- | 88.4 | -- | -- |
| 17 | Claude 3.5 Haiku | Anthropic | 88.1 | -- | 88.1 | -- | -- |
| 18 | GPT-5 Chat (fallback) | OpenAI | 87.8 | -- | -- | -- | 87.8 |
| 19 | GPT-5.4 | OpenAI | 87.5 | 80 | 97.5 | -- | -- |
| 20 | DeepSeek V3.2 Exp (fallback) | DeepSeek | 87.3 | -- | -- | -- | 87.3 |
| 21 | DeepSeek V3.2 (fallback) | DeepSeek | 87.2 | -- | -- | -- | 87.2 |
| 22 | GPT-4o-mini | OpenAI | 87.2 | -- | 87.2 | -- | -- |
| 23 | GPT-4 Turbo | OpenAI | 87.1 | -- | 87.1 | -- | -- |
| 24 | Claude Opus 4.5 | Anthropic | 87 | 80.9 | 95.2 | -- | -- |
| 25 | Grok 4 Fast (fallback) | xAI | 87 | -- | -- | -- | 87 |
| 26 | Qwen3.5-122B-A10B (fallback) | Alibaba | 86.5 | -- | -- | -- | 86.5 |
| 27 | DeepSeek V3.1 (fallback) | DeepSeek | 86.5 | -- | -- | -- | 86.5 |
| 28 | DeepSeek V3.1 Terminus (fallback) | DeepSeek | 86.2 | -- | -- | -- | 86.2 |
| 29 | GPT-5.2 | OpenAI | 86.1 | 78 | 97 | -- | -- |
| 30 | Qwen3 VL 235B A22B Instruct (fallback) | Alibaba | 86 | -- | -- | -- | 86 |
| 31 | Qwen3.5-27B (fallback) | Alibaba | 85 | -- | -- | -- | 85 |
| 32 | DeepSeek V3 0324 | DeepSeek | 84.5 | -- | 84.5 | -- | -- |
| 33 | GPT-5 | OpenAI | 84.2 | 75 | 96.5 | -- | -- |
| 34 | MiniMax M2.5 (fallback) | MiniMax | 84 | -- | -- | -- | 84 |
| 35 | Claude Opus 4.6 | Anthropic | 83.9 | 83.7 | 96 | 72.1 | -- |
| 36 | Qwen3 Next 80B A3B Instruct (fallback) | Alibaba | 83.7 | -- | -- | -- | 83.7 |
| 37 | LongCat Flash Chat (fallback) | Meituan | 83.5 | -- | -- | -- | 83.5 |
| 38 | Qwen3.5-Flash (fallback) | Alibaba | 83.3 | -- | -- | -- | 83.3 |
| 39 | Qwen3.5-35B-A3B (fallback) | Alibaba | 83 | -- | -- | -- | 83 |
| 40 | Qwen3 VL 235B A22B Thinking (fallback) | Alibaba | 82.7 | -- | -- | -- | 82.7 |
| 41 | Phi 4 | Microsoft | 82.6 | -- | 82.6 | -- | -- |
| 42 | DeepSeek V3 | DeepSeek | 82.6 | -- | 82.6 | -- | -- |
| 43 | Claude Opus 4 | Anthropic | 82.1 | 72.5 | 95 | -- | -- |
| 44 | GPT-5 Mini (fallback) | OpenAI | 81.8 | -- | -- | -- | 81.8 |
| 45 | Step 3.5 Flash (fallback) | StepFun | 81.5 | -- | -- | -- | 81.5 |
| 46 | Claude 3.7 Sonnet (thinking) (fallback) | Anthropic | 81.3 | -- | -- | -- | 81.3 |
| 47 | o3 | OpenAI | 81.1 | 69.1 | 97 | -- | -- |
| 48 | Grok 4 | xAI | 80.9 | 70 | 95.5 | -- | -- |
| 49 | Claude 3.7 Sonnet | Anthropic | 80.5 | 70.3 | 94 | -- | -- |
| 50 | Llama 3.1 70B Instruct | Meta | 80.5 | -- | 80.5 | -- | -- |
| 51 | o4 Mini | OpenAI | 79.6 | 68.1 | 95 | -- | -- |
| 52 | Claude Sonnet 4.5 | Anthropic | 79.4 | 68 | 94.5 | -- | -- |
| 53 | Claude Sonnet 4.6 | Anthropic | 78.9 | 74.6 | 95.2 | 68.4 | -- |
| 54 | Qwen3 Next 80B A3B Thinking (fallback) | Alibaba | 78.2 | -- | -- | -- | 78.2 |
| 55 | MiniMax M1 (fallback) | MiniMax | 77.8 | -- | -- | -- | 77.8 |
| 56 | o3 Mini High (fallback) | OpenAI | 77.3 | -- | -- | -- | 77.3 |
| 57 | Grok 3 Mini Beta (fallback) | xAI | 76.3 | -- | -- | -- | 76.3 |
| 58 | Claude Sonnet 4 | Anthropic | 75.9 | 62.5 | 93.8 | -- | -- |
| 59 | gpt-oss-120b (fallback) | OpenAI | 75.8 | -- | -- | -- | 75.8 |
| 60 | Command A (fallback) | Cohere | 75.7 | -- | -- | -- | 75.7 |
| 61 | MiniMax M2 (fallback) | MiniMax | 74.7 | -- | -- | -- | 74.7 |
| 62 | Qwen3 8B (fallback) | Alibaba | 74.7 | -- | -- | -- | 74.7 |
| 63 | GPT-4o (2024-05-13) (fallback) | OpenAI | 74.3 | -- | -- | -- | 74.3 |
| 64 | Llama 3.3 Nemotron Super 49B V1.5 (fallback) | NVIDIA | 73.7 | -- | -- | -- | 73.7 |
| 65 | GPT-5 Nano (fallback) | OpenAI | 73 | -- | -- | -- | 73 |
| 66 | Nova 2 Lite (fallback) | Amazon | 73 | -- | -- | -- | 73 |
| 67 | QwQ 32B (fallback) | Alibaba | 72.7 | -- | -- | -- | 72.7 |
| 68 | GPT-4o (2024-08-06) (fallback) | OpenAI | 72.5 | -- | -- | -- | 72.5 |
| 69 | Olmo 3.1 32B Instruct (fallback) | Allen AI | 71.8 | -- | -- | -- | 71.8 |
| 70 | GPT-4.1 | OpenAI | 70.4 | 54.6 | 91.5 | -- | -- |
| 71 | GPT-4o-mini (2024-07-18) (fallback) | OpenAI | 69.7 | -- | -- | -- | 69.7 |
| 72 | gpt-oss-20b (fallback) | OpenAI | 69.7 | -- | -- | -- | 69.7 |
| 73 | Gemma 2 27B | Google | 69.5 | -- | 69.5 | -- | -- |
| 74 | Claude 3.5 Sonnet | Anthropic | 69.2 | 50.8 | 93.7 | -- | -- |
| 75 | Mistral Large 2407 (fallback) | Mistral AI | 69 | -- | -- | -- | 69 |
| 76 | Mercury (fallback) | Inception | 68.2 | -- | -- | -- | 68.2 |
| 77 | Olmo 3 32B Think (fallback) | Allen AI | 67.8 | -- | -- | -- | 67.8 |
| 78 | o1 | OpenAI | 67.5 | 48.9 | 92.4 | -- | -- |
| 79 | Qwen2.5 72B Instruct (fallback) | Alibaba | 67.2 | -- | -- | -- | 67.2 |
| 80 | Llama 3.1 Nemotron 70B Instruct (fallback) | NVIDIA | 66.5 | -- | -- | -- | 66.5 |
| 81 | Olmo 3.1 32B Think (fallback) | Allen AI | 64.2 | -- | -- | -- | 64.2 |
| 82 | Gemini 2.5 Pro | Google | 63.8 | 63.8 | -- | -- | -- |
| 83 | Llama 3 70B Instruct (fallback) | Meta | 62.7 | -- | -- | -- | 62.7 |
| 84 | Gemini 2.5 Flash | Google | 62.6 | 42 | 90 | -- | -- |
| 85 | Qwen2.5 Coder 32B Instruct (fallback) | Alibaba | 61.8 | -- | -- | -- | 61.8 |
| 86 | Claude 3 Haiku (fallback) | Anthropic | 60.2 | -- | -- | -- | 60.2 |
| 87 | Command R (08-2024) (fallback) | Cohere | 58.3 | -- | -- | -- | 58.3 |
| 88 | R1 0528 | DeepSeek | 57.6 | 57.6 | -- | -- | -- |
| 89 | Llama 3 8B Instruct (fallback) | Meta | 53.8 | -- | -- | -- | 53.8 |
| 90 | Llama 3.1 8B Instruct (fallback) | Meta | 52 | -- | -- | -- | 52 |
| 91 | o3 Mini | OpenAI | 49.3 | 49.3 | -- | -- | -- |
| 92 | R1 | DeepSeek | 49.2 | 49.2 | -- | -- | -- |
| 93 | Llama 3.2 3B Instruct (fallback) | Meta | 44.5 | -- | -- | -- | 44.5 |
| 94 | Llama 3.2 1B Instruct (fallback) | Meta | 35.2 | -- | -- | -- | 35.2 |