150 models rose, 137 fell, and 13 were unchanged; 300 models were tracked this week.
The 10 models with the largest absolute rank change over the past 7 days.
| Model | Provider | Score | 7-day rank change | Rank |
|---|---|---|---|---|
| Reka Edge | Reka | 77.2 | +208 | #93 |
| MiMo-V2-Pro | Xiaomi | 85.0 | +35 | #22 |
| Hunyuan A13B Instruct | Tencent | 72.1 | -29 | #142 |
| gpt-oss-120b (free) | OpenAI | 73.7 | +28 | #115 |
| GPT-5.2-Codex | OpenAI | 85.0 | +27 | #32 |
| MiniMax M2 | MiniMax | 72.7 | -26 | #131 |
| Gemini 2.5 Pro Preview 05-06 | Google | 82.5 | -25 | #58 |
| Gemini 2.5 Pro Preview 06-05 | Google | 84.1 | -24 | #48 |
| Qwen3 30B A3B | Alibaba | 71.3 | -24 | #147 |
| o4 Mini High | OpenAI | 85.0 | -23 | #44 |
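The models with the largest rank gains this week.
The models with the largest rank drops this week.
10 new models entered the rankings this week.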
| Model | Provider | Score | Rank |
|---|---|---|---|
| Reka Edge | Reka | 77.2 | #93 |
| MiMo-V2-Pro | Xiaomi | 85.0 | #22 |
| gpt-oss-120b (free) | OpenAI | 73.7 | #115 |
| GPT-5.2-Codex | OpenAI | 85.0 | #32 |
| Qwen3 30B A3B Instruct 2507 | Alibaba | 75.0 | #109 |
| Olmo 3.1 32B Instruct | Allen AI | 64.9 | #184 |
| o3 | OpenAI | 85.5 | #19 |
| gpt-oss-20b (free) | OpenAI | 73.7 | #116 |
| Qwen3 VL 30B A3B Thinking | Alibaba | 85.0 | #42 |
| GPT-5 Nano | OpenAI | 75.5 | #107 |
Models in a fragile state that may deteriorate further. Monitor them closely.
| Score | 7-day change | Model |
|---|---|---|
| 89.4 | +7 | |
| 85.7 | +8 | |
| 85.5 | +9 | |
| 85.5 | +20 | |
| 85.0 | +16 | |
| 85.0 | +35 | |
| 85.0 | -7 | |
| 85.0 | +7 | |
| 85.0 | +16 | |
| 85.0 | +7 | |
| 85.0 | -9 | |
| 85.0 | +27 | |
| 85.0 | +12 | |
| 85.0 | -16 | |
| 85.0 | -16 | |
| 85.0 | +7 | |
| 85.0 | +13 | |
| 85.0 | -16 | |
| 85.0 | +11 | |
| 85.0 | +18 | |
| 85.0 | +6 | |
| 85.0 | -23 | |
| 84.7 | +16 | |
| 84.6 | +7 | |
| 84.1 | +7 | |
| 84.1 | -24 | |
| 83.6 | -21 | |
| 83.6 | -12 | |
| 83.2 | -7 | |
| 83.0 | -13 | |
| 83.0 | -7 | |
| 82.7 | +7 | |
| 82.5 | -25 | |
| 82.4 | -23 | |
| 82.3 | +6 | |
| 82.2 | -13 | |
| 81.9 | -20 | |
| 81.9 | +8 | |
| 81.8 | +8 | |
| 81.3 | -13 | |
| 80.9 | +10 | |
| 80.9 | -9 | |
| 80.8 | +9 | |
| 80.5 | +12 | |
| 80.0 | -9 | |
| 79.4 | +6 | |
| 79.1 | +11 | |
| 79.0 | -6 | |
| 78.4 | -12 | |
| 78.3 | +7 | |
| 78.1 | -8 | |
| 77.8 | +10 | |
| 77.4 | -12 | |
| 77.3 | +9 | |
| 77.3 | +10 | |
| 77.2 | -8 | |
| 77.1 | -6 | |
| 77.0 | +6 | |
| 76.7 | -10 | |
| 76.6 | -18 | |
| 76.4 | +15 | |
| 76.0 | -11 | |
| 75.8 | +7 | |
| 75.6 | +9 | |
| 75.5 | +18 | |
| 75.0 | +22 | |
| 74.9 | -15 | |
| 74.9 | +7 | |
| 74.8 | -9 | |
| 73.7 | +28 | |
| 73.7 | +20 | |
| 73.6 | +15 | |
| 73.5 | +13 | |
| 73.5 | +16 | |
| 73.5 | +17 | |
| 73.2 | -19 | |
| 73.1 | -21 | |
| 72.8 | -19 | |
| 72.7 | +12 | |
| 72.7 | -26 | |
| 72.6 | +13 | |
| 72.6 | +15 | |
| 72.6 | +13 | |
| 72.5 | -19 | |
| 72.3 | -6 | |
| 72.2 | -20 | |
| 72.1 | -29 | |
| 71.9 | +8 | |
| 71.4 | +17 | |
| 71.4 | +6 | |
| 71.3 | -24 | |
| 71.2 | -19 | |
| 71.2 | +11 | |
| 71.1 | +15 | |
| 71.0 | +6 | |
| 70.1 | +9 | |
| 70.0 | -9 | |
| 70.0 | +10 | |
| 69.8 | -16 | |
| 69.7 | +12 | |
| 69.2 | -18 | |
| 68.8 | +10 | |
| 68.6 | -15 | |
| 68.4 | +14 | |
| 68.3 | -20 | |
| 68.2 | -12 | |
| 67.7 | -10 | |
| 67.7 | +14 | |
| 67.2 | +9 | |
| 66.9 | -15 | |
| 66.7 | +15 | |
| 66.0 | -11 | |
| 65.7 | -10 | |
| 65.5 | +18 | |
| 65.1 | +17 | |
| 64.9 | +22 | |
| 64.9 | +7 | |
| 64.9 | -6 | |
| 64.8 | +6 | |
| 64.8 | -6 | |
| 64.7 | -11 | |
| 64.6 | -8 | |
| 64.4 | -14 | |
| 64.1 | -6 | |
| 63.4 | +7 | |
| 63.4 | -10 | |
| 63.3 | +7 | |
| 63.2 | +9 | |
| 62.9 | -12 | |
| 62.7 | -6 | |
| 62.5 | -8 | |
| 62.5 | +6 | |
| 62.5 | +9 | |
| 62.4 | -13 | |
| 62.0 | +8 | |
| 61.4 | -17 | |
| 60.9 | +7 | |
| 60.8 | -13 | |
| 60.6 | +12 | |
| 60.4 | -10 | |
| 60.0 | -7 | |
| 59.9 | +6 | |
| 59.8 | +11 | |
| 59.0 | -10 | |
| 59.0 | +7 | |
| 58.9 | +9 | |
| 58.0 | -14 | |
| 58.0 | -10 | |
| 56.4 | +9 | |
| 55.5 | -7 | |
| 55.3 | -12 | |
| 55.1 | +6 | |
| 54.2 | +7 | |
| 54.1 | +6 | |
| 53.6 | +8 | |
| 53.2 | -12 | |
| 53.2 | -6 | |
| 52.3 | -14 | |
| 42.8 | +6 | |
| 39.9 | +6 |
The top 10 AI models by composite score this week.
| # | Model | Provider | Score | 7-day change |
|---|---|---|---|---|
| 1 | GPT-5.4 Pro | OpenAI | 94.0 | +4 |
| 2 | GPT-5.4 | OpenAI | 94.0 | -1 |
| 3 | GPT-5.4 Mini | OpenAI | 93.3 | +1 |
| 4 | GPT-5.2 Pro | OpenAI | 92.7 | +2 |
| 5 | GPT-5.2 | OpenAI | 92.7 | -3 |
| 6 | Claude Opus 4.6 | Anthropic | 92.1 | +4 |
| 7 | GPT-5 Pro | OpenAI | 91.9 | -4 |
| 8 | o3 Deep Research | OpenAI | 91.5 | -1 |
| 9 | Claude Opus 4.5 | Anthropic | 90.4 | +4 |
| 10 | GPT-5 | OpenAI | 90.0 | +1 |
| Metric | Value |
|---|---|
| Total models ranked | 300 |
| Average score | 68.0 |
| Top-10 changes | 10/10 |
| Most active provider | OpenAI |
The weekly report summarizes seven days of AI model ranking changes. It highlights the biggest gainers and losers by rank position, new models entering the leaderboard, models in a fragile watch-list state, and a snapshot of the current top 10. All data comes from hourly-updated composite scores across 290+ models.
Gainers and losers are ranked by their 7-day rank change, which measures how many positions a model moved up or down on the leaderboard over the past week. The models with the largest positive changes are the top gainers, and those with the largest negative changes are the top losers.
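As a rough illustration of the ranking described above, the sketch below sorts models by their 7-day rank change. The rows are copied from the top-movers table in this report; the `Mover` type and the function names are hypothetical and are not part of any published leaderboard API. It assumes a positive change means the model moved up.

```python
from typing import NamedTuple


class Mover(NamedTuple):
    model: str
    rank_change_7d: int  # assumed convention: positive = moved up


# Rows copied from the top-movers table in this report.
MOVERS = [
    Mover("Reka Edge", 208),
    Mover("MiMo-V2-Pro", 35),
    Mover("Hunyuan A13B Instruct", -29),
    Mover("gpt-oss-120b (free)", 28),
    Mover("GPT-5.2-Codex", 27),
    Mover("MiniMax M2", -26),
    Mover("Gemini 2.5 Pro Preview 05-06", -25),
    Mover("Gemini 2.5 Pro Preview 06-05", -24),
    Mover("Qwen3 30B A3B", -24),
    Mover("o4 Mini High", -23),
]


def top_gainers(rows, n=3):
    """Models with the largest positive rank change, biggest first."""
    gainers = [r for r in rows if r.rank_change_7d > 0]
    return sorted(gainers, key=lambda r: r.rank_change_7d, reverse=True)[:n]


def top_losers(rows, n=3):
    """Models with the largest negative rank change, biggest drop first."""
    losers = [r for r in rows if r.rank_change_7d < 0]
    return sorted(losers, key=lambda r: r.rank_change_7d)[:n]


def biggest_movers(rows, n=3):
    """Models with the largest absolute rank change, in either direction."""
    return sorted(rows, key=lambda r: abs(r.rank_change_7d), reverse=True)[:n]


if __name__ == "__main__":
    for label, fn in [("Gainers", top_gainers), ("Losers", top_losers)]:
        print(label)
        for row in fn(MOVERS):
            print(f"  {row.model}: {row.rank_change_7d:+d}")
```

Gainers and losers are filtered by sign before sorting so that a week with few upward moves never lists a falling model as a "gainer"; `biggest_movers` ignores the sign, which matches how the top-movers table mixes rises and drops.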
The watch list contains models currently in a "fragile" state, meaning their rankings are unstable and could degrade further. Monitoring these models is important if you rely on them in production, as they may experience significant quality or performance drops in the near term.