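AI Model Leaderboards Overview
Entry point to all live leaderboards: the main leaderboard covers 345 models, alongside 8 specialty leaderboards and 12 tool leaderboards, with data refreshed hourly.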
Main Leaderboard
Overall ranking of all 345 models
Specialty Leaderboards
Coding
SWE-bench, HumanEval and BigCodeBench weighted ranking
Math
MATH-500, GSM8K and AIME 2024 composite
Reasoning
GPQA Diamond and multi-step logic benchmarks
Writing
Long-form quality and instruction adherence
Instruction Following
IFEval-driven strictness scores
Data Analysis
Tabular reasoning and code-interpreter tasks
Roleplay
Character consistency and creative dialogue
Multilingual
Cross-language benchmarks beyond English
Tool Leaderboards
Open-Source LLM Leaderboard
Standalone ranking of open-weight models