Arena Leaderboards

Last updated: 4h ago

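This page targets search queries such as "LLM arena", "Arena Elo leaderboard", and "AI arena rankings". It collects the most representative head-to-head and rapid-evaluation leaderboards currently available.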

Arena Elo

Human preference rating from 6M+ crowdsourced blind head-to-head comparisons. Users chat with two anonymous models and pick the better response.

#	Model	Provider	Score
#1	Claude Fable 5	Anthropic	1508
#2	Claude Opus 4.6	Anthropic	1503
#3	Gemini 3.1 Pro Preview	Google	1494
#4	Gemini 3.1 Pro	Google	1494
#5	Claude Opus 4.7	Anthropic	1491
#6	Gemini 3 Pro	Google	1486
#7	GPT-5.4	OpenAI	1485
#8	GPT-5.2	OpenAI	1481
#9	Claude Opus 4.8	Anthropic	1479
#10	Gemini 3.5 Flash	Google	1477
#11	GPT-5.2 Chat	OpenAI	1476
#12	GPT-5.1	OpenAI	1475
#13	GPT-5.5	OpenAI	1475
#14	Gemini 3 Flash	Google	1474
#15	GLM 5.1 (open source)	Zhipu AI	1473
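
To make the score gaps in the table easier to read, here is a minimal sketch of the textbook Elo formulation. It is an illustration only: the page does not say how its scores are fitted, and the 400-point scale, the K-factor, and the function names `expected_score` and `elo_update` are assumptions, not the leaderboard's actual method.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A is preferred over B under the standard Elo logistic curve.

    The 400-point scale is the conventional Elo default (assumed here).
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a: float, rating_b: float, outcome_a: float, k: float = 4.0):
    """Return both ratings after one blind vote.

    outcome_a is 1.0 if A was picked, 0.0 if B was picked, 0.5 for a tie.
    k is an assumed step size; the source does not give one.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (outcome_a - expected_a)
    # Points gained by one model are lost by the other.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Scores of the #1 and #15 entries from the table above.
    top, fifteenth = 1508.0, 1473.0
    print(f"expected preference rate for #1 over #15: {expected_score(top, fifteenth):.3f}")
    # One upset vote (the lower-rated model is picked) nudges both ratings.
    print(elo_update(top, fifteenth, outcome_a=0.0))
```

Under these assumptions, the 35-point gap between #1 and #15 corresponds to an expected preference rate of roughly 0.55, so even the two ends of this top-15 list would be close to a coin flip in any single comparison.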

LiveBench

Comprehensive benchmark across 6 categories (math, coding, reasoning, data analysis, instruction following, language) using contamination-resistant, regularly updated questions.
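
As a rough illustration of how a single percentage could summarize six categories, the sketch below takes an unweighted mean of per-category scores. The aggregation rule is an assumption (the page does not state one), the helper `overall_score` is hypothetical, and the category values are made-up placeholders rather than leaderboard data.

```python
from statistics import mean

# Hypothetical per-category scores (percent) for one model.
# These are placeholders, not values from the leaderboard.
category_scores = {
    "math": 80.0,
    "coding": 70.0,
    "reasoning": 90.0,
    "data analysis": 60.0,
    "instruction following": 85.0,
    "language": 65.0,
}


def overall_score(scores: dict[str, float]) -> float:
    """Unweighted mean of the category scores (assumed aggregation rule)."""
    return mean(scores.values())


print(f"{overall_score(category_scores):.1f}%")  # prints 75.0%
```

With an unweighted mean, a weak category pulls the overall figure down as much as a strong one lifts it, so two models with the same overall score can have very different category profiles.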

#	Model	Provider	Score
#1	o4 Mini High	OpenAI	87.3%
#2	R1 0528	DeepSeek	84.4%
#3	Qwen3 235B A22B	Alibaba	80.4%
#4	Claude 3.5 Sonnet	Anthropic	80.0%