AI models ranked by multilingual performance using MMLU benchmark scores across languages. Find the best LLM for translation and non-English tasks.
Top model: GPT-5.4 (score: 95.4)
73.1 across all ranked models
114 models with benchmark data
Top Multilingual Models by Weighted Score
Top 15 models by weighted score
Benchmark Breakdown
Per-benchmark scores for top 10 models
Each model's score is a weighted average of its available benchmark results. When a model is missing some benchmarks, the weights are re-normalized across the benchmarks that are available. All scores are on a 0-100 scale. Data sourced from official model cards, published papers, and third-party evaluation platforms.
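The re-normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weighted_score`, the benchmark keys, and the 0.6/0.4 weights are hypothetical, since the page does not publish the actual weights.

```python
from typing import Mapping, Optional


def weighted_score(
    scores: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> Optional[float]:
    """Weighted average over the benchmarks a model actually has.

    Scores are assumed to be on a 0-100 scale already. Missing benchmarks
    (value None, or absent from `scores`) are dropped, and the remaining
    weights are re-normalized by dividing by their sum.
    """
    available = {
        name: value
        for name, value in scores.items()
        if value is not None and name in weights
    }
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        return None  # no usable benchmark data for this model
    weighted_sum = sum(weights[name] * value for name, value in available.items())
    return weighted_sum / total_weight


# Hypothetical weights, for illustration only (not the site's real values).
WEIGHTS = {"mmlu": 0.6, "arena_elo": 0.4}

# Model with both benchmarks: both weights apply.
full = weighted_score({"mmlu": 90.0, "arena_elo": 80.0}, WEIGHTS)

# Model missing Arena Elo: the MMLU weight is re-normalized to 1.0.
partial = weighted_score({"mmlu": 90.0, "arena_elo": None}, WEIGHTS)
```

With these made-up inputs, the first model scores about 86 and the second scores about 90, because its only available benchmark carries the full weight after re-normalization.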
Based on our benchmark analysis, GPT-5.4 by OpenAI is currently the #1 ranked model for multilingual tasks, with a weighted score of 95.4/100.
Models are ranked using a weighted average of MMLU and Arena Elo benchmark scores. All scores are normalized to a 0-100 scale.
We currently rank 114 models that have relevant benchmark data for multilingual tasks.