Updated Jun 14, 2026

LMArena Leaderboard — June 2026

What the LMArena actually is, how to read an Arena Elo score, and the current top 10 for June 2026. LMArena is the original human-preference benchmark: it started as LMSys Chatbot Arena and now anchors most enterprise model selection conversations.

What is the LMArena, in one paragraph?

The LMArena is a public, blind side-by-side voting site for AI chat models. A user submits a prompt, two anonymous models reply, the user picks a winner, and the project aggregates millions of such votes into Elo ratings. It started in 2023 as the LMSys Chatbot Arena out of UC Berkeley and rebranded to LMArena.ai in 2024-25 as it spun out into an independent project. The current June 2026 top 10 is below. Several models now sit above the historical 1500 Elo barrier on text, and the open-weights tier is within striking distance of the closed-source frontier.
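
The vote-aggregation step described above can be sketched as an online Elo update. This is a minimal illustration, not the project's actual pipeline: the K-factor, the starting rating, and the model labels are all assumptions made for the example, and a production leaderboard may fit ratings with a different statistical procedure.

```python
from collections import defaultdict

K_FACTOR = 4.0        # assumed step size per vote (hypothetical value)
START_RATING = 1000.0  # assumed rating for a model with no votes yet


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(votes, k=K_FACTOR, start=START_RATING):
    """Replay (model_a, model_b, winner) votes in order and return ratings.

    `winner` is model_a, model_b, or any other value for a tie.
    """
    ratings = defaultdict(lambda: start)
    for model_a, model_b, winner in votes:
        expected_a = expected_score(ratings[model_a], ratings[model_b])
        if winner == model_a:
            actual_a = 1.0
        elif winner == model_b:
            actual_a = 0.0
        else:
            actual_a = 0.5
        delta = k * (actual_a - expected_a)
        ratings[model_a] += delta  # winner gains what the loser gives up
        ratings[model_b] -= delta
    return dict(ratings)


# Hypothetical votes between three anonymous models.
votes = [
    ("model-x", "model-y", "model-x"),
    ("model-y", "model-z", "model-y"),
    ("model-x", "model-z", "tie"),
]
ratings = update_ratings(votes)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Each vote moves the two ratings by equal and opposite amounts, so the total across models stays constant. An upset against a higher-rated model moves ratings more than an expected win does.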

How to read an Arena Elo score

Reference table for Arena Elo bands (June 2026)

  1510+   Frontier #1      Claude Opus 4.8 (AAII 61.4, coding & overall #1)
  1500    Frontier         Gemini 3.1 Pro, Claude Opus 4.7, GPT-5.5 Pro
  1450    Frontier-adj.    DeepSeek V4 Pro, Qwen 3.7 Max
  1400    Strong tier      GPT-4.1, Claude Sonnet 4, Gemini 2.5 Pro
  1300    Capable tier     Llama 4 Maverick, Mistral Large 3
  1200    Solid daily      Gemma 4, Phi-4, Mistral Small 3
  1100    Light tasks      DeepSeek V4 Flash, GPT-4o Mini
   <1100  Legacy tier      Older 2023-24 model generations

A 100-Elo gap means the higher-rated model wins ~64% of head-to-heads.
A 200-Elo gap means it wins ~76%. Rating shifts under 25 points are noise.
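
The win rates quoted above follow from the standard Elo expected-score formula, 1 / (1 + 10^(-gap/400)). A small sketch for checking any other gap (the function name is illustrative):

```python
def win_probability(elo_gap):
    """Expected head-to-head win rate for the model rated `elo_gap` points higher."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400.0))


for gap in (25, 50, 100, 200):
    print(f"{gap:>3}-point gap -> {win_probability(gap):.1%}")
```

For 100 and 200 points this gives roughly 64% and 76%, matching the figures above. A 25-point gap works out to about 54%, which is why shifts of that size are hard to tell apart from noise.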

Live Leaderboard

357 models
  #    Model                                                 Quality  Arena ELO  Speed    Price             Context  Value  Released
  1    Anthropic · Frontier agentic coding & knowledge work  100      1525       58 t/s   $10 / $50         1M       3.3    Jun 2026
  2    Anthropic · Coding, agents & computer use             99       1512       72 t/s   $5 / $25          1M       6.6    May 2026
  3    OpenAI · Reasoning at any cost                        98       1510       68 t/s   $30 / $180        1M       0.9    Apr 2026
  4    OpenAI · Frontier general purpose                     97       1506       70 t/s   $5 / $30          1M       5.5    Apr 2026
  5    OpenAI · Complex analysis                             97       -          -        $30 / $180        1M       0.9    Mar 2026
  6    OpenAI · Complex analysis                             97       -          -        $21 / $168        400K     1.0    Dec 2025
  7    Anthropic · Complex analysis                          97       -          -        $30 / $150        1M       1.1    May 2026
  8    Anthropic · Coding & agentic workflows                96       1505       68 t/s   $5 / $25          1M       6.4    Apr 2026
  9    OpenAI · Deep research                                96       -          -        $10 / $40         200K     3.8    Oct 2025
  10   OpenAI · Deep research                                96       -          -        $2 / $8           200K     19.2   Oct 2025
  11   OpenAI · Hard reasoning                               96       -          -        $20 / $80         200K     1.9    Jun 2025
  12   Google · Speed & cost                                 96       1505       -        $2 / $12          1M       13.7   Feb 2026
  13   Google · Science & long-context                       96       1505       131 t/s  $2 / $12          1M       13.7   Apr 2026
  14   Anthropic · General purpose                           95       1490       -        $5 / $25          1M       6.3    Feb 2026
  15   Anthropic · General purpose                           95       -          -        $5 / $25          200K     6.3    Nov 2025
  16   Anthropic · Complex analysis                          95       -          -        $30 / $150        1M       1.1    Apr 2026
  17   Google · Image generation                             94       -          -        $2 / $12          66K      13.4   Nov 2025
  18   Anthropic · Multimodal                                94       -          -        $15 / $75         200K     2.1    Aug 2025
  19   OpenAI · Hard reasoning                               94       1370       68 t/s   $10 / $40         200K     3.8    Apr 2025
  20   Alibaba Cloud · Long autonomous agentic runs          94       1488       90 t/s   $2.5 / $7.5       1M       18.8   May 2026
  21   xAI · Agentic tasks & real-time info                  93       1496       83 t/s   $1.25 / $2.5      1M       49.6   May 2026
  22   OpenAI · General purpose                              93       1495       -        $2.5 / $15        1M       10.6   Mar 2026
  23   OpenAI · General purpose                              93       -          -        $1.75 / $14       128K     11.8   Mar 2026
  24   OpenAI · Code generation                              93       -          -        $1.75 / $14       400K     11.8   Feb 2026
  25   OpenAI · Code generation                              93       -          -        $1.75 / $14       400K     11.8   Jan 2026
  26   OpenAI · General purpose                              93       -          -        $1.75 / $14       128K     11.8   Dec 2025
  27   OpenAI · General purpose                              93       -          -        $1.75 / $14       400K     11.8   Dec 2025
  28   OpenAI · Code generation                              93       -          -        $1.25 / $10       400K     16.5   Dec 2025
  29   OpenAI · General purpose                              93       -          -        $1.25 / $10       400K     16.5   Nov 2025
  30   OpenAI · General purpose                              93       -          -        $1.25 / $10       128K     16.5   Nov 2025
  31   OpenAI · Code generation                              93       -          -        $1.25 / $10       400K     16.5   Nov 2025
  32   OpenAI · Hard reasoning                               93       -          -        $150 / $600       200K     0.2    Mar 2025
  33   OpenAI · Complex analysis                             93       -          -        $30 / $60         8K       2.1    May 2023
  34   OpenAI · Multimodal                                   93       -          -        $30 / $60         8K       2.1    May 2023
  35   xAI · General purpose                                 93       1496       -        $1.25 / $2.5      2M       49.6   Mar 2026
  36   OpenAI · Complex analysis                             93       -          -        $8 / $15          272K     8.1    Apr 2026
  37   Moonshot AI · Frontier quality at low cost            92       1466       48 t/s   $0.73 / $3.49     256K     43.6   Apr 2026
  38   Google · Multimodal + value                           92       1345       87 t/s   $1.25 / $10       1M       16.4   Mar 2025
  39   Anthropic · Complex analysis                          91       1360       52 t/s   $15 / $75         200K     2.0    May 2025
  40   · Hard reasoning                                      91       -          -        $0.3 / $1.1       164K     130.0  Jul 2025
  41   Google · Speed & cost                                 91       -          -        $1.25 / $10       1M       16.2   Jun 2025
  42   DeepSeek · Hard reasoning                             91       -          -        $0.5 / $2.15      164K     68.7   May 2025
  43   Google · Speed & cost                                 91       -          -        $1.25 / $10       1M       16.2   May 2025
  44   DeepSeek · Hard reasoning                             91       -          -        $0.29 / $0.29     33K      313.8  Jan 2025
  45   DeepSeek · Hard reasoning                             91       -          -        $0.7 / $0.8       131K     121.3  Jan 2025
  46   DeepSeek · Hard reasoning                             91       -          -        $0.7 / $2.5       64K      56.9   Jan 2025
  47   Moonshot AI · Open-weight agentic coding              91       -          55 t/s   $0.73 / $3.49     256K     43.1   Jun 2026
  48   · Open-weight reasoning & tool use                    91       -          50 t/s   $0.2 / $0.8       262K     182.0  Jun 2026
  49   DeepSeek · Open-source value leader                   90       1467       33 t/s   $1.74 / $3.48     1M       34.5   Apr 2026
  50   Anthropic · Coding & balance                          90       1467       73 t/s   $3 / $15          1M       10.0   Feb 2026
  51   OpenAI · General purpose                              90       1455       -        $1.25 / $10       400K     16.0   Aug 2025
  52   xAI · General purpose                                 90       -          -        $3 / $15          131K     10.0   Apr 2025
  53   Alibaba Cloud · Open-source                           90       -          -        $1.04 / $6.24     262K     24.7   Apr 2026
  54   OpenAI · Long context                                 89       1310       120 t/s  $2 / $8           1M       17.8   Apr 2025
  55   Moonshot AI · Speed & cost                            89       1452       -        $0.4 / $1.9       262K     77.4   Jan 2026
  56   · Open-weight agentic coding                          89       1455       80 t/s   $0.6 / $2.4       1M       59.3   Jun 2026
  57   · Open-weight agentic coding (provisional)            89       -          -        $0.98 / $3.08     200K     43.8   Jun 2026
  58   · Open-weight agentic & tool use                      88       1467       48 t/s   $0.98 / $3.08     200K     43.3   Apr 2026
  59   OpenAI · Multimodal                                   88       -          -        $10 / $10         400K     8.8    Oct 2025
  60   OpenAI · Complex analysis                             88       -          -        $15 / $120        400K     1.3    Oct 2025
  61   Anthropic · General purpose                           88       -          -        $3 / $15          1M       9.8    Sep 2025
  62   OpenAI · General purpose                              88       -          -        $2.5 / $10        128K     14.1   Aug 2025
  63   OpenAI · Search + citations                           88       -          -        $2.5 / $10        128K     14.1   Mar 2025
  64   OpenAI · Hard reasoning                               88       -          -        $15 / $60         200K     2.3    Dec 2024
  65   OpenAI · General purpose                              88       -          -        $2.5 / $10        128K     14.1   Nov 2024
  66   OpenAI · General purpose                              88       -          -        $2.5 / $10        128K     14.1   May 2024
  67   OpenAI · Multimodal                                   88       -          -        $6 / $18          128K     7.3    May 2024
  68   OpenAI · General purpose                              88       -          -        $5 / $15          128K     8.8    May 2024
  69   OpenAI · Multimodal                                   88       -          -        $10 / $30         128K     4.4    Apr 2024
  70   OpenAI · Complex analysis                             88       -          -        $10 / $30         128K     4.4    Jan 2024
  71   OpenAI · Multimodal                                   88       -          -        $10 / $30         128K     4.4    Nov 2023
  72   · Open-source                                         88       1450       -        $0.6 / $1.92      80K      69.8   Feb 2026
  73   Anthropic · Coding & balance                          88       1320       95 t/s   $3 / $15          200K     9.8    May 2025
  74   OpenAI · Reasoning & math                             88       1305       155 t/s  $1.1 / $4.4       200K     32.0   Jan 2025
  75   xAI · Real-time info                                  87       1330       82 t/s   $3 / $15          131K     9.7    Feb 2025
  76   DeepSeek · Open-source                                87       1455       -        $0.252 / $0.378   164K     276.2  Dec 2025
  77   · Open-source                                         86       -          -        $0.135 / $0.5     131K     270.9  Dec 2025
  78   DeepSeek · Open-source                                86       -          -        $0.287 / $0.431   164K     239.6  Dec 2025
  79   DeepSeek · Open-source                                86       -          -        $0.27 / $0.41     164K     252.9  Sep 2025
  80   DeepSeek · Open-source                                86       -          -        $0.27 / $0.95     164K     141.0  Sep 2025
  81   DeepSeek · Open-source                                86       -          -        $0.21 / $0.79     33K      172.0  Aug 2025
  82   DeepSeek · Open-source                                86       -          -        $0.2 / $0.77      164K     177.3  Mar 2025
  83   Anthropic · General purpose                           86       -          -        $3 / $15          200K     9.6    Feb 2025
  84   Anthropic · Hard reasoning                            86       -          -        $3 / $15          200K     9.6    Feb 2025
  85   DeepSeek · Best open-source value                     86       1310       62 t/s   $0.27 / $1.1      128K     125.5  Mar 2025
  86   Alibaba Cloud · Multilingual & APAC                   86       1448       124 t/s  $1.4 / $5.6       256K     24.6   Apr 2026
  87   OpenAI · General purpose                              85       1285       109 t/s  $2.5 / $10        128K     13.6   May 2024
  88   Mistral AI · Open-source                              85       -          -        $0.5 / $1.5       262K     85.0   Dec 2025
  89   Mistral AI · Open-source                              85       -          -        $2 / $6           131K     21.3   Nov 2024
  90   Mistral AI · Open-source                              85       -          -        $2 / $6           128K     21.3   Feb 2024
  91   Google · Speed & cost                                 84       -          -        $1.5 / $9         1M       16.0   May 2026
  92   · Accessible open-weight agentics                     84       -          110 t/s  $0.05 / $0.2      262K     672.0  Jun 2026
  93   OpenAI · Speed & cost                                 83       -          -        $0.75 / $4.5      400K     31.6   Mar 2026
  94   OpenAI · Speed & cost                                 83       -          -        $0.25 / $2        400K     73.8   Aug 2025
  95   Alibaba Cloud · Open-source                           82       -          -        $0.04 / $0.15     256K     863.2  Mar 2026
  96   Alibaba Cloud · Open-source                           82       -          -        $0.139 / $1       262K     144.0  Feb 2026
  97   Alibaba Cloud · Open-source                           82       -          -        $0.195 / $1.56    262K     93.4   Feb 2026
  98   Alibaba Cloud · Open-source                           82       -          -        $0.26 / $2.08     262K     70.1   Feb 2026
  99   Alibaba Cloud · Speed & cost                          82       -          -        $0.065 / $0.26    1M       504.6  Feb 2026
  100  Alibaba Cloud · Open-source                           82       -          -        $0.26 / $1.56     1M       90.1   Feb 2026
Page 1 of 4 · 1-100 of 357
Quality = composite benchmark (MMLU, HumanEval, MATH)
Arena ELO = LMSYS Chatbot Arena rating
Value = quality per dollar
Price = input / output per 1M tokens
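
The legend defines Value only as "quality per dollar". The listed figures appear consistent with dividing Quality by the simple mean of the input and output price, but that formula is an inference from the rows, not something the page states. A sketch under that assumption, checked against the first three leaderboard rows:

```python
def value_score(quality, input_price, output_price):
    """Quality per dollar, assuming a blended price = mean of input and output.

    The blending rule is an assumption; the page does not publish its formula.
    Prices are USD per 1M tokens.
    """
    blended_price = (input_price + output_price) / 2.0
    if blended_price <= 0:
        raise ValueError("prices must be positive")
    return quality / blended_price


# (quality, input price, output price, listed value) for ranks 1-3.
rows = [
    (100, 10.0, 50.0, 3.3),
    (99, 5.0, 25.0, 6.6),
    (98, 30.0, 180.0, 0.9),
]
for quality, inp, out, listed in rows:
    computed = round(value_score(quality, inp, out), 1)
    print(f"quality={quality} price=${inp}/${out} computed={computed} listed={listed}")
```

Because the score divides by price, cheap models with middling quality dominate the Value column, so it is best read alongside Quality and Arena ELO, not instead of them.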

What to do this quarter

  1. Treat Arena Elo as a triage filter, not a decision. Use it to drop the bottom half of your candidate list, then run a real eval on the remainder.
  2. Pick the right Arena board. Coding teams should read the coding Arena (Claude Opus 4.8 now leads at ~1582 Elo, ahead of Opus 4.7 at 1567). Long-context teams should read the hard-prompts Arena. The aggregate text leaderboard is the wrong signal for many enterprise workloads.
  3. Discount short-conversation polish. The Arena rewards style. Models tuned for chat win at the margin against models tuned for accuracy. Build internal evals that reward what your business actually pays for.
  4. Watch the gap, not the ranking. Sub-25 Elo shifts are within statistical noise. Anything under 50 Elo between two candidates is close to a coin flip on most workloads: a 50-point gap corresponds to only a ~57% expected win rate.
  5. Plan for the multi-way race. Claude Opus 4.8 holds a narrow lead, but Gemini 3.1 Pro, Claude Opus 4.7, GPT-5.5 Pro, Qwen 3.7 Max, and DeepSeek V4 Pro are approximately interchangeable on quality at the top. Optimise your stack for switching cost, not for capability.
  6. Capture vote-rate momentum. The fastest-rising models week-over-week are usually the next month's leaders. Subscribe to weekly Arena reports.
  7. Pair Arena Elo with cost. A 50-Elo lead at 10x the price is rarely a good trade. See our model leaderboard for combined quality-cost rankings.
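
Steps 1, 4, and 7 can be combined into a small triage helper. This is a rough sketch, not a recommended policy: the `row-N` labels are hypothetical stand-ins for leaderboard ranks, the Elo and price figures are taken from those rows, and the thresholds (keep the top half, a 50-point tie band, a 10x price ratio) simply restate the rules of thumb above. The blended price as a simple mean of input and output price is also an assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    label: str           # hypothetical label for a leaderboard row
    elo: int             # Arena Elo
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens

    @property
    def blended_price(self) -> float:
        # Assumption: simple mean of input and output price.
        return (self.input_price + self.output_price) / 2.0


def triage(candidates, keep_fraction=0.5):
    """Step 1: keep only the top share of candidates by Elo."""
    ranked = sorted(candidates, key=lambda c: c.elo, reverse=True)
    keep = max(1, math.ceil(len(ranked) * keep_fraction))
    return ranked[:keep]


def compare(a, b, tie_band=50, price_ratio_limit=10.0):
    """Steps 4 and 7: treat small gaps as ties and flag large price premiums."""
    leader, trailer = (a, b) if a.elo >= b.elo else (b, a)
    gap = leader.elo - trailer.elo
    if gap < tie_band:
        return "coin flip: decide on internal evals and cost"
    if leader.blended_price >= price_ratio_limit * trailer.blended_price:
        return f"{leader.label} leads, but the price premium needs justifying"
    return f"prefer {leader.label}, pending a real eval"


candidates = [
    Candidate("row-1", 1525, 10.0, 50.0),
    Candidate("row-2", 1512, 5.0, 25.0),
    Candidate("row-21", 1496, 1.25, 2.5),
    Candidate("row-54", 1310, 2.0, 8.0),
]
shortlist = triage(candidates)
verdict = compare(shortlist[0], shortlist[1])
print([c.label for c in shortlist], verdict)
```

The helper only narrows the list. The 13-point gap between the two survivors falls inside the tie band, so the choice between them still comes down to an internal eval and cost.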

Related reading

For teams running multiple top-of-Arena models in production, Swfte Connect provides a single OpenAI-compatible endpoint that routes across providers and normalises Arena-tier quality without re-architecting your stack.