Live data from Artificial Analysis API

AI Model Benchmarks

Compare 510+ AI models across 12 benchmarks — Intelligence, Coding, Math, Science, and more. Data updated hourly.

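| # | Model | Speed | $/1M | AA Index | GPQA | MMLU-Pro | LiveCode | AIME | HLE |
|---|-------|-------|------|----------|------|----------|----------|------|-----|
| 1 | - | 63 t/s | $11.3 | 60.2 | 93.5% | - | - | - | 44.3% |
| 2 | - | 61 t/s | $11.3 | 58.9 | 93.2% | - | - | - | 43.0% |
| 3 | - | 64 t/s | $10.9 | 57.3 | 91.4% | - | - | - | 39.6% |
| 4 | - | 133 t/s | $4.5 | 57.2 | 94.1% | - | - | - | 44.7% |
| 5 | - | 80 t/s | $5.6 | 56.8 | 92.0% | - | - | - | 41.6% |
| 6 | - | 61 t/s | $11.3 | 56.7 | 92.6% | - | - | - | 40.6% |
| 7 | - | 40 t/s | $1.7 | 53.9 | 91.1% | - | - | - | 35.9% |
| 8 | - | 55 t/s | $1.5 | 53.8 | 86.6% | - | - | - | 33.8% |
| 9 | - | 80 t/s | $4.8 | 53.6 | 91.5% | - | - | - | 39.9% |
| 10 | - | 86 t/s | $1.6 | 53.2 | 90.1% | - | - | - | 35.0% |
| 11 | - | 49 t/s | $10.9 | 52.9 | 89.6% | - | - | - | 36.7% |
| 12 | - | - | $0.00 | 52.2 | 88.4% | - | - | - | 39.9% |
| 13 | - | 62 t/s | $10.9 | 51.8 | 88.5% | - | - | - | 31.2% |
| 14 | - | 36 t/s | $2.9 | 51.8 | 88.8% | - | - | - | 28.9% |
| 15 | - | 62 t/s | $6.6 | 51.7 | 87.5% | - | - | - | 30.0% |
| 16 | - | 31 t/s | $2.2 | 51.5 | 88.8% | - | - | - | 35.9% |
| 17 | - | 53 t/s | $2.1 | 51.4 | 86.8% | - | - | - | 28.0% |
| 18 | - | 69 t/s | $4.8 | 51.3 | 90.3% | 87.4% | 88.9% | 99.0% | 35.4% |
| 19 | - | 58 t/s | $11.3 | 50.8 | 91.0% | - | - | - | 31.0% |
| 20 | - | 52 t/s | $1.1 | 50.0 | 88.2% | - | - | - | 25.7% |
| 21 | - | 30 t/s | $2.2 | 49.8 | 90.5% | - | - | - | 33.5% |
| 22 | - | 76 t/s | $1.6 | 49.8 | 82.0% | - | - | - | 27.2% |
| 23 | - | 62 t/s | $10.9 | 49.7 | 86.6% | 89.5% | 87.1% | 91.3% | 28.4% |
| 24 | - | 51 t/s | $0.53 | 49.6 | 87.4% | - | - | - | 28.1% |
| 25 | - | 90 t/s | $3.0 | 49.3 | 91.1% | - | - | - | 32.2% |
| 26 | - | 67 t/s | $1.5 | 49.2 | 87.0% | - | - | - | 28.3% |
| 27 | MiMo-V2.5 (Xiaomi) | 88 t/s | $0.72 | 49.0 | 84.9% | - | - | - | 25.2% |
| 28 | - | 94 t/s | $4.8 | 49.0 | 89.9% | - | - | - | 33.5% |
| 29 | - | 178 t/s | $1.7 | 48.9 | 87.5% | - | - | - | 26.6% |
| 30 | - | 90 t/s | $3.0 | 48.5 | 88.5% | - | - | - | 30.0% |
| 31 | - | 125 t/s | $4.5 | 48.4 | 90.8% | 89.8% | 91.7% | 95.7% | 37.2% |
| 32 | - | 63 t/s | $5.6 | 47.9 | 87.1% | - | - | - | 28.9% |
| 33 | - | 118 t/s | $3.4 | 47.7 | 87.3% | 87.0% | 86.8% | 94.0% | 26.5% |
| 34 | - | - | $0.00 | 46.8 | 84.7% | - | - | - | 25.4% |
| 35 | - | 49 t/s | $1.1 | 46.8 | 87.9% | - | - | - | 29.4% |
| 36 | - | - | $4.8 | 46.6 | 86.4% | 85.9% | 89.4% | 96.7% | 24.9% |
| 37 | - | 71 t/s | $0.17 | 46.5 | 89.4% | - | - | - | 32.1% |
| 38 | - | 44 t/s | $10.9 | 46.5 | 84.0% | - | - | - | 18.6% |
| 39 | - | 197 t/s | $1.1 | 46.4 | 89.8% | 89.0% | 90.8% | 97.0% | 34.7% |
| 40 | - | - | $0.17 | 46.0 | 86.7% | - | - | - | 27.8% |
| 41 | - | 64 t/s | $1.4 | 45.8 | 84.2% | - | - | - | 21.6% |
| 42 | - | 52 t/s | $1.4 | 45.0 | 89.3% | - | - | - | 27.3% |
| 43 | - | 107 t/s | $0.80 | 44.9 | 85.5% | - | - | - | 20.4% |
| 44 | - | 171 t/s | $3.4 | 44.6 | 83.7% | 86.5% | 84.0% | 98.7% | 25.6% |
| 45 | - | 77 t/s | $3.4 | 44.6 | 85.4% | 87.1% | 84.6% | 94.3% | 26.5% |
| 46 | - | 49 t/s | $6.6 | 44.4 | 79.9% | - | - | - | 13.2% |
| 47 | - | 147 t/s | $0.46 | 44.0 | 81.7% | - | - | - | 26.5% |
| 48 | - | 114 t/s | $0.53 | 43.8 | 85.5% | - | - | - | 16.0% |
| 49 | - | 42 t/s | $2.1 | 43.8 | 83.9% | - | - | - | 25.6% |
| 50 | - | 183 t/s | $0.56 | 43.5 | 84.1% | - | - | - | 20.2% |
| 51 | - | 105 t/s | $0.00 | 43.4 | 82.8% | - | - | - | 19.9% |
| 52 | - | 183 t/s | $3.4 | 43.1 | 86.0% | 86.0% | 84.9% | 95.7% | 23.4% |
| 53 | - | 51 t/s | $10.9 | 43.1 | 81.0% | 88.9% | 73.8% | 62.7% | 12.9% |
| 54 | - | 47 t/s | $6.6 | 43.0 | 83.4% | 87.5% | 71.4% | 88.0% | 17.3% |
| 55 | - | 37 t/s | $1.7 | 42.9 | 78.8% | - | - | - | 18.2% |
| 56 | - | - | $0.00 | 42.9 | 80.9% | - | - | - | 15.8% |
| 57 | - | 52 t/s | $6.6 | 42.6 | 79.7% | - | - | - | 10.8% |
| 58 | - | 107 t/s | $1.0 | 42.1 | 85.9% | 85.6% | 89.4% | 95.0% | 25.1% |
| 59 | - | 90 t/s | $0.82 | 42.1 | 85.8% | - | - | - | 22.2% |
| 60 | - | 76 t/s | $3.4 | 42.0 | 84.2% | 86.7% | 70.3% | 91.7% | 23.5% |
| 61 | - | 37 t/s | $32.8 | 42.0 | 80.9% | 88.0% | 65.4% | 80.3% | 11.9% |
| 62 | - | 159 t/s | $0.00 | 41.9 | 86.7% | - | - | - | 25.5% |
| 63 | - | 89 t/s | $0.53 | 41.9 | 84.8% | - | - | - | 19.1% |
| 64 | - | - | $0.34 | 41.7 | 84.0% | 86.2% | 86.2% | 92.0% | 22.2% |
| 65 | - | 158 t/s | $1.1 | 41.6 | 85.7% | - | - | - | 23.4% |
| 66 | - | 137 t/s | $0.15 | 41.5 | 83.5% | - | - | - | 20.0% |
| 67 | - | 44 t/s | $8.5 | 41.5 | 87.7% | 86.6% | 81.9% | 92.7% | 23.9% |
| 68 | - | - | $4.5 | 41.3 | 88.7% | 89.5% | 85.7% | 86.7% | 27.6% |
| 69 | - | 86 t/s | $0.69 | 41.2 | 82.8% | 83.7% | 83.8% | 90.7% | 19.7% |
| 70 | - | 57 t/s | $11.3 | 40.9 | 76.8% | - | - | - | 12.6% |
| 71 | - | 114 t/s | $1.1 | 40.9 | 83.8% | 84.8% | 85.3% | 94.7% | 22.3% |
| 72 | o3-pro (OpenAI) | 26 t/s | $35.0 | 40.7 | 84.5% | - | - | - | - |
| 73 | - | 67 t/s | $1.6 | 40.6 | 66.6% | - | - | - | 7.2% |
| 74 | - | 52 t/s | $1.4 | 40.1 | 86.1% | - | - | - | 18.8% |
| 75 | - | 49 t/s | $2.4 | 39.8 | 86.1% | - | - | - | 26.2% |
| 76 | - | 87 t/s | $0.53 | 39.4 | 83.0% | 87.5% | 81.0% | 82.7% | 22.2% |
| 77 | - | 31 t/s | $2.2 | 39.3 | 71.7% | - | - | - | 7.7% |
| 78 | - | 35 t/s | $0.00 | 39.2 | 85.7% | - | - | - | 22.7% |
| 79 | - | 153 t/s | $3.0 | 39.2 | 74.8% | - | - | - | 12.8% |
| 80 | - | 72 t/s | $3.4 | 39.2 | 80.8% | 86.0% | 76.3% | 83.0% | 18.4% |
| 81 | - | 142 t/s | $0.15 | 39.2 | 84.6% | 84.3% | 86.8% | 96.3% | 21.1% |
| 82 | - | 40 t/s | $32.8 | 39.0 | 79.6% | 87.3% | 63.6% | 73.3% | 11.7% |
| 83 | - | 77 t/s | $0.69 | 38.9 | 80.3% | 82.8% | 69.2% | 85.0% | 14.6% |
| 84 | - | 50 t/s | $6.6 | 38.7 | 77.7% | 84.2% | 65.5% | 74.3% | 9.6% |
| 85 | - | 98 t/s | $0.28 | 38.6 | 85.3% | 85.4% | 82.2% | 89.3% | 17.6% |
| 86 | - | 56 t/s | $1.5 | 38.6 | 82.6% | - | - | - | 13.9% |
| 87 | - | 176 t/s | $0.69 | 38.6 | 81.3% | 82.0% | 83.6% | 91.7% | 16.9% |
| 88 | - | 164 t/s | $0.00 | 38.5 | 82.6% | - | - | - | 22.6% |
| 89 | o3 (OpenAI) | 95 t/s | $3.5 | 38.4 | 82.7% | 85.3% | 80.8% | 88.3% | 20.0% |
| 90 | - | 156 t/s | $0.46 | 38.1 | 76.1% | - | - | - | 14.7% |
| 91 | - | 154 t/s | $0.15 | 37.8 | 83.1% | - | - | - | 19.1% |
| 92 | - | 172 t/s | $1.7 | 37.7 | 82.3% | - | - | - | 17.1% |
| 93 | - | 59 t/s | $1.2 | 37.3 | 78.9% | - | - | - | 12.3% |
| 94 | - | 89 t/s | $0.83 | 37.2 | 84.2% | - | - | - | 13.2% |
| 95 | - | 139 t/s | $2.2 | 37.1 | 67.2% | 76.0% | 61.5% | 83.7% | 9.7% |
| 96 | - | 64 t/s | $1.4 | 37.1 | 82.9% | - | - | - | 13.6% |
| 97 | - | 45 t/s | $6.6 | 37.1 | 72.7% | 86.0% | 59.0% | 37.0% | 7.1% |
| 98 | - | 119 t/s | $0.69 | 37.1 | 84.5% | - | - | - | 19.7% |
| 99 | - | 72 t/s | $0.17 | 36.5 | 71.6% | - | - | - | 7.0% |
| 100 | JT-35B-Flash (China Mobile) | - | $0.00 | 36.1 | 82.9% | - | - | - | 6.1% |

A dash (-) marks a value not shown in the source.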
Showing top 100 of 510 models.
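
One way to read the $/1M and AA Index columns together is as a rough value-for-money ratio. The sketch below is only an illustration, not a metric published on this page: the `Row` class and `index_per_dollar` helper are hypothetical names, and the four rows are the ones whose model names appear in the listing above. Models listed at $0.00 are skipped, since a ratio against a zero price is undefined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    model: str
    price_per_1m: float  # USD, the "$/1M" column
    aa_index: float      # the "AA Index" column


# Values copied from the named rows in the listing (ranks 27, 72, 89, 100).
rows = [
    Row("MiMo-V2.5", 0.72, 49.0),
    Row("o3-pro", 35.0, 40.7),
    Row("o3", 3.5, 38.4),
    Row("JT-35B-Flash", 0.00, 36.1),
]


def index_per_dollar(row: Row) -> Optional[float]:
    """AA Index points per dollar of $/1M price; None if the price is zero."""
    if row.price_per_1m <= 0:
        return None
    return row.aa_index / row.price_per_1m


# Keep only priced models, then sort from best to worst ratio.
priced = [r for r in rows if index_per_dollar(r) is not None]
ranked = sorted(priced, key=index_per_dollar, reverse=True)

for r in ranked:
    print(f"{r.model}: {index_per_dollar(r):.1f} index points per $/1M")
```

A ratio like this favors cheap models heavily, so it is best treated as a tie-breaker between models with similar AA Index scores rather than as a ranking in its own right.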

Benchmark Guide

Intelligence

Composite score across math, science, and coding

GPQA

Graduate-level science Q&A (Diamond)

MMLU-Pro

Knowledge & reasoning across 14 subject areas

LiveCodeBench

Live coding benchmark with new problems

AIME 2025

American Invitational Mathematics Examination

MATH-500

Competition-level math problems

HLE

Humanity's Last Exam - hardest questions

Composite coding benchmark score

Composite math benchmark score

SciCode

Scientific coding problems

IFBench

Instruction following benchmark

TerminalBench

Terminal/CLI task completion
