Live speed data from Artificial Analysis API

AI Model Speed Rankings

Compare 446+ AI models by response speed, latency, and throughput. Find the fastest models for your use case.

446 models, ranked by throughput
| # | Model | Throughput | TTFT | $/1M | $/Speed | Price×TTFT |
|---|-------|------------|------|------|---------|------------|
| 1 | Mercury 2 (Inception) | 907 t/s | 3.76s | $0.38 | $0.000 | 1409.3 |
| 2 | | 524 t/s | 8.68s | $0.11 | $0.000 | 928.7 |
| 3 | | 386 t/s | 7.20s | $0.09 | $0.000 | 612.3 |
| 4 | | 365 t/s | 3.39s | $0.17 | $0.000 | 592.6 |
| 5 | | 363 t/s | 549ms | $0.41 | $0.001 | 226.2 |
| 7 | | 328 t/s | 250ms | $0.17 | $0.001 | 43.8 |
| 8 | | 320 t/s | 451ms | $0.09 | $0.000 | 42.4 |
| 9 | | 318 t/s | 18.42s | $0.17 | $0.001 | 3223.3 |
| 10 | | 316 t/s | 433ms | $0.09 | $0.000 | 40.7 |
| 11 | | 305 t/s | 351ms | $0.06 | $0.000 | 21.4 |
| 12 | | 301 t/s | 345ms | $0.15 | $0.000 | 51.8 |
| 13 | | 299 t/s | 258ms | $0.10 | $0.000 | 25.8 |
| 14 | | 253 t/s | 503ms | $0.26 | $0.001 | 132.3 |
| 15 | | 252 t/s | 486ms | $0.26 | $0.001 | 127.8 |
| 16 | | 238 t/s | 10.94s | $3.0 | $0.013 | 32811.0 |
| 17 | | 229 t/s | 670ms | $0.85 | $0.004 | 569.5 |
| 18 | | 226 t/s | 12.02s | $0.85 | $0.004 | 10217.9 |
| 19 | | 224 t/s | 3.55s | $0.85 | $0.004 | 3014.1 |
| 20 | | 219 t/s | 3.82s | $1.7 | $0.008 | 6443.1 |
| 21 | | 216 t/s | 12.05s | $3.4 | $0.016 | 41421.0 |
| 22 | | 216 t/s | 7.25s | $0.56 | $0.003 | 4080.6 |
| 23 | LFM2 24B A2B (Liquid AI) | 213 t/s | 224ms | $0.05 | $0.000 | 11.6 |
| 24 | | 212 t/s | 404ms | $0.85 | $0.004 | 343.4 |
| 25 | | 211 t/s | 848ms | $0.19 | $0.001 | 159.4 |
| 26 | | 206 t/s | 10.32s | $0.85 | $0.004 | 8776.3 |
| 27 | | 205 t/s | 872ms | $0.40 | $0.002 | 347.1 |
| 28 | | 202 t/s | 476ms | $1.7 | $0.008 | 803.5 |
| 29 | | 200 t/s | 3.77s | $1.7 | $0.008 | 6372.2 |
| 30 | Nova Lite (Amazon) | 200 t/s | 399ms | $0.10 | $0.001 | 41.9 |
| 31 | | 196 t/s | 313ms | $0.15 | $0.001 | 46.9 |
| 32 | | 196 t/s | 390ms | $0.35 | $0.002 | 136.5 |
| 33 | | 195 t/s | 343ms | $0.00 | | |
| 34 | | 193 t/s | 474ms | $0.10 | $0.001 | 47.4 |
| 35 | | 192 t/s | 279ms | $0.15 | $0.001 | 41.9 |
| 36 | | 190 t/s | 1.23s | $1.1 | $0.006 | 1378.1 |
| 37 | | 189 t/s | 309ms | $3.0 | $0.016 | 927.0 |
| 38 | | 188 t/s | 5.81s | $0.69 | $0.004 | 3998.0 |
| 39 | | 187 t/s | 274ms | $0.25 | $0.001 | 68.5 |
| 40 | | 186 t/s | 2.79s | $0.53 | $0.003 | 1466.8 |
| 41 | | 184 t/s | 607ms | $0.25 | $0.001 | 151.8 |
| 42 | | 184 t/s | 327ms | $0.75 | $0.004 | 245.3 |
| 43 | | 184 t/s | 417ms | $0.30 | $0.002 | 125.1 |
| 44 | | 182 t/s | 414ms | $0.15 | $0.001 | 62.1 |
| 45 | | 180 t/s | 6.08s | $1.1 | $0.006 | 6841.1 |
| 46 | | 178 t/s | 2.94s | $0.46 | $0.003 | 1363.5 |
| 47 | | 178 t/s | 1.21s | $0.00 | | |
| 48 | | 177 t/s | 466ms | $0.46 | $0.003 | 215.8 |
| 49 | | 175 t/s | 564ms | $0.85 | $0.005 | 479.4 |
| 50 | | 174 t/s | 411ms | $1.5 | $0.009 | 616.5 |
| 51 | | 169 t/s | 2.65s | $0.46 | $0.003 | 1227.9 |
| 52 | | 165 t/s | 983ms | $0.60 | $0.004 | 589.8 |
| 53 | | 164 t/s | 41.55s | $0.14 | $0.001 | 5733.6 |
| 54 | | 163 t/s | 414ms | $0.15 | $0.001 | 62.1 |
| 55 | | 163 t/s | 408ms | $0.26 | $0.002 | 107.3 |
| 56 | | 163 t/s | 473ms | $3.4 | $0.021 | 1626.2 |
| 57 | | 159 t/s | 446ms | $0.17 | $0.001 | 78.0 |
| 58 | | 158 t/s | 775ms | $0.14 | $0.001 | 107.0 |
| 59 | | 154 t/s | 570ms | $0.09 | $0.001 | 49.0 |
| 60 | | 154 t/s | 14.17s | $3.4 | $0.022 | 48713.0 |
| 61 | | 154 t/s | 972ms | $0.88 | $0.006 | 850.5 |
| 62 | o3-mini (OpenAI) | 153 t/s | 7.38s | $1.9 | $0.013 | 14212.3 |
| 63 | | 151 t/s | 80.65s | $0.14 | $0.001 | 11129.1 |
| 64 | | 151 t/s | 1.01s | $1.9 | $0.012 | 1903.1 |
| 65 | | 149 t/s | 5.32s | $3.4 | $0.023 | 18273.0 |
| 66 | | 148 t/s | 997ms | $1.1 | $0.007 | 1096.7 |
| 67 | | 148 t/s | 657ms | $3.4 | $0.023 | 2258.8 |
| 68 | | 147 t/s | 23.74s | $1.9 | $0.013 | 45705.3 |
| 69 | | 146 t/s | 186ms | $0.00 | | |
| 70 | | 143 t/s | 26.75s | $1.9 | $0.013 | 51491.8 |
| 71 | | 142 t/s | 972ms | $0.75 | $0.005 | 729.0 |
| 72 | | 141 t/s | 1.05s | $0.31 | $0.002 | 327.0 |
| 73 | | 140 t/s | 879ms | $0.40 | $0.003 | 349.8 |
| 74 | | 140 t/s | 1.31s | $0.00 | | |
| 75 | | 139 t/s | 993ms | $0.19 | $0.001 | 186.7 |
| 76 | | 139 t/s | 203ms | $0.00 | | |
| 77 | | 139 t/s | 406ms | $0.80 | $0.006 | 324.8 |
| 78 | | 139 t/s | 9.86s | $2.0 | $0.014 | 19718.0 |
| 79 | | 139 t/s | 7.13s | $3.4 | $0.025 | 24509.5 |
| 80 | | 139 t/s | 8.22s | $0.28 | $0.002 | 2259.4 |
| 81 | | 138 t/s | 316ms | $0.28 | $0.002 | 86.9 |
| 82 | | 138 t/s | 1.00s | $0.10 | $0.001 | 105.1 |
| 83 | | 138 t/s | 526ms | $0.30 | $0.002 | 157.8 |
| 84 | | 135 t/s | 1.01s | $0.66 | $0.005 | 666.6 |
| 85 | | 134 t/s | 1.02s | $1.1 | $0.008 | 1127.5 |
| 86 | | 133 t/s | 459ms | $0.50 | $0.004 | 229.5 |
| 87 | Apertus 8B Instruct (Swiss AI Initiative) | 133 t/s | 1.84s | $0.13 | $0.001 | 230.6 |
| 88 | | 132 t/s | 229ms | $0.30 | $0.002 | 68.7 |
| 89 | | 131 t/s | 1.60s | $0.15 | $0.001 | 240.8 |
| 90 | | 131 t/s | 21.11s | $3.4 | $0.026 | 72562.4 |
| 91 | | 130 t/s | 1.31s | $0.15 | $0.001 | 196.6 |
| 92 | | 130 t/s | 992ms | $0.75 | $0.006 | 744.0 |
| 93 | | 129 t/s | 3.24s | $0.00 | | |
| 94 | | 129 t/s | 965ms | $0.35 | $0.003 | 337.8 |
| 95 | | 129 t/s | 451ms | $0.29 | $0.002 | 131.7 |
| 96 | | 127 t/s | 475ms | $0.49 | $0.004 | 231.3 |
| 97 | | 127 t/s | 246ms | $0.07 | $0.001 | 17.2 |
| 98 | | 125 t/s | 1.49s | $0.15 | $0.001 | 223.8 |
| 99 | | 123 t/s | 9.13s | $4.8 | $0.039 | 43918.6 |
| 100 | | 123 t/s | 846ms | $0.10 | $0.001 | 84.6 |
Showing top 100 of 446 models.
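
Throughput alone can hide a slow start: the rank 9 entry streams 318 t/s but waits 18.42s for its first token. One rough way to compare entries is to estimate total response time as TTFT plus output length divided by throughput. The sketch below assumes that simple model, which ignores network and queueing effects. The function name `estimate_response_seconds` and the 500-token response length are illustrative choices, not part of the source data.

```python
def estimate_response_seconds(ttft_s: float, throughput_tps: float, output_tokens: int) -> float:
    """Approximate wall-clock seconds to receive `output_tokens` tokens.

    Assumes a simple model: wait TTFT, then stream at a constant throughput.
    """
    if throughput_tps <= 0:
        raise ValueError("throughput must be positive")
    return ttft_s + output_tokens / throughput_tps


# Two entries from the ranking, for an assumed 500-token response:
# rank 9 (318 t/s, 18.42s TTFT) and rank 100 (123 t/s, 846ms TTFT).
fast_stream_slow_start = estimate_response_seconds(18.42, 318, 500)
slow_stream_fast_start = estimate_response_seconds(0.846, 123, 500)

print(f"rank 9:   {fast_stream_slow_start:.2f}s")
print(f"rank 100: {slow_stream_fast_start:.2f}s")
```

Under this assumption the lower-ranked entry finishes a 500-token response sooner, because its short TTFT outweighs its lower throughput.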

Speed Metrics Guide

Throughput (tokens/s)

Output generation speed in tokens per second. Higher is better.

Good: >50 t/s · Excellent: >100 t/s

Time to First Token (TTFT)

Delay between sending the request and receiving the first output token. Lower is better.

Good: <500ms · Excellent: <200ms
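
The thresholds for throughput and TTFT can be applied mechanically. The following is a minimal sketch, not the site's own code: the helper names (`parse_ttft_ms`, `rate_throughput`, `rate_ttft`) and the "below good" label for values that miss both thresholds are assumptions introduced here. The parser handles the two TTFT formats that appear in the ranking, such as `549ms` and `3.76s`.

```python
def parse_ttft_ms(text: str) -> float:
    """Convert a TTFT string such as '549ms' or '3.76s' to milliseconds."""
    text = text.strip()
    if text.endswith("ms"):
        return float(text[:-2])
    if text.endswith("s"):
        return float(text[:-1]) * 1000.0
    raise ValueError(f"unrecognized TTFT format: {text!r}")


def rate_throughput(tokens_per_s: float) -> str:
    """Good: >50 t/s, Excellent: >100 t/s (higher is better)."""
    if tokens_per_s > 100:
        return "excellent"
    if tokens_per_s > 50:
        return "good"
    return "below good"  # label is an assumption; the guide names no lower tier


def rate_ttft(ttft_ms: float) -> str:
    """Good: <500ms, Excellent: <200ms (lower is better)."""
    if ttft_ms < 200:
        return "excellent"
    if ttft_ms < 500:
        return "good"
    return "below good"  # label is an assumption; the guide names no lower tier


# Rank 1 in the ranking: 907 t/s throughput, 3.76s TTFT
print(rate_throughput(907), rate_ttft(parse_ttft_ms("3.76s")))
```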
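
Price/Performance

Cost-efficiency ratios that combine price with speed. Lower values indicate better value.

$/Speed: price per 1M tokens divided by throughput in t/s
Price×TTFT: price per 1M tokens multiplied by TTFT, so a slow first token is penalized; the listed values match TTFT taken in milliseconds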
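
The two ratios can be reproduced from the other columns. The formulas below are inferred from the listed values rather than documented by the source: $/Speed appears to be the $/1M price divided by throughput, and Price×TTFT appears to be the $/1M price multiplied by TTFT in milliseconds. The function names are illustrative. Because displayed prices are rounded, recomputed values for some entries may differ slightly from the listed ones.

```python
def price_per_speed(price_per_1m: float, throughput_tps: float) -> float:
    """$/Speed: price per 1M tokens divided by throughput (assumed formula)."""
    if throughput_tps <= 0:
        raise ValueError("throughput must be positive")
    return price_per_1m / throughput_tps


def price_times_ttft(price_per_1m: float, ttft_ms: float) -> float:
    """Price×TTFT: price per 1M tokens times TTFT in ms (assumed formula)."""
    return price_per_1m * ttft_ms


# Rank 100 in the ranking: 123 t/s, 846ms TTFT, $0.10 per 1M
print(round(price_per_speed(0.10, 123), 3))   # listed as $0.001
print(round(price_times_ttft(0.10, 846), 1))  # listed as 84.6
```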
