Kimi K2 vs Nova 2.0 Lite (medium)

Side-by-side comparison of pricing, 12 benchmarks, and generation speed.

Kimi K2 (Kimi)
Input: $0.585/M
Output: $2.4/M
Speed: 29 tok/s
TTFT: 1.45s

Nova 2.0 Lite (medium) (Amazon)
Input: $0.3/M
Output: $2.5/M
Speed: 162 tok/s
TTFT: 19.04s

Winner by Category

Cheaper: Nova 2.0 Lite (medium)
Faster (tok/s): Nova 2.0 Lite (medium)
Lower Latency: Kimi K2
Benchmarks (10 wins to 2): Nova 2.0 Lite (medium)

Pricing Comparison

Metric: Kimi K2 / Nova 2.0 Lite (medium)
Input ($/M tokens): $0.585 / $0.3
Output ($/M tokens): $2.4 / $2.5

Cost for 1M input + 100K output tokens:
Kimi K2: $0.82
Nova 2.0 Lite (medium): $0.55
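
The workload cost above is input tokens times the input price plus output tokens times the output price, with both prices quoted per million tokens. A minimal sketch of that arithmetic follows; the `PRICES` dictionary and the `workload_cost` helper are illustrative names, not part of any vendor API, and the prices are simply the ones listed on this page.

```python
# Prices in USD per million tokens, copied from the pricing figures on this page.
PRICES = {
    "Kimi K2": {"input": 0.585, "output": 2.4},
    "Nova 2.0 Lite (medium)": {"input": 0.3, "output": 2.5},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# The example workload from this page: 1M input tokens plus 100K output tokens.
for name in PRICES:
    print(f"{name}: ${workload_cost(name, 1_000_000, 100_000):.3f}")
```

For Kimi K2 this works out to 0.585 + 0.24 = $0.825, which the page shows as $0.82; for Nova 2.0 Lite (medium) it is 0.30 + 0.25 = $0.55. Because Kimi K2 has the lower output price, the gap narrows as a workload becomes more output-heavy.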

Speed Comparison

Output Speed (tokens/s), higher is better
Kimi K2: 29 tok/s
Nova 2.0 Lite (medium): 162 tok/s

Time to First Token (seconds), lower is better
Kimi K2: 1.45s
Nova 2.0 Lite (medium): 19.04s

Benchmark Comparison

Data from Artificial Analysis API (12 benchmarks)

Benchmark: Kimi K2 / Nova 2.0 Lite (medium)
Intelligence Index: 26.3 / 29.7
Coding Index: 22.1 / 23.9
Math Index: 57.0 / 88.7
GPQA Diamond: 76.6% / 76.8%
MMLU-Pro: 82.4% / 81.3%
LiveCodeBench: 55.6% / 66.3%
AIME 2025: 57.0% / 88.7%
MATH-500: 97.1% (only one score listed)
Humanity's Last Exam: 7.0% / 8.6%
SciCode: 34.5% / 36.8%
IFBench: 41.5% / 68.5%
TerminalBench: 15.9% / 17.4%

Wins: Kimi K2 2, Nova 2.0 Lite (medium) 10

Frequently Asked Questions

Which is cheaper, Kimi K2 or Nova 2.0 Lite (medium)?

Nova 2.0 Lite (medium) is cheaper overall. Its blended price (3:1 input/output ratio) is $0.85/M tokens vs $1.04/M for Kimi K2.
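
The blended price is a weighted average of the input and output prices. A short sketch, assuming the 3:1 ratio means three input tokens for every output token; `blended_price` is a hypothetical helper name used only for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price in USD per million tokens."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices in USD per million tokens, as listed on this page.
kimi = blended_price(0.585, 2.4)   # (3 * 0.585 + 2.4) / 4
nova = blended_price(0.3, 2.5)     # (3 * 0.3 + 2.5) / 4

print(f"Kimi K2: ${kimi:.4f}/M")
print(f"Nova 2.0 Lite (medium): ${nova:.4f}/M")
```

The unrounded values are 1.03875 and 0.85, matching the $1.04/M and $0.85/M quoted above. Changing the weights shows how sensitive the ranking is to the traffic mix: with output-only traffic (weights 0 and 1), Kimi K2 is the cheaper of the two at $2.4/M vs $2.5/M.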

Which model performs better on benchmarks?

Nova 2.0 Lite (medium) wins 10 out of 12 benchmarks compared to 2 for Kimi K2. See the detailed benchmark chart above for per-category results.

Which is faster for real-time applications?

Nova 2.0 Lite (medium) generates tokens faster, at 162 tok/s vs 29 tok/s. Kimi K2, however, has a much lower time-to-first-token (1.45s vs 19.04s), so it starts responding sooner.
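
Which model finishes a response first depends on how long the response is. The sketch below uses a deliberately simple model that is an assumption, not something the source measures: total time equals TTFT plus output tokens divided by output speed. The function names are illustrative.

```python
def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Estimated seconds to receive a full response (simplified model)."""
    return ttft_s + output_tokens / tokens_per_s


def break_even_tokens(ttft_a: float, speed_a: float,
                      ttft_b: float, speed_b: float) -> float:
    """Output length at which both models take the same total time.

    Solves ttft_a + n / speed_a == ttft_b + n / speed_b for n.
    Assumes the two speeds differ.
    """
    return (ttft_b - ttft_a) / (1 / speed_a - 1 / speed_b)


# Figures from this page: (TTFT in seconds, output speed in tok/s).
KIMI = (1.45, 29.0)
NOVA = (19.04, 162.0)

n = break_even_tokens(*KIMI, *NOVA)
print(f"Break-even output length: about {n:.0f} tokens")

for tokens in (100, 2000):
    print(tokens,
          f"Kimi K2 {response_time(*KIMI, tokens):.1f}s",
          f"Nova {response_time(*NOVA, tokens):.1f}s")
```

Under this model the break-even point is roughly 620 output tokens: shorter responses complete sooner on Kimi K2, longer ones on Nova 2.0 Lite (medium). Real latency also varies with load, prompt length, and provider, so treat the result as a rough guide.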

When should I use Kimi K2 vs Nova 2.0 Lite (medium)?

Choose based on your priorities. Nova 2.0 Lite (medium) leads on cost, benchmark performance, and generation speed. Kimi K2 leads on time-to-first-token (1.45s vs 19.04s), so it is the better fit for latency-sensitive apps where the response needs to start quickly.