
gpt-oss-120B (high) vs Gemini 2.0 Flash (Feb '25)

Side-by-side comparison of pricing, 12 benchmarks, and generation speed.

OpenAI

gpt-oss-120B (high)

Input: $0.15/M
Output: $0.60/M
Speed: 208 tok/s
TTFT: 0.55s

Google

Gemini 2.0 Flash (Feb '25)

Input: $0.15/M
Output: $0.60/M
Speed: not listed
TTFT: not listed

Winner by Category

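Cheaper: Tie
Faster (tok/s): gpt-oss-120B (high), at 208 tok/s (no speed is listed for Gemini 2.0 Flash (Feb '25))
Lower Latency: cannot be determined (no TTFT is listed for Gemini 2.0 Flash (Feb '25); gpt-oss-120B (high) is at 0.55s)
Benchmarks (11-1): gpt-oss-120B (high)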

Pricing Comparison

Metric                 gpt-oss-120B (high)    Gemini 2.0 Flash (Feb '25)
Input ($/M tokens)     $0.15                  $0.15
Output ($/M tokens)    $0.60                  $0.60

Cost for 1M input + 100K output tokens:
gpt-oss-120B (high): $0.21
Gemini 2.0 Flash (Feb '25): $0.21
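
The $0.21 figure follows directly from the per-million-token prices: 1M input tokens at $0.15/M plus 100K output tokens at $0.60/M. A minimal Python sketch of that arithmetic is below; the function name `request_cost` and its parameters are illustrative and do not come from any provider's API.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the cost in dollars for a given token volume.

    Prices are in dollars per one million tokens, as listed on the page.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Both models list the same prices, so one call covers either of them.
cost = request_cost(
    input_tokens=1_000_000,
    output_tokens=100_000,
    input_price_per_m=0.15,
    output_price_per_m=0.60,
)
print(f"${cost:.2f}")
```

Because the prices are identical, the two models cost the same at any mix of input and output tokens, not only at this one.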

Speed Comparison

Output Speed (tokens/s; higher is better)
gpt-oss-120B (high): 208 tok/s
Gemini 2.0 Flash (Feb '25): not listed

Time to First Token (seconds; lower is better)
gpt-oss-120B (high): 0.55s
Gemini 2.0 Flash (Feb '25): not listed
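
To see how these two metrics combine, a rough estimate of total response time is TTFT plus output length divided by output speed. The Python sketch below assumes that simple model (a constant generation rate after the first token), which the page itself does not state. The function name `estimated_response_time` is illustrative. It returns `None` when a figure is missing rather than treating the gap as zero.

```python
def estimated_response_time(output_tokens, tokens_per_second, ttft_seconds):
    """Estimate seconds to receive a full response.

    Assumes a constant generation rate after the first token.
    Returns None if either metric is missing.
    """
    if tokens_per_second is None or ttft_seconds is None:
        return None
    return ttft_seconds + output_tokens / tokens_per_second


# gpt-oss-120B (high): 208 tok/s, 0.55s TTFT, for a 500-token reply.
gpt_oss = estimated_response_time(500, 208, 0.55)
print(f"gpt-oss-120B (high): {gpt_oss:.2f}s")

# Gemini 2.0 Flash (Feb '25): the page lists no speed or TTFT.
gemini = estimated_response_time(500, None, None)
print("Gemini 2.0 Flash (Feb '25):", "no data" if gemini is None else f"{gemini:.2f}s")
```

For short replies the TTFT term dominates, while for long replies the output speed matters more.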

Benchmark Comparison

Data from Artificial Analysis API — 12 benchmarks

Benchmark               gpt-oss-120B (high)    Gemini 2.0 Flash (Feb '25)
Intelligence Index      33.3                   18.5
Coding Index            28.6                   13.6
Math Index              93.4                   21.7
GPQA Diamond            78.2%                  62.3%
MMLU-Pro                80.8%                  77.9%
LiveCodeBench           87.8%                  33.4%
AIME 2025               93.4%                  21.7%
MATH-500                93.0% (only one value is listed; the model is not indicated)
Humanity's Last Exam    18.5%                  5.3%
SciCode                 38.9%                  33.3%
IFBench                 69.0%                  40.2%
TerminalBench           23.5%                  3.8%

Wins: gpt-oss-120B (high) 11, Gemini 2.0 Flash (Feb '25) 1
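
The win count can be rechecked from the scores themselves. The Python sketch below tallies only the rows where both models have a value and assumes higher is better on every benchmark. The MATH-500 row lists a single value without saying which model it belongs to, so it is skipped rather than guessed. The function name `tally_wins` and the data layout are illustrative.

```python
GPT_OSS = "gpt-oss-120B (high)"
GEMINI = "Gemini 2.0 Flash (Feb '25)"

# (gpt-oss score, Gemini score); None marks a row without a usable pair.
SCORES = {
    "Intelligence Index": (33.3, 18.5),
    "Coding Index": (28.6, 13.6),
    "Math Index": (93.4, 21.7),
    "GPQA Diamond": (78.2, 62.3),
    "MMLU-Pro": (80.8, 77.9),
    "LiveCodeBench": (87.8, 33.4),
    "AIME 2025": (93.4, 21.7),
    "MATH-500": None,  # only one value (93.0%) listed; model not indicated
    "Humanity's Last Exam": (18.5, 5.3),
    "SciCode": (38.9, 33.3),
    "IFBench": (69.0, 40.2),
    "TerminalBench": (23.5, 3.8),
}


def tally_wins(scores):
    """Count wins per model over paired rows; return (wins, skipped_rows)."""
    wins = {GPT_OSS: 0, GEMINI: 0}
    skipped = []
    for benchmark, pair in scores.items():
        if pair is None:
            skipped.append(benchmark)
            continue
        gpt_score, gemini_score = pair
        if gpt_score > gemini_score:
            wins[GPT_OSS] += 1
        elif gemini_score > gpt_score:
            wins[GEMINI] += 1
        # Exact ties count for neither model.
    return wins, skipped


wins, skipped = tally_wins(SCORES)
print(wins)
print("Skipped (incomplete data):", skipped)
```

On the paired rows, gpt-oss-120B (high) leads all 11. The page's 11-1 tally therefore presumably credits the unpaired MATH-500 row to Gemini 2.0 Flash (Feb '25), though the page does not say so explicitly.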

Frequently Asked Questions

Which is cheaper, gpt-oss-120B (high) or Gemini 2.0 Flash (Feb '25)?

Neither. The listed prices are identical: $0.15 per million input tokens and $0.60 per million output tokens for both models, so the cost is the same at any mix of input and output.

Which model performs better on benchmarks?

gpt-oss-120B (high) wins 11 out of 12 benchmarks compared to 1 for Gemini 2.0 Flash (Feb '25). See the detailed benchmark chart above for per-category results.

Which is faster for real-time applications?

gpt-oss-120B (high) generates 208 tok/s with a time-to-first-token of 0.55s. No output speed or TTFT is listed for Gemini 2.0 Flash (Feb '25), so the two models cannot be compared on speed or latency from this data.

When should I use gpt-oss-120B (high) vs Gemini 2.0 Flash (Feb '25)?

The two models are priced identically, so price does not separate them. gpt-oss-120B (high) leads on 11 of the 12 benchmarks and is the only one with listed speed figures (208 tok/s, 0.55s TTFT). For latency-sensitive apps, note that no TTFT is listed for Gemini 2.0 Flash (Feb '25), so the TTFT comparison above is incomplete.