GPT-4o mini vs Gemini 2.0 Flash (Feb '25)

Side-by-side comparison of pricing, 12 benchmarks, and generation speed.

OpenAI

GPT-4o mini

Input: $0.15/M
Output: $0.60/M
Speed: 67 tok/s
TTFT: 0.52s

Google

Gemini 2.0 Flash (Feb '25)

Input: $0.15/M
Output: $0.60/M
Speed: not listed
TTFT: not listed

Winner by Category

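Cheaper: Tie
Faster (tok/s): not determined (67 tok/s for GPT-4o mini; no speed listed for Gemini 2.0 Flash (Feb '25))
Lower Latency: not determined (0.52s TTFT for GPT-4o mini; no TTFT listed for Gemini 2.0 Flash (Feb '25))
Benchmarks (0-12): Gemini 2.0 Flash (Feb '25)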

Pricing Comparison

Metric                 GPT-4o mini    Gemini 2.0 Flash (Feb '25)
Input ($/M tokens)     $0.15          $0.15
Output ($/M tokens)    $0.60          $0.60

Cost for 1M input + 100K output tokens:
GPT-4o mini: $0.21
Gemini 2.0 Flash (Feb '25): $0.21
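
The $0.21 figure follows directly from the per-million-token prices: 1M input tokens at $0.15/M plus 100K output tokens at $0.60/M. A minimal sketch of that arithmetic is below; the function name `request_cost` and its parameters are illustrative and are not part of any provider's API.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the cost in dollars for a given token volume.

    Prices are expressed in dollars per one million tokens.
    """
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Both models list the same prices, so the result is the same for each.
cost = request_cost(
    input_tokens=1_000_000,
    output_tokens=100_000,
    input_price_per_m=0.15,
    output_price_per_m=0.60,
)
print(f"${cost:.2f}")  # $0.21
```

Because the prices are identical, any other input/output mix will also cost the same on both models.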

Speed Comparison

Output Speed (tokens/s, higher is better)
GPT-4o mini: 67 tok/s
Gemini 2.0 Flash (Feb '25): not listed
Time to First Token (seconds, lower is better)
GPT-4o mini: 0.52s
Gemini 2.0 Flash (Feb '25): not listed

Benchmark Comparison

Data from Artificial Analysis API — 12 benchmarks

Benchmark               GPT-4o mini    Gemini 2.0 Flash (Feb '25)
Intelligence Index      12.6           18.5
Coding Index            13.6 (only one value listed)
Math Index              14.7           21.7
GPQA Diamond            42.6%          62.3%
MMLU-Pro                64.8%          77.9%
LiveCodeBench           23.4%          33.4%
AIME 2025               14.7%          21.7%
MATH-500                78.9%          93.0%
Humanity's Last Exam    4.0%           5.3%
SciCode                 22.9%          33.3%
IFBench                 31.0%          40.2%
TerminalBench           3.8% (only one value listed)

GPT-4o mini: 0 wins
Gemini 2.0 Flash (Feb '25): 12 wins
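
A head-to-head tally like this can be reproduced by comparing scores benchmark by benchmark. The sketch below is one assumed way to do it, not the page's actual method: it uses only the ten benchmarks where both models have a listed score and leaves out Coding Index and TerminalBench, since the list shows a single value for each without saying which model it belongs to. The names `SCORES` and `tally_wins` are hypothetical.

```python
# Scores from the benchmark list, as (GPT-4o mini, Gemini 2.0 Flash (Feb '25)).
# Coding Index and TerminalBench are omitted: only one value is listed for each.
SCORES = {
    "Intelligence Index": (12.6, 18.5),
    "Math Index": (14.7, 21.7),
    "GPQA Diamond": (42.6, 62.3),
    "MMLU-Pro": (64.8, 77.9),
    "LiveCodeBench": (23.4, 33.4),
    "AIME 2025": (14.7, 21.7),
    "MATH-500": (78.9, 93.0),
    "Humanity's Last Exam": (4.0, 5.3),
    "SciCode": (22.9, 33.3),
    "IFBench": (31.0, 40.2),
}


def tally_wins(scores):
    """Count wins per model; a benchmark with a missing score is skipped."""
    wins = {"gpt_4o_mini": 0, "gemini_2_0_flash": 0, "tie": 0, "skipped": 0}
    for gpt_score, gemini_score in scores.values():
        if gpt_score is None or gemini_score is None:
            wins["skipped"] += 1  # cannot compare without both values
        elif gpt_score > gemini_score:
            wins["gpt_4o_mini"] += 1
        elif gemini_score > gpt_score:
            wins["gemini_2_0_flash"] += 1
        else:
            wins["tie"] += 1
    return wins


result = tally_wins(SCORES)
print(result)
```

On these ten rows, Gemini 2.0 Flash (Feb '25) has the higher score every time. Skipping rows with a missing value, rather than treating the gap as zero, keeps a missing measurement from being counted as a loss.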

Frequently Asked Questions

Which is cheaper, GPT-4o mini or Gemini 2.0 Flash (Feb '25)?

Neither. Both models are priced identically at $0.15 per million input tokens and $0.60 per million output tokens. See the pricing comparison above for a worked cost example.

Which model performs better on benchmarks?

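Gemini 2.0 Flash (Feb '25) is credited with 12 wins out of 12 benchmarks, against 0 for GPT-4o mini. On each of the 10 benchmarks where scores for both models are listed, Gemini 2.0 Flash (Feb '25) scores higher. See the detailed benchmark comparison above for per-category results.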

Which is faster for real-time applications?

GPT-4o mini generates output at 67 tok/s with a time to first token (TTFT) of 0.52s. No output speed or TTFT is listed for Gemini 2.0 Flash (Feb '25), so the two models cannot be compared on speed or latency from this data.
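
For real-time use, output speed and TTFT can be combined into a rough end-to-end estimate: the wait for the first token plus the time to stream the remaining output. The sketch below assumes that simple model (it ignores network overhead and variation in speed), and the function name `estimated_response_seconds` is hypothetical. A missing or zero measurement is treated as unavailable rather than as an infinitely fast or stalled model.

```python
from typing import Optional


def estimated_response_seconds(
    ttft_s: Optional[float],
    tokens_per_s: Optional[float],
    output_tokens: int,
) -> Optional[float]:
    """Estimate total response time as TTFT + output_tokens / speed.

    Returns None when either measurement is missing or zero, since a
    value of 0 here means "no data", not a real measurement.
    """
    if not ttft_s or not tokens_per_s:
        return None
    return ttft_s + output_tokens / tokens_per_s


# GPT-4o mini: 0.52s TTFT and 67 tok/s, for a 500-token reply.
gpt_4o_mini = estimated_response_seconds(0.52, 67, 500)
print(f"{gpt_4o_mini:.2f}s")  # 7.98s

# Gemini 2.0 Flash (Feb '25): no speed or TTFT is listed, so no estimate.
gemini_flash = estimated_response_seconds(None, None, 500)
print(gemini_flash)  # None
```

For short replies the TTFT term dominates; for long replies the tokens-per-second term does.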

When should I use GPT-4o mini vs Gemini 2.0 Flash (Feb '25)?

Choose based on your priorities. The two models are priced identically, so cost does not separate them. Gemini 2.0 Flash (Feb '25) has the stronger benchmark results. Speed figures are listed only for GPT-4o mini (67 tok/s, 0.52s TTFT), so for latency-sensitive apps the data here does not show which model is faster.