How much does the OpenAI GPT-5.4 API cost?

GPT-5.4 API pricing is $2.50 per million input tokens and $15.00 per million output tokens. Use our calculator at aiapicost.com for exact cost estimates based on your usage.

Which AI model is cheapest for API usage?

The cheapest AI API models change frequently. Use aiapicost.com to compare real-time pricing across 400+ models from OpenAI, Anthropic, Google, DeepSeek, and more. DeepSeek and open-source models typically offer the lowest per-token costs.

How do AI API token costs work?

AI APIs charge per token (roughly 0.75 words). Costs are split into input tokens (what you send) and output tokens (what the model generates). Output tokens are typically 2-5x more expensive. Prices are quoted per 1 million tokens.
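The per-token billing described above can be sketched in a few lines. This is a minimal illustration, not any provider's billing code: the function name `request_cost` and the token counts are assumptions made for the example, while the rates are the GPT-5.4 prices quoted earlier on this page.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request, given prices per 1 million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Rates quoted above for GPT-5.4: $2.50/M input, $15.00/M output.
# The 2,000-input / 500-output token counts are illustrative assumptions.
cost = request_cost(2_000, 500, 2.50, 15.00)
print(f"${cost:.4f}")
```

Input and output prices are kept as separate parameters because the two rates differ, so a long prompt with a short answer and a short prompt with a long answer can cost very different amounts.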

Claude vs ChatGPT: which is better?

Both are top-tier models. Claude excels at coding and instruction-following, while GPT-5.4 offers broader multimodal capabilities. Compare them head-to-head at aiapicost.com/compare with real benchmark data.

Which performs better on benchmarks, Gemma 4 31B (Reasoning) or Gemini 3 Pro Preview (high)?

Gemini 3 Pro Preview (high) wins 9 of the 12 listed benchmarks and Gemma 4 31B (Reasoning) wins 2. The remaining benchmark, MATH-500, has no score for either model.

Gemma 4 31B (Reasoning) vs Gemini 3 Pro Preview (high)

Side-by-side comparison of pricing, 12 benchmarks, and generation speed.

Model	Provider	Input ($/M tokens)	Output ($/M tokens)	Speed (tok/s)	TTFT (s)
Gemma 4 31B (Reasoning)	Google	$0	$0	36	1.02
Gemini 3 Pro Preview (high)	Google	$2	$12	n/a	n/a

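Winner by Category

Category	Winner
Cheaper	Gemma 4 31B (Reasoning)
Faster (tok/s)	Not comparable (only Gemma 4 31B (Reasoning) has a measured speed, 36 tok/s)
Lower Latency	Not comparable (only Gemma 4 31B (Reasoning) has a measured TTFT, 1.02s)
Benchmarks (2-9)	Gemini 3 Pro Preview (high)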

Pricing Comparison

Metric	Gemma 4 31B (Reasoning)	Gemini 3 Pro Preview (high)
Input ($/M tokens)	$0	$2
Output ($/M tokens)	$0	$12

Cost for 1M input + 100K output tokens:

Gemma 4 31B (Reasoning): $0.00

Gemini 3 Pro Preview (high): $3.20 (1M × $2/M + 0.1M × $12/M)

Speed Comparison

Metric	Gemma 4 31B (Reasoning)	Gemini 3 Pro Preview (high)
Output speed (tok/s, higher is better)	36	n/a
Time to first token (s, lower is better)	1.02	n/a
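The two speed metrics combine into a rough end-to-end estimate: the wait for the first token plus the time to stream the rest. The sketch below assumes a constant generation rate and ignores network and queueing delays; the function name and the 500-token response length are illustrative assumptions, and only the Gemma 4 31B (Reasoning) figures come from this page.

```python
def estimated_response_seconds(ttft_s: float, tokens_per_s: float,
                               output_tokens: int) -> float:
    """Approximate total response time as TTFT plus streaming time."""
    if tokens_per_s <= 0:
        # A missing or zero throughput cannot be turned into an estimate.
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


# Gemma 4 31B (Reasoning): 1.02s TTFT, 36 tok/s (figures from this page).
# The 500-token response length is an assumption for illustration.
estimate = estimated_response_seconds(1.02, 36, 500)
print(f"{estimate:.1f}s")
```

The explicit check on `tokens_per_s` matters here because one of the two models has no speed data, and treating a missing value as zero would produce a meaningless result.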

Editorial Analysis

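Verdict. Gemini 3 Pro Preview (high) takes the aggregate benchmark matchup 9-2 across the 11 benchmarks that have a score for at least one model (the twelfth, MATH-500, has none). Only 6 benchmarks report a score for both models, and on those Gemini 3 Pro Preview (high) leads 5-1; the remaining wins come from benchmarks where the other model has no score. Real workloads usually care about a handful of specific tasks, so check the per-benchmark table below.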

Pricing. Gemma 4 31B (Reasoning) is listed at $0/M for both input and output tokens, while Gemini 3 Pro Preview (high) costs $2/M input and $12/M output. Input-token cost is a large share of the bill in many production workloads (retrieval-augmented prompts, code-context windows), so factor in both rates rather than the output price alone.

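Strengths. Gemma 4 31B (Reasoning) posts its best scores on GPQA Diamond (85.7%), IFBench (75.6%), and Coding Index (43.4). Of these, it beats Gemini 3 Pro Preview (high) only on IFBench (75.6% vs 70.4%); Gemini scores higher on GPQA Diamond (90.8%) and has no Coding Index score. Gemini 3 Pro Preview (high) is strongest on Math Index (95.7), AIME 2025 (95.7%), and LiveCodeBench (91.7%), none of which has a Gemma 4 31B (Reasoning) score.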

Speed. Gemma 4 31B (Reasoning) generates tokens at 36 tok/s with a time-to-first-token of 1.02s (1016ms). No throughput or time-to-first-token figure is listed for Gemini 3 Pro Preview (high), so the two models cannot be compared on speed from this data. Time-to-first-token matters most for chat-style UIs.

Provider. Both models come from the same vendor, so the choice comes down to which tier or generation fits your workload — not vendor lock-in.

Workload cost. Workload scenarios (per million requests at 30M input + 15M output tokens): Gemma 4 31B (Reasoning) costs $0.00 ($0/year); Gemini 3 Pro Preview (high) costs 30 × $2 + 15 × $12 = $240.00 ($2,880/year, which is 12 times the per-run figure). At a smaller 5M-input/2M-output scale (single-developer tool or prototype): Gemma 4 31B (Reasoning) ≈ $0.00/run, Gemini 3 Pro Preview (high) ≈ $34.00/run. At agent/realtime scale (200M input / 100M output per million requests): Gemma 4 31B (Reasoning) ≈ $0/run, Gemini 3 Pro Preview (high) ≈ $1,600/run. Because Gemma 4 31B (Reasoning) is listed at $0/M, the cost gap equals the full Gemini 3 Pro Preview (high) bill and grows linearly with token volume.
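The scenario figures above follow directly from the per-million rates. As a sketch of that arithmetic (the dictionary names and scenario labels are assumptions for this example; the prices and token volumes are the ones quoted on this page):

```python
# (input $/M, output $/M) as listed in the pricing table on this page.
PRICES = {
    "Gemma 4 31B (Reasoning)": (0.0, 0.0),
    "Gemini 3 Pro Preview (high)": (2.0, 12.0),
}

# (input tokens in millions, output tokens in millions) per scenario.
SCENARIOS = {
    "standard": (30, 15),
    "prototype": (5, 2),
    "agent/realtime": (200, 100),
}


def workload_cost(input_m: float, output_m: float,
                  input_price: float, output_price: float) -> float:
    """USD cost for a workload measured in millions of tokens."""
    return input_m * input_price + output_m * output_price


costs = {
    model: {name: workload_cost(i, o, *rates) for name, (i, o) in SCENARIOS.items()}
    for model, rates in PRICES.items()
}

for model, by_scenario in costs.items():
    for name, usd in by_scenario.items():
        print(f"{model:30s} {name:15s} ${usd:,.2f}")
```

Swapping in your own token volumes is usually more informative than the three preset scenarios, since the input-to-output ratio changes how much the higher output rate contributes.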

Recommendation. Both models have legitimate use cases — the right answer depends on whether you are optimizing for benchmark ceiling, latency, or unit cost. Start with the cheaper / faster model, evaluate against your specific task, and only switch if the upgrade shows a meaningful lift.

Head-to-head deltas

Gemini 3 Pro Preview (high) wins 7 more benchmarks than its opponent (9 vs 2), although 4 of its 9 wins are on benchmarks where Gemma 4 31B (Reasoning) has no score.
Throughput cannot be compared: Gemma 4 31B (Reasoning) measures 36 tok/s, and no figure is listed for Gemini 3 Pro Preview (high). For streaming chat or real-time agents, throughput alone often flips the recommendation.
Time-to-first-token cannot be compared either: Gemma 4 31B (Reasoning) responds in 1016ms, and no figure is listed for Gemini 3 Pro Preview (high). For interactive chat UIs this can matter more than raw benchmark wins.

Benchmark Comparison

Data from Artificial Analysis API, 12 benchmarks

Benchmark	Gemma 4 31B (Reasoning)	Gemini 3 Pro Preview (high)
Intelligence Index	29.4	39.6
Coding Index	43.4	n/a
Math Index	n/a	95.7
GPQA Diamond	85.7%	90.8%
MMLU-Pro	n/a	89.8%
LiveCodeBench	n/a	91.7%
AIME 2025	n/a	95.7%
MATH-500	n/a	n/a
Humanity's Last Exam	22.7%	37.2%
SciCode	43.4%	56.1%
IFBench	75.6%	70.4%
TerminalBench	36.4%	41.7%

Wins: Gemma 4 31B (Reasoning) 2, Gemini 3 Pro Preview (high) 9
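The win tally depends on how missing scores are handled. The sketch below reproduces the 2-9 count under the assumption that a model with a score beats a model without one, and also shows the tally when only benchmarks scored for both models are counted. That counting rule is inferred from the numbers on this page, not documented by the data source, and `count_wins` is a hypothetical helper.

```python
from typing import Optional

# (Gemma 4 31B (Reasoning), Gemini 3 Pro Preview (high)); None = no score listed.
SCORES: dict[str, tuple[Optional[float], Optional[float]]] = {
    "Intelligence Index": (29.4, 39.6),
    "Coding Index": (43.4, None),
    "Math Index": (None, 95.7),
    "GPQA Diamond": (85.7, 90.8),
    "MMLU-Pro": (None, 89.8),
    "LiveCodeBench": (None, 91.7),
    "AIME 2025": (None, 95.7),
    "MATH-500": (None, None),
    "Humanity's Last Exam": (22.7, 37.2),
    "SciCode": (43.4, 56.1),
    "IFBench": (75.6, 70.4),
    "TerminalBench": (36.4, 41.7),
}


def count_wins(scores, require_both: bool) -> tuple[int, int]:
    """Return (gemma_wins, gemini_wins).

    With require_both=False, a model that has a score wins any benchmark
    where the other model has none. With require_both=True, benchmarks
    missing either score are skipped.
    """
    gemma_wins = gemini_wins = 0
    for gemma, gemini in scores.values():
        if gemma is None and gemini is None:
            continue  # nothing to compare
        if gemma is None or gemini is None:
            if require_both:
                continue
            if gemma is None:
                gemini_wins += 1
            else:
                gemma_wins += 1
        elif gemma > gemini:
            gemma_wins += 1
        elif gemini > gemma:
            gemini_wins += 1
    return gemma_wins, gemini_wins


print(count_wins(SCORES, require_both=False))  # tally as reported on this page
print(count_wins(SCORES, require_both=True))   # only benchmarks scored for both
```

Gemini 3 Pro Preview (high) leads under either rule, but the stricter count shows how much of the headline margin rests on benchmarks that only one model was scored on.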

Frequently Asked Questions

Which is cheaper, Gemma 4 31B (Reasoning) or Gemini 3 Pro Preview (high)?

Gemma 4 31B (Reasoning) is cheaper overall. Its blended price (3:1 input/output ratio) is $0.00/M tokens vs $4.50/M for Gemini 3 Pro Preview (high).
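The blended price quoted above is a weighted average of the input and output rates. A minimal sketch, assuming the 3:1 ratio means three input tokens for every output token (the function name is an assumption for this example):

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted-average $/M tokens for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Gemini 3 Pro Preview (high): $2/M input, $12/M output.
print(f"${blended_price(2.0, 12.0):.2f}/M")
# Gemma 4 31B (Reasoning): $0/M input, $0/M output.
print(f"${blended_price(0.0, 0.0):.2f}/M")
```

A blended figure is convenient for ranking models, but it only matches your bill if your traffic has the same input-to-output ratio, so pass your own ratio where it differs.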

Which model performs better on benchmarks?

Gemini 3 Pro Preview (high) wins 9 of the 12 listed benchmarks compared to 2 for Gemma 4 31B (Reasoning); the remaining benchmark, MATH-500, has no score for either model. See the detailed benchmark chart above for per-category results.

Which is faster for real-time applications?

Gemma 4 31B (Reasoning) generates tokens at 36 tok/s with a time-to-first-token of 1.02s. No speed or time-to-first-token figure is listed for Gemini 3 Pro Preview (high), so a direct comparison is not possible from this data.

When should I use Gemma 4 31B (Reasoning) vs Gemini 3 Pro Preview (high)?

Choose based on your priorities: Gemma 4 31B (Reasoning) for lower cost, Gemini 3 Pro Preview (high) for stronger benchmark performance. For latency-sensitive apps, note that speed and TTFT figures are listed only for Gemma 4 31B (Reasoning), so test both models on your own workload before deciding.
