How much does the OpenAI GPT-5.4 API cost?

GPT-5.4 API pricing is $2.50 per million input tokens and $15.00 per million output tokens. Use our calculator at aiapicost.com for exact cost estimates based on your usage.

Which AI model is cheapest for API usage?

The cheapest AI API models change frequently. Use aiapicost.com to compare real-time pricing across 400+ models from OpenAI, Anthropic, Google, DeepSeek, and more. DeepSeek and open-source models typically offer the lowest per-token costs.

How do AI API token costs work?

AI APIs charge per token (roughly 0.75 words). Costs are split into input tokens (what you send) and output tokens (what the model generates). Output tokens are typically 2-5x more expensive. Prices are quoted per 1 million tokens.
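As a rough illustration of the per-token arithmetic described above, the sketch below converts word counts to tokens with the 0.75 words-per-token rule of thumb and prices a single request. The function names are hypothetical, the rates are the GPT-5.4 prices quoted earlier on this page, and real tokenizers will give somewhat different counts.

```python
def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a word count (rule of thumb, not a tokenizer)."""
    return round(word_count / words_per_token)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD, with both prices quoted per 1 million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical request: a 1,500-word prompt and a 300-word reply.
prompt_tokens = estimate_tokens(1_500)
reply_tokens = estimate_tokens(300)

# Rates assumed from the GPT-5.4 figures above: $2.50 input, $15.00 output per 1M.
cost = request_cost(prompt_tokens, reply_tokens, 2.50, 15.00)
print(f"{prompt_tokens} input + {reply_tokens} output tokens = ${cost:.4f}")
```

Even though the reply is a fifth of the prompt's length in this example, it accounts for more than half of the cost, which is why output-heavy workloads are the most sensitive to output pricing.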

Claude vs ChatGPT: which is better?

Both are top-tier models. Claude excels at coding and instruction-following, while GPT-5.4 offers broader multimodal capabilities. Compare them head-to-head at aiapicost.com/compare with real benchmark data.

Which performs better on benchmarks, gpt-oss-120b (high) or Gemini 3 Pro Preview (high)?

Gemini 3 Pro Preview (high) wins 10 out of 12 benchmarks vs 1 for gpt-oss-120b (high).

gpt-oss-120b (high) vs Gemini 3 Pro Preview (high)

Side-by-side comparison of pricing, 12 benchmarks, and generation speed.

Model	Provider	Input	Output	Speed	TTFT
gpt-oss-120b (high)	OpenAI	$0.15/M	$0.60/M	217 tok/s	0.54s
Gemini 3 Pro Preview (high)	Google	$2/M	$12/M	n/a	n/a

Winner by Category

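Cheaper: gpt-oss-120b (high)

Faster (tok/s): gpt-oss-120b (high) by default, since no throughput is reported for Gemini 3 Pro Preview (high)

Lower Latency: undetermined, since no TTFT is reported for Gemini 3 Pro Preview (high)

Benchmarks (10 wins to 1): Gemini 3 Pro Preview (high)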

Pricing Comparison

Metric	gpt-oss-120b (high)	Gemini 3 Pro Preview (high)
Input ($/M tokens)	$0.15	$2
Output ($/M tokens)	$0.6	$12

Cost for 1M input + 100K output tokens:

gpt-oss-120b (high): $0.21

Gemini 3 Pro Preview (high): $3.20

Speed Comparison

Metric	gpt-oss-120b (high)	Gemini 3 Pro Preview (high)
Output speed (tokens/s, higher is better)	217	n/a
Time to first token (seconds, lower is better)	0.54	n/a

Editorial Analysis

Verdict. Gemini 3 Pro Preview (high) takes the aggregate benchmark matchup 10–1 across the 11 categories that produce a winner (MATH-500 has no score for either model). Real workloads usually care about a handful of specific tasks, so check the per-benchmark results below.

Pricing. Both models sit in the budget / mid-tier bracket for output-token pricing. At 0.05× the per-million-output-token cost ($0.60 vs $12), gpt-oss-120b (high) is meaningfully cheaper if your traffic is output-heavy (long completions, document generation, agent loops). Gemini 3 Pro Preview (high) makes more sense when output volume is low and absolute reasoning quality justifies the premium.

Strengths. Both models post their highest scores on the same three benchmarks, and Gemini 3 Pro Preview (high) leads on each: Math Index (95.7 vs 93.4), AIME 2025 (95.7% vs 93.4%), and LiveCodeBench (91.7% vs 87.8%).

Speed. gpt-oss-120b (high) generates tokens at 217 tok/s with a time-to-first-token of 535ms. No throughput or time-to-first-token figure is reported for Gemini 3 Pro Preview (high), so neither metric can be compared directly. Time-to-first-token matters most for chat-style UIs.

Provider. OpenAI and Google sell to overlapping but distinct developer audiences. OpenAI's frontier reasoning models tend to carry premium positioning while Google often prices more aggressively, but this pairing runs the other way: the OpenAI model costs $0.15/$0.60 per million input/output tokens against $2/$12 for the Google model. Your existing vendor relationships, billing, and SLA preferences may matter as much as the raw numbers above.

Workload cost. For a baseline workload of one million requests totalling 30M input + 15M output tokens, gpt-oss-120b (high) costs $13.50 ($162/year if that volume recurs monthly) and Gemini 3 Pro Preview (high) costs $240.00 ($2,880/year). At a smaller 5M-input/2M-output scale (single-developer tool or prototype): gpt-oss-120b (high) ≈ $1.95/run, Gemini 3 Pro Preview (high) ≈ $34.00/run. At agent/realtime scale (200M input / 100M output per million requests): gpt-oss-120b (high) ≈ $90/run, Gemini 3 Pro Preview (high) ≈ $1,600/run. The price ratio stays at roughly 17 to 18× in every scenario, but the absolute gap grows with volume, so gpt-oss-120b (high) becomes more attractive the more you ship.
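The scenario figures above follow directly from the per-million-token prices in the pricing table. A minimal sketch of that arithmetic, with hypothetical scenario labels and the assumption that the yearly figure is simply twelve times one run:

```python
# USD per 1M tokens as (input, output), taken from the pricing table on this page.
PRICES = {
    "gpt-oss-120b (high)": (0.15, 0.60),
    "Gemini 3 Pro Preview (high)": (2.00, 12.00),
}

# Token volumes in millions as (input, output); the labels are illustrative.
SCENARIOS = {
    "prototype": (5, 2),
    "baseline": (30, 15),
    "agent/realtime": (200, 100),
}


def workload_cost(model: str, input_m: float, output_m: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    input_price, output_price = PRICES[model]
    return input_m * input_price + output_m * output_price


for scenario, (input_m, output_m) in SCENARIOS.items():
    for model in PRICES:
        cost = workload_cost(model, input_m, output_m)
        # Assumption: one run per month, so a year is 12 runs.
        print(f"{scenario:15} {model:28} ${cost:>9,.2f}/run  ${cost * 12:>10,.2f}/year")
```

Swapping in your own token volumes for `SCENARIOS` is usually more informative than the preset tiers, since the input/output mix drives the result as much as the total volume.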

Recommendation. Both models have legitimate use cases — the right answer depends on whether you are optimizing for benchmark ceiling, latency, or unit cost. Start with the cheaper / faster model, evaluate against your specific task, and only switch if the upgrade shows a meaningful lift.

Head-to-head deltas

Gemini 3 Pro Preview (high) wins 9 more benchmarks than its opponent — a margin wide enough to call the comparison settled on benchmark terms alone.
On throughput, gpt-oss-120b (high) measures 217 tok/s. No figure is reported for Gemini 3 Pro Preview (high), so no speed ratio can be computed. For streaming chat or real-time agents, throughput alone often flips the recommendation.
Time-to-first-token is 535ms for gpt-oss-120b (high), with no figure reported for Gemini 3 Pro Preview (high). For interactive chat UIs this can matter more than raw benchmark wins.
gpt-oss-120b (high) is 20× cheaper per million output tokens than Gemini 3 Pro Preview (high) ($0.60 vs $12). At scale, that ratio carries through to your monthly bill on output-heavy traffic, which makes it the deciding factor for most production teams.
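Aggregate benchmark score (sum across 12 categories, capped at 100) is listed as gpt-oss-120b (high) = 153, Gemini 3 Pro Preview (high) = 141, a gap within 15%. These totals exceed the stated cap and rank the models opposite to the per-benchmark results, where Gemini 3 Pro Preview (high) leads every benchmark that has scores for both models, so rely on the per-benchmark results instead.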

Benchmark Comparison

Data from Artificial Analysis API — 12 benchmarks

Benchmark	gpt-oss-120b (high)	Gemini 3 Pro Preview (high)
Intelligence Index	23.8	39.6
Coding Index	30.4	n/a
Math Index	93.4	95.7
GPQA Diamond	78.2%	90.8%
MMLU-Pro	80.8%	89.8%
LiveCodeBench	87.8%	91.7%
AIME 2025	93.4%	95.7%
MATH-500	n/a	n/a
Humanity's Last Exam	18.5%	37.2%
SciCode	38.9%	56.1%
IFBench	69.0%	70.4%
TerminalBench	23.5%	41.7%

Wins: gpt-oss-120b (high) 1, Gemini 3 Pro Preview (high) 10
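The win counts can be checked against the benchmark scores with a short tally. The sketch below is one reasonable way to count, not necessarily how the page computes its totals: it only awards a win when both models have a score, and lists benchmarks where just one model is scored separately. Under that assumption the result is 10 contested wins for Gemini 3 Pro Preview (high) and none for gpt-oss-120b (high), which suggests the page's single gpt-oss-120b (high) win is Coding Index, where Gemini has no score.

```python
from typing import List, Optional, Tuple

# (benchmark, gpt-oss-120b (high), Gemini 3 Pro Preview (high)); None = no score reported.
SCORES: List[Tuple[str, Optional[float], Optional[float]]] = [
    ("Intelligence Index", 23.8, 39.6),
    ("Coding Index", 30.4, None),
    ("Math Index", 93.4, 95.7),
    ("GPQA Diamond", 78.2, 90.8),
    ("MMLU-Pro", 80.8, 89.8),
    ("LiveCodeBench", 87.8, 91.7),
    ("AIME 2025", 93.4, 95.7),
    ("MATH-500", None, None),
    ("Humanity's Last Exam", 18.5, 37.2),
    ("SciCode", 38.9, 56.1),
    ("IFBench", 69.0, 70.4),
    ("TerminalBench", 23.5, 41.7),
]


def tally_wins(rows) -> Tuple[int, int, List[str]]:
    """Return (wins for model A, wins for model B, benchmarks scored by only one model)."""
    wins_a = wins_b = 0
    uncontested: List[str] = []
    for name, a, b in rows:
        if a is None and b is None:
            continue  # no data for either model, nothing to compare
        if a is None or b is None:
            uncontested.append(name)  # only one model has a score
            continue
        if a > b:
            wins_a += 1
        elif b > a:
            wins_b += 1
    return wins_a, wins_b, uncontested


wins_a, wins_b, uncontested = tally_wins(SCORES)
print(f"gpt-oss-120b (high): {wins_a}, Gemini 3 Pro Preview (high): {wins_b}, "
      f"uncontested: {uncontested}")
```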

Frequently Asked Questions

Which is cheaper, gpt-oss-120b (high) or Gemini 3 Pro Preview (high)?

gpt-oss-120b (high) is cheaper overall. Its blended price (3:1 input/output ratio) is $0.26/M tokens vs $4.50/M for Gemini 3 Pro Preview (high).
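The blended price is a weighted average of the input and output rates. A minimal sketch, assuming the 3:1 ratio means three input tokens for every output token (the function name is hypothetical):

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3, output_parts: float = 1) -> float:
    """Weighted-average USD price per 1M tokens for a given input:output token mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Prices per 1M tokens from the pricing table on this page.
gpt_oss = blended_price(0.15, 0.60)
gemini = blended_price(2.00, 12.00)
print(f"gpt-oss-120b (high): ${gpt_oss:.4f}/M, Gemini 3 Pro Preview (high): ${gemini:.4f}/M")
```

This works out to about $0.26/M and $4.50/M, matching the figures above. Changing `input_parts` and `output_parts` to reflect your own traffic mix shifts both numbers, and output-heavy mixes widen the gap.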

Which model performs better on benchmarks?

Gemini 3 Pro Preview (high) wins 10 out of 12 benchmarks compared to 1 for gpt-oss-120b (high). See the detailed benchmark chart above for per-category results.

Which is faster for real-time applications?

gpt-oss-120b (high) generates tokens at 217 tok/s with a time-to-first-token of 0.54s. No speed or time-to-first-token figures are reported for Gemini 3 Pro Preview (high), so a direct comparison is not possible from this data.

When should I use gpt-oss-120b (high) vs Gemini 3 Pro Preview (high)?

Choose based on your priorities: gpt-oss-120b (high) for lower cost and a measured 217 tok/s generation speed, Gemini 3 Pro Preview (high) for stronger benchmark performance. For latency-sensitive apps, note that TTFT is reported only for gpt-oss-120b (high), so test Gemini 3 Pro Preview (high) on your own workload before deciding.
