Contents
- OpenAI Pricing Overview
- GPT-5 Series Pricing
- GPT-4 Series Pricing
- Reasoning Models (o3, o4)
- OpenAI API Pricing 2026: Pricing Breakdown Table
- Cost Per Task
- Model Selection Guide
- Throughput Considerations
- FAQ
- Throughput and Latency Implications
- API Rate Limits by Model
- Batch API: 50% Cheaper
- Hybrid Approach: Multi-Model Strategy
- Cost Trends: Will GPT-5 Pricing Drop?
- Common Pricing Mistakes
- Related Resources
- Sources
OpenAI Pricing Overview
OpenAI's March 2026 pricing spans 15 active models across three product lines: the GPT-5 series (Nano through Pro), the GPT-4 series (legacy), and the reasoning models (o3/o4). Prices range from $0.05 per million prompt tokens (GPT-5 Nano) to $15 per million (GPT-5 Pro).
The decision matrix is tight now. Three models compete directly: GPT-5 ($1.25/$10 per M tokens), GPT-4.1 ($2/$8), and o3 ($2/$8). GPT-5 is cheaper and faster. o3 is slower but better at reasoning. GPT-4.1 is the legacy default.
This guide prices every model in production as of March 21, 2026, and breaks down cost-per-task for real workloads.
GPT-5 Series Pricing
The GPT-5 family has five tiers, each optimized for different workloads.
GPT-5.4: High-Context, Balanced
| Metric | Value |
|---|---|
| Context Window | 272K tokens |
| Prompt Price | $2.50/M |
| Completion Price | $15/M |
| Throughput | 45 tok/s |
| Max Output | 128K |
GPT-5.4 is OpenAI's premium model. 272K context (roughly 200,000 words at the usual ~0.75 words per token). Designed for complex reasoning over large documents or code repositories.
Use cases: code review on a full codebase, long document analysis, multi-page contract review. The high completion cost ($15/M) makes it uneconomical for high-volume tasks.
Cost per task: Analyzing a 100-page document (100K prompt tokens) + 2K output tokens: (100K × $2.50 + 2K × $15) / 1M = $0.28.
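The per-task arithmetic used throughout this guide reduces to one formula, sketched here as a small helper (the prices plugged in are this article's March 2026 figures):

```python
def task_cost(prompt_tokens: int, completion_tokens: int,
              prompt_price: float, completion_price: float) -> float:
    """Cost in dollars; prices are quoted in $ per million tokens."""
    return (prompt_tokens * prompt_price
            + completion_tokens * completion_price) / 1_000_000

# GPT-5.4 on a 100-page document: 100K prompt + 2K completion
print(task_cost(100_000, 2_000, 2.50, 15.00))  # 0.28
```

The same helper reproduces every cost-per-task figure below by swapping in the relevant prices.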
GPT-5.1: Extended Context, Baseline
| Metric | Value |
|---|---|
| Context Window | 400K tokens |
| Prompt Price | $1.25/M |
| Completion Price | $10/M |
| Throughput | 47 tok/s |
| Max Output | 128K |
GPT-5.1 is the best-value long-context model. 400K context (roughly 300,000 words). Same pricing as GPT-5 ($1.25/$10) and comparable latency, with a much larger window.
Use cases: long-document Q&A, multi-file code analysis, legal document review. Every team doing RAG at scale should test GPT-5.1.
Cost per task: 50K prompt + 1K completion: (50K × $1.25 + 1K × $10) / 1M = $0.073.
GPT-5 Codex: Extended Context, Code-Optimized
| Metric | Value |
|---|---|
| Context Window | 400K tokens |
| Prompt Price | $1.25/M |
| Completion Price | $10/M |
| Throughput | 50 tok/s |
| Max Output | 128K |
GPT-5 Codex is GPT-5.1 fine-tuned for code. Same pricing, slightly higher code throughput (50 vs 47 tok/s).
Use cases: code generation, debugging, refactoring. If teams are sending code to GPT-5.1, use Codex instead. No cost difference, slightly better output quality.
GPT-5 Pro: Reasoning Upgrade
| Metric | Value |
|---|---|
| Context Window | 400K tokens |
| Prompt Price | $15/M |
| Completion Price | $120/M |
| Throughput | 11 tok/s |
| Max Output | 128K |
GPT-5 Pro is expensive. $15 per million prompt tokens, $120 per million completions. Throughput is 11 tok/s (4x slower than GPT-5.1).
It exists for problems that require deep reasoning, where slow + smart beats fast + dumb. Math competition problems. Novel research. Logical puzzles.
Cost per task: 10K prompt + 500 completion: (10K × $15 + 500 × $120) / 1M = $0.21. Expensive per task, but the output quality justifies it for hard problems.
GPT-5: Balanced Default
| Metric | Value |
|---|---|
| Context Window | 272K tokens |
| Prompt Price | $1.25/M |
| Completion Price | $10/M |
| Throughput | 41 tok/s |
| Max Output | 128K |
GPT-5 is the baseline. 272K context. Fair pricing. Industry standard. Default choice for most tasks.
This is the model to compare against. If another model doesn't beat GPT-5 on cost, latency, or quality, don't use it.
Cost per task: 5K prompt + 500 completion: (5K × $1.25 + 500 × $10) / 1M = $0.011.
GPT-5 Mini: Lightweight, Fast
| Metric | Value |
|---|---|
| Context Window | 272K tokens |
| Prompt Price | $0.25/M |
| Completion Price | $2/M |
| Throughput | 68 tok/s |
| Max Output | 128K |
GPT-5 Mini costs 5x less than GPT-5. Throughput is 66% higher (68 vs 41 tok/s). Quality loss: ~10-15% (smaller model, reportedly trained on the same data).
Ideal for: high-volume tasks, classification, content moderation, simple Q&A. If tasks are straightforward and volume matters, Mini wins.
Cost per task: 1K prompt + 200 completion: (1K × $0.25 + 200 × $2) / 1M = $0.00065.
GPT-5 Nano: Ultra-Budget
| Metric | Value |
|---|---|
| Context Window | 272K tokens |
| Prompt Price | $0.05/M |
| Completion Price | $0.40/M |
| Throughput | 95 tok/s |
| Max Output | 32K |
GPT-5 Nano is the $0.05 tier. Extremely cheap. Extremely fast (95 tok/s). Quality is borderline (similar to GPT-3.5).
Use: classification, tagging, routing. Not suitable for content creation or reasoning. Output is often terse or low-quality.
Cost per task: 500 prompt + 100 completion: (500 × $0.05 + 100 × $0.40) / 1M = $0.000065.
GPT-4 Series Pricing
GPT-4.1 is the current standard. GPT-4o is cheaper but older. Both are legacy now that GPT-5 is available.
GPT-4.1: Extended Context, Industry Default
| Metric | Value |
|---|---|
| Context Window | 1.05M tokens |
| Prompt Price | $2/M |
| Completion Price | $8/M |
| Throughput | 55 tok/s |
| Max Output | 32K |
GPT-4.1 has the largest context window: 1.05M tokens, roughly 790,000 words. Full-book analysis. An entire codebase plus its documentation.
But GPT-5 ($1.25/M prompt) is cheaper. And GPT-5.1 ($1.25/M with 400K) covers most long-context needs.
Use GPT-4.1 only if teams need the full 1M context and don't mind paying 60% more than GPT-5. Most teams should prefer GPT-5.
Cost per task: 200K prompt (full codebase) + 2K completion: (200K × $2 + 2K × $8) / 1M = $0.416.
GPT-4.1 Mini: Lightweight Extended Context
| Metric | Value |
|---|---|
| Context Window | 1.05M tokens |
| Prompt Price | $0.40/M |
| Completion Price | $1.60/M |
| Throughput | 75 tok/s |
| Max Output | 32K |
Mini version of GPT-4.1. Same 1M context, lower cost, faster throughput.
Still more expensive than GPT-5 Mini ($0.25/$2). Use it only if you genuinely need 1M context and don't mind the extra cost.
GPT-4.1 Nano: Ultra-Budget Extended Context
| Metric | Value |
|---|---|
| Context Window | 1.05M tokens |
| Prompt Price | $0.10/M |
| Completion Price | $0.40/M |
| Throughput | 82 tok/s |
| Max Output | 32K |
The cheapest way to access 1M context. $0.10/M prompt, $0.40/M completion. Quality is lower than Mini.
GPT-4o: Legacy, Wide Context
| Metric | Value |
|---|---|
| Context Window | 128K tokens |
| Prompt Price | $2.50/M |
| Completion Price | $10/M |
| Throughput | 52 tok/s |
| Max Output | 16K |
GPT-4o is the previous flagship. 128K context. Now superseded by GPT-5 series.
Don't use. GPT-5 ($1.25/$10) is half the prompt cost and has 2x the context window. GPT-5 Mini ($0.25/$2) is way cheaper.
GPT-4o Mini: Legacy Lightweight
| Metric | Value |
|---|---|
| Context Window | 128K tokens |
| Prompt Price | $0.15/M |
| Completion Price | $0.60/M |
| Throughput | 75 tok/s |
| Max Output | 16K |
Don't use. GPT-5 Mini ($0.25/$2) is slightly more expensive but much better quality.
Reasoning Models (o3, o4)
Reasoning models trade throughput for correctness. Slow. Expensive. Worth it for hard problems.
o3: Advanced Reasoning
| Metric | Value |
|---|---|
| Context Window | 200K tokens |
| Prompt Price | $2/M |
| Completion Price | $8/M |
| Throughput | 17 tok/s |
| Max Output | 100K |
o3 is OpenAI's reasoning-focused model. Uses chain-of-thought internally. Very slow (17 tok/s, 2.4x slower than GPT-5).
But for hard problems (math, logic, novel reasoning), o3 is better than GPT-5. Win rate on competition math: o3 60%, GPT-5 40%.
Cost per task: 5K prompt + 2K completion (lots of thinking): (5K × $2 + 2K × $8) / 1M = $0.026. Slow to run, but small per-task cost.
o3 Mini: Reasoning, Fast
| Metric | Value |
|---|---|
| Context Window | 200K tokens |
| Prompt Price | $1.10/M |
| Completion Price | $4.40/M |
| Throughput | 47 tok/s |
| Max Output | 100K |
o3 Mini is o3 optimized for speed. Throughput: 47 tok/s — nearly 3x faster than o3, and slightly faster than GPT-5 (41 tok/s).
Pricing is better: $1.10/$4.40 vs o3's $2/$8. Quality loss: ~20-30%.
Use: high-volume reasoning tasks where speed matters. Filtering, routing. Not for novel problems.
o4 Mini: Latest Reasoning
| Metric | Value |
|---|---|
| Context Window | 200K tokens |
| Prompt Price | $1.10/M |
| Completion Price | $4.40/M |
| Throughput | 62 tok/s |
| Max Output | 100K |
o4 Mini is the latest reasoning model. Same pricing as o3 Mini ($1.10/$4.40) but faster throughput (62 vs 47 tok/s).
o4 is still in limited release as of March 2026. Availability varies. Check current access before assuming availability.
OpenAI API Pricing 2026: Pricing Breakdown Table
| Model | Context | Prompt $/M | Completion $/M | Throughput | Best For |
|---|---|---|---|---|---|
| GPT-5.4 | 272K | $2.50 | $15 | 45 | Premium reasoning |
| GPT-5.1 | 400K | $1.25 | $10 | 47 | Long documents |
| GPT-5 Codex | 400K | $1.25 | $10 | 50 | Code tasks |
| GPT-5 Pro | 400K | $15 | $120 | 11 | Hard reasoning |
| GPT-5 | 272K | $1.25 | $10 | 41 | Default choice |
| GPT-5 Mini | 272K | $0.25 | $2 | 68 | High volume |
| GPT-5 Nano | 272K | $0.05 | $0.40 | 95 | Classification |
| GPT-4.1 | 1.05M | $2 | $8 | 55 | Extra long context |
| GPT-4.1 Mini | 1.05M | $0.40 | $1.60 | 75 | Long context, budget |
| GPT-4.1 Nano | 1.05M | $0.10 | $0.40 | 82 | Budget long context |
| GPT-4o | 128K | $2.50 | $10 | 52 | Legacy (avoid) |
| GPT-4o Mini | 128K | $0.15 | $0.60 | 75 | Legacy (avoid) |
| o3 | 200K | $2 | $8 | 17 | Hard reasoning |
| o3 Mini | 200K | $1.10 | $4.40 | 47 | Reasoning high-volume |
| o4 Mini | 200K | $1.10 | $4.40 | 62 | Latest reasoning |
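Encoded as data, the table above makes it easy to rank models for a given workload. A sketch using this article's figures — the dictionary keys are illustrative labels, not official API model IDs:

```python
# ($/M prompt, $/M completion, context window in tokens) -- from the table above
PRICES = {
    "gpt-5.4":     (2.50, 15.00, 272_000),
    "gpt-5.1":     (1.25, 10.00, 400_000),
    "gpt-5-codex": (1.25, 10.00, 400_000),
    "gpt-5-pro":   (15.00, 120.00, 400_000),
    "gpt-5":       (1.25, 10.00, 272_000),
    "gpt-5-mini":  (0.25, 2.00, 272_000),
    "gpt-5-nano":  (0.05, 0.40, 272_000),
    "gpt-4.1":     (2.00, 8.00, 1_050_000),
    "o3":          (2.00, 8.00, 200_000),
    "o3-mini":     (1.10, 4.40, 200_000),
}

def cheapest(prompt_toks: int, completion_toks: int) -> list[tuple[str, float]]:
    """Models whose context fits the prompt, sorted by total task cost."""
    fits = {m: p for m, p in PRICES.items() if prompt_toks <= p[2]}
    return sorted(
        ((m, (prompt_toks * p[0] + completion_toks * p[1]) / 1e6)
         for m, p in fits.items()),
        key=lambda x: x[1],
    )

print(cheapest(500_000, 5_000)[0])  # only GPT-4.1 fits a 500K prompt
```

Filtering on context before sorting on price prevents the common mistake of picking a model that can't actually hold the prompt.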
Cost Per Task
Real-world pricing for common tasks (March 2026):
Classification Task (1K prompt, 50 completion)
| Model | Cost | Time |
|---|---|---|
| GPT-5 Nano | $0.00007 | 0.5 sec |
| GPT-5 Mini | $0.00035 | 0.7 sec |
| GPT-5 | $0.00175 | 1.2 sec |
| GPT-4.1 Mini | $0.00048 | 0.67 sec |
Winner: GPT-5 Nano. 25x cheaper than GPT-5 and more than twice as fast.
Customer Support Q&A (3K prompt, 500 completion)
| Model | Cost | Time |
|---|---|---|
| GPT-5 Mini | $0.0018 | 7.4 sec |
| GPT-5 | $0.0088 | 12 sec |
| GPT-4.1 Mini | $0.002 | 6.7 sec |
Winner: Near-tie between GPT-5 Mini and GPT-4.1 Mini. GPT-5 Mini edges it on cost; speed is comparable.
Long Document Analysis (100K prompt, 2K completion)
| Model | Cost | Time |
|---|---|---|
| GPT-5.1 | $0.145 | 43 sec |
| GPT-5.4 | $0.28 | 44 sec |
| GPT-4.1 | $0.416 | 36 sec |
Winner: GPT-5.1. Cheaper, nearly same speed, sufficient context.
Code Review (300K prompt, 5K completion)
| Model | Cost | Time |
|---|---|---|
| GPT-4.1 | $0.64 | 91 sec |
| GPT-5.1 | $0.425 | 107 sec |
Winner: GPT-5.1. 34% cheaper. Slightly slower but worth it. (Above 400K prompt tokens, GPT-5.1 no longer fits and GPT-4.1 is the only option.)
Hard Math Problem (2K prompt, 8K completion, chain-of-thought)
| Model | Cost | Time |
|---|---|---|
| o3 | $0.068 | 471 sec |
| o3 Mini | $0.037 | 170 sec |
| GPT-5 | $0.083 | 195 sec |
Winner: Depends on accuracy needed. o3 is most accurate but slowest. o3 Mini is cheapest and fastest. GPT-5 actually costs the most here — its $10/M completion rate dominates on a long chain-of-thought output.
Model Selection Guide
Decision Tree
High volume, simple tasks? Start with GPT-5 Nano ($0.05/M prompt). If quality is too low, upgrade to GPT-5 Mini ($0.25/M).
Standard tasks, balanced cost-quality? Use GPT-5 ($1.25/$10). This is the default unless a specific need pushes teams elsewhere.
Long documents (over 100K tokens)? Use GPT-5.1 (400K context, $1.25/M). Cheaper and better than GPT-4.1.
Extremely long documents (500K+ tokens)? Use GPT-4.1 (1.05M context, $2/M). Only option, but pricey.
Hard reasoning or novel problems? Use o3 ($2/$8 per M). Slow, but worth it for accuracy. If cost is tight, try o3 Mini ($1.10/$4.40) first.
Code tasks? Use GPT-5 Codex (same price as GPT-5.1 but optimized). Or just use GPT-5, it's good at code.
Avoid: GPT-4o and GPT-4o Mini. Both are superseded by the GPT-5 series with no offsetting advantage. (GPT-4.1 Nano survives only as the budget route to 1M context.)
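The decision tree above can be sketched as a routing function. Thresholds and the task labels are illustrative, and the model names are shorthand, not official API IDs:

```python
def pick_model(prompt_tokens: int, task: str) -> str:
    """Map this guide's decision tree onto a model choice."""
    if prompt_tokens > 400_000:
        return "gpt-4.1"            # only 1.05M-context option
    if task == "hard-reasoning":
        return "o3"                 # try o3-mini first if cost is tight
    if task == "code":
        return "gpt-5-codex"
    if prompt_tokens > 100_000:
        return "gpt-5.1"            # 400K context at GPT-5 pricing
    if task in ("classification", "tagging", "routing"):
        return "gpt-5-nano"         # upgrade to gpt-5-mini if quality is too low
    return "gpt-5"                  # the default

print(pick_model(600_000, "summarization"))  # gpt-4.1
```

Ordering matters: context limits are hard constraints, so they gate the choice before any cost or quality preference.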
Throughput Considerations
Throughput affects latency and real-world cost.
- GPT-5 Nano (95 tok/s): 1,000-token completion in 10.5 seconds. Fast.
- GPT-5 (41 tok/s): 1,000-token completion in 24 seconds. Slower.
- o3 (17 tok/s): 1,000-token completion in 59 seconds. Very slow.
For user-facing applications, TTFT (time-to-first-token) matters as much as throughput. OpenAI doesn't publish TTFT, but it generally correlates with throughput. Faster models = lower TTFT.
If latency is critical: Use GPT-5 Mini (68 tok/s) or Nano (95 tok/s). If teams have time: Use o3 for hard problems.
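Generation latency at a given throughput is just completion tokens divided by tok/s — a rough estimate that, as noted above, excludes TTFT:

```python
def gen_seconds(completion_tokens: int, tok_per_s: float) -> float:
    """Rough generation time; excludes time-to-first-token and network."""
    return completion_tokens / tok_per_s

# throughputs from this article's tables
for model, tps in [("gpt-5-nano", 95), ("gpt-5", 41), ("o3", 17)]:
    print(f"{model}: {gen_seconds(1_000, tps):.1f}s for 1K tokens")
```

Running this reproduces the 10.5s / 24s / 59s figures quoted above.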
FAQ
What's the cheapest model for customer-facing tasks?
GPT-5 Mini. $0.25/$2 per million tokens. 5x cheaper than GPT-5. Quality is 85-90% of GPT-5. Good for: Q&A, summarization, categorization.
Should I still use GPT-4.1?
Only if you need 1M context. Otherwise, use GPT-5 ($1.25/M prompt, half the cost, better quality). GPT-4.1 is legacy.
Is o3 worth the cost?
For competition math, novel research, logic puzzles: yes. For customer support or text generation: no.
What's the throughput difference between o3 and GPT-5?
o3: 17 tok/s. GPT-5: 41 tok/s. o3 is 2.4x slower but better at reasoning. For routine tasks, GPT-5 is fine.
Can I use GPT-5 Nano for everything?
No. Nano is low-quality (similar to GPT-3.5). Good for classification, tagging, routing. Not for content creation, code generation, or detailed analysis.
Which model should I use as my default?
GPT-5 ($1.25/$10). Best balance of cost, quality, and speed. If tasks are simple and volume is high, step down to GPT-5 Mini or Nano. If tasks are hard, consider o3.
Is GPT-5 better than Claude?
Comparable. GPT-5 is faster. Claude Sonnet 4.6 is $3/$15 per M tokens (more expensive). Each has different strengths: GPT-5 for code, Claude for nuance. Test both.
Throughput and Latency Implications
Pricing per token is one lens. Throughput per dollar is another.
Cost Per Task (Practical Examples)
Email classification (subject line, mark as spam/not spam):
- Prompt: 200 tokens (email + instructions)
- Completion: 5 tokens (spam/not-spam decision)
- Model: GPT-5 Nano
- Cost: (200 × $0.05 + 5 × $0.40) / 1M = $0.000012
- Speed: 95 tok/s; the 5-token completion generates in well under a second
- Monthly cost for 1M emails: $12
Blog post summarization (1,500 word article → 200 word summary):
- Prompt: 3,500 tokens (article + summary instruction)
- Completion: 500 tokens (summary)
- Model: GPT-5 Mini
- Cost: (3,500 × $0.25 + 500 × $2) / 1M = $0.001875
- Speed: 68 tok/s completion, ~7 seconds total
- Monthly cost for 1,000 summaries: $1.88
Detailed code review (entire file + guidelines):
- Prompt: 8,000 tokens (code + review rubric)
- Completion: 2,000 tokens (detailed review feedback)
- Model: GPT-5
- Cost: (8,000 × $1.25 + 2,000 × $10) / 1M = $0.030
- Speed: 41 tok/s, ~50 seconds total
- Monthly cost for 100 reviews: $3.00
Novel research problem-solving (new algorithm from scratch):
- Prompt: 5,000 tokens (problem description, constraints, examples)
- Completion: 5,000 tokens (novel algorithm with explanation)
- Model: o3 (reasoning model)
- Cost: (5,000 × $2 + 5,000 × $8) / 1M = $0.05
- Speed: 17 tok/s, ~5 minutes (slow but accurate)
- Cost per problem: $0.05 (expensive but worth it if solution is correct first try)
API Rate Limits by Model
OpenAI enforces rate limits (requests per minute, tokens per minute) based on pricing tier.
| Model | Requests/min | Tokens/min | Notes |
|---|---|---|---|
| GPT-5 Nano | 3,500 | 2M | Free tier |
| GPT-5 Mini | 3,500 | 1M | Basic tier |
| GPT-5 | 3,500 | 500K | Standard tier |
| GPT-4.1 | 1,500 | 300K | Legacy |
| o3 | 100 | 100K | Reasoning limited |
o3 has aggressive rate limits due to cost: at 100K tokens/min, you cap out around 6M tokens/hour. You can't burst high-volume workloads through o3.
For high-volume tasks (1B+ tokens/day), you need:
- Multiple API keys (different rate limit buckets)
- Queue + batch processing (the Batch API halves cost but processes asynchronously)
- Fallback to GPT-5 Mini/Nano when o3 hits limit
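The fallback pattern above can be sketched with a client-side token budget — a simplified leaky-bucket mirror of the server-side tokens-per-minute limit, so requests are rerouted before the API ever returns a rate-limit error (the limiter design and the model labels are illustrative):

```python
import time

class TokenBudget:
    """Client-side tokens-per-minute budget to stay under a model's limit."""
    def __init__(self, tokens_per_min: int):
        self.capacity = tokens_per_min
        self.available = float(tokens_per_min)
        self.last = time.monotonic()

    def try_spend(self, tokens: int) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.available = min(self.capacity,
                             self.available + (now - self.last) / 60 * self.capacity)
        self.last = now
        if tokens <= self.available:
            self.available -= tokens
            return True
        return False

o3_budget = TokenBudget(100_000)   # o3's 100K tokens/min limit from the table

def route(tokens_needed: int) -> str:
    """Use o3 while its budget allows; otherwise fall back to GPT-5 Mini."""
    return "o3" if o3_budget.try_spend(tokens_needed) else "gpt-5-mini"
```

In production you would still handle HTTP 429 responses as a backstop; the local budget just keeps the common case cheap and error-free.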
Batch API: 50% Cheaper
OpenAI offers a batch API: submit 10,000+ requests at once, receive results in 1-24 hours.
Cost reduction: 50% for all models. So GPT-5 Nano becomes $0.025/M prompt, $0.20/M completion.
Trade: latency. Instead of 5-second response, wait 1-24 hours.
Viable for:
- Non-urgent tasks (data labeling, content generation, analysis)
- Overnight processing
- Research batches
Not viable for:
- Customer-facing APIs (users expect immediate response)
- Interactive tools
For 1B tokens per day at a blended ~$2 per million, batch pricing halves that to ~$1 per million — roughly $1,000/day saved, or ~$360K per year for large teams.
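Because the batch discount is a flat 50% multiplier on both prompt and completion prices, the savings math is one line (blended $/M here is an assumption for illustration):

```python
BATCH_DISCOUNT = 0.5  # Batch API halves both prompt and completion prices

def daily_cost(tokens_per_day: int, blended_price_per_m: float,
               batched: bool = False) -> float:
    """Daily spend in dollars for a given volume and blended $/M price."""
    price = blended_price_per_m * (BATCH_DISCOUNT if batched else 1.0)
    return tokens_per_day / 1e6 * price

# 1B tokens/day at a blended ~$2/M
print(daily_cost(1_000_000_000, 2.0))                # 2000.0/day
print(daily_cost(1_000_000_000, 2.0, batched=True))  # 1000.0/day
```

$1,000/day saved works out to roughly $365K/year, matching the ~$360K figure above.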
Hybrid Approach: Multi-Model Strategy
Smart teams don't pick one model. They use different models for different tasks:
| Task | Model | Reasoning |
|---|---|---|
| Classification | GPT-5 Nano | Cheapest, classification is simple |
| Summarization | GPT-5 Mini | Balance of cost and quality |
| Content creation | GPT-5 | Best quality for text |
| Code generation | GPT-5 Codex | Optimized for code |
| Long documents | GPT-5.1 | 400K context, reasonable cost |
| Hard reasoning | o3 | Best accuracy for novel problems |
Example: Customer support AI.
- Route incoming support tickets: GPT-5 Nano (classify priority, department)
- Generate first-pass response: GPT-5 Mini (fast, good enough)
- Hand-off to human if complexity flagged: Check with GPT-5 (full analysis)
Cost per ticket: mostly Nano + Mini (cheap), rarely GPT-5 (expensive).
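The blended per-ticket cost of that pipeline is an expected-value calculation. The token counts and 20% escalation rate below are assumptions for illustration; the prices are this article's:

```python
def ticket_cost(escalation_rate: float = 0.2) -> float:
    """Expected $ per support ticket under the Nano -> Mini -> GPT-5 pipeline."""
    nano = (500 * 0.05 + 20 * 0.40) / 1e6      # every ticket: classify/route
    mini = (2_000 * 0.25 + 300 * 2.00) / 1e6   # every ticket: draft a reply
    gpt5 = (4_000 * 1.25 + 800 * 10.00) / 1e6  # escalated tickets only
    return nano + mini + escalation_rate * gpt5

print(f"${ticket_cost():.6f} per ticket")
```

Under these assumptions the expensive model contributes most of the blended cost even at a 20% escalation rate, which is why keeping the escalation trigger accurate matters more than shaving the Nano step.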
Cost Trends: Will GPT-5 Pricing Drop?
Historical pattern: New models expensive, drop 50-70% in 12 months.
GPT-4o launch price: $2.50/$10 per M tokens. GPT-4o today (March 2026): legacy tier, avoid.
GPT-4.1 launch price (2024): $2/$8 per M tokens. GPT-4.1 today: still $2/$8 (no reduction yet).
GPT-5 launch price (Feb 2026): $1.25/$10 per M tokens. GPT-5 today (March 2026): still $1.25/$10.
Prediction: GPT-5 pricing will drop to $0.75/$6 by Q4 2026. o3 will drop to $1/$4 by Q2 2026.
If you're in early development, build on GPT-5 Nano/Mini now; when prices drop, your costs scale down further.
If you're already at production scale, it's smart to negotiate volume discounts now (not public-facing; available via the sales team). Lock in $1.25/$10 and renegotiate when public pricing drops.
Common Pricing Mistakes
Mistake 1: Using GPT-5.4 for Everything
GPT-5.4 is $2.50/$15 per M tokens. 2x the prompt cost of GPT-5, 1.5x the completion cost.
Best use: Complex reasoning with large documents.
Wrong use: Email replies (GPT-5 Mini sufficient). Blog posts (GPT-5 fine). Data entry (GPT-5 Nano overkill).
Cost difference: 1M prompt + 1M completion tokens on GPT-5.4 vs GPT-5 Mini = ($2.50 + $15) vs ($0.25 + $2) = $15.25 extra.
Mistake 2: Not Using Batch API
Batch API is 50% cheaper. If 50% of your workload is non-urgent, using batch saves money.
Example: Labeling 10M documents for training data.
- Non-batched (immediate): 10M tokens × $1.25 (GPT-5) = $12.50
- Batched (overnight): 10M tokens × $1.25 × 0.5 = $6.25
- Savings: $6.25 per 10M tokens
Mistake 3: Retrying Failed Requests Without Caching
If a request fails and you retry, you're charged twice.
Use caching or idempotency to avoid double-charges.
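A minimal sketch of response caching keyed on the request, so a retry after a transient failure reuses the already-paid-for result (`call_api` is a hypothetical stand-in for your real client call):

```python
import hashlib
import json

_cache: dict[str, str] = {}

def cached_completion(model: str, prompt: str, call_api) -> str:
    """Return a cached response if this exact request was already paid for."""
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_api(model, prompt)  # only charged on a cache miss
    return _cache[key]

# usage: a retry hits the cache instead of re-billing
calls = 0
def fake_api(model, prompt):
    global calls
    calls += 1
    return "answer"

cached_completion("gpt-5", "2+2?", fake_api)
cached_completion("gpt-5", "2+2?", fake_api)  # retry: no second charge
print(calls)  # 1
```

For production use, persist the cache (Redis, a database) and only cache deterministic requests (temperature 0), since sampled outputs differ per call.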
Mistake 4: Choosing by Price Alone
GPT-5 Nano is cheap, but quality is low. If you use Nano for complex tasks and get wrong answers, you waste time fixing them.
Time cost of manual review: $100/hr. Nano cost: fractions of a cent per task.
If fixing a Nano mistake costs 30 minutes = $50, and choosing GPT-5 (~$0.01 per task) gets it right the first time, the ROI is clear.
Choose model based on task complexity, not just price.
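That trade-off is an expected-cost calculation. The error rates below are assumptions for illustration; the $100/hr review rate is the one quoted above:

```python
def expected_cost(api_cost: float, error_rate: float,
                  fix_minutes: float, hourly_rate: float = 100.0) -> float:
    """API cost plus the expected human cost of fixing wrong answers."""
    return api_cost + error_rate * (fix_minutes / 60) * hourly_rate

# Nano: near-zero API cost, but assume 10% of answers need a 30-min fix.
# GPT-5: ~$0.01 per task, assume a 1% error rate.
print(expected_cost(0.0001, 0.10, 30))  # ~5.00 per task
print(expected_cost(0.01, 0.01, 30))    # ~0.51 per task
```

Under these assumptions the "cheap" model is roughly 10x more expensive once human review is priced in, which is the whole point of Mistake 4.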
Related Resources
Sources
- OpenAI API Pricing
- OpenAI Models Documentation
- OpenAI API Benchmarks
- DeployBase LLM Pricing Dashboard (prices observed March 21, 2026)