GPU Cloud Cost Calculator: Compare Hourly Rates Across Providers

Deploybase · May 6, 2025 · GPU Pricing

Hourly rates are only part of the cost. Add storage, egress, networking, and service tiers, and things add up fast.

Scenarios differ too: cheap experimentation, moderate projects, serious production, batch processing. Each needs different math.

Pricing Methodology

Hourly GPU rates form the foundation. For a RunPod H100 SXM running 24/7 for one month:

  • Hours: 24 hours/day x 30 days = 720 hours
  • RunPod H100 SXM rate: $2.69/hour
  • Monthly compute cost: 720 x $2.69 = $1,937

Storage adds a monthly component:

  • Storage size: 100GB
  • RunPod storage rate: $0.01/GB/month
  • Monthly storage: 100 x $0.01 = $1

Data transfer multiplies with usage:

  • Outbound traffic: 500GB monthly
  • Egress rate: $0.12/GB
  • Monthly egress: 500 x $0.12 = $60

Total monthly cost: $1,937 + $1 + $60 = $1,998

This structure shows compute dominating for sustained operations, with storage and egress as the variable line items. High-volume data transfer jobs shift the cost composition dramatically toward egress.
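The arithmetic above generalizes to a small helper. This is a sketch using the article's RunPod figures; all rates are passed in as parameters rather than assumed:

```python
def monthly_cost(gpu_rate, hours, storage_gb, storage_rate, egress_gb, egress_rate):
    """Sum the three main monthly line items."""
    compute = gpu_rate * hours            # $/hr x GPU-hours
    storage = storage_gb * storage_rate   # GB x $/GB/month
    egress = egress_gb * egress_rate      # GB x $/GB
    return compute + storage + egress

# RunPod H100 SXM: 720 hours, 100GB storage, 500GB egress (figures from above)
total = monthly_cost(2.69, 720, 100, 0.01, 500, 0.12)
print(f"${total:,.2f}")  # about $1,998/month
```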

Discount calculation for commitments:

  • On-demand H100 SXM (RunPod): $2.69/hour
  • 1-year reserved: ~28% discount = ~$1.94/hour
  • 3-year reserved: ~40% discount = ~$1.61/hour

For a 12-month deployment:

  • On-demand: $1,937 x 12 = $23,244
  • 1-year reserved: ~$1,397 x 12 = $16,764 (28% savings)
  • 3-year reserved: ~$1,159 x 12 = $13,910 (40% savings)

Reserved instances require upfront commitment and reduce flexibility.
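The discount math can be sketched as a function. The 28%/40% defaults are the approximate discounts quoted above, not any provider's published figures:

```python
def reserved_comparison(on_demand_rate, months, hours_per_month=720,
                        discount_1yr=0.28, discount_3yr=0.40):
    """Return (on-demand, 1-year reserved, 3-year reserved) totals."""
    on_demand = on_demand_rate * hours_per_month * months
    return on_demand, on_demand * (1 - discount_1yr), on_demand * (1 - discount_3yr)

od, r1, r3 = reserved_comparison(2.69, 12)
print(f"On-demand ${od:,.0f} | 1-yr ${r1:,.0f} | 3-yr ${r3:,.0f}")
```

Computing directly from the hourly rate gives $23,242 / $16,734 / $13,945; the figures above differ slightly because they round the monthly cost first.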

Provider Rate Comparison

Current market rates as of May 2025:

H100 Pricing by Provider:

| Provider | Rate | Region | Variant |
| --- | --- | --- | --- |
| AWS | $6.88/hr | US-East | 1x H100 (p5 node) |
| AWS | $55.04/hr | US-East | 8x H100 (p5.48xlarge) |
| Google Cloud | $88.49/hr | US-Central | 8x H100 SXM (a3-highgpu-8g) |
| Azure | $88.49/hr | US-East | 8x H100 (ND H100 v5) |
| Lambda | $2.86/hr | US | PCIe |
| Lambda | $3.78/hr | US | SXM |
| RunPod | $1.99/hr | Global | PCIe |
| RunPod | $2.69/hr | Global | SXM |
| CoreWeave | $49.24/hr (8x) | Global | 8-pack SXM |
| Vast.AI | $1.47-2.00/hr | Market | Variable |

RunPod offers the lowest fixed H100 rates, starting at $1.99/hr for PCIe; Vast.AI's marketplace can dip lower, but its rates vary. For large cluster deployments (8+ GPUs), CoreWeave's bundle pricing undercuts the hyperscalers' 8-GPU nodes.

L40S Pricing:

| Provider | Rate | Setup |
| --- | --- | --- |
| RunPod | $0.79/hr | Instant |
| Lambda | $0.92/hr | 2-5 min |
| Paperspace | $1.20-1.80/hr | 5 min |
| AWS | $1.50+/hr | 10 min |
| Google Cloud | $1.30/hr | 8 min |
| Vast.AI | $0.60-1.00/hr | Variable |

T4 Pricing (Budget-Conscious):

| Provider | Rate | Notes |
| --- | --- | --- |
| RunPod | $0.05/hr | Special pricing |
| AWS | $0.35/hr | Spot discount |
| Google Cloud | $0.35/hr | Preemptible |
| Azure | $0.25-0.35/hr | Variable |

Specialized providers (RunPod, CoreWeave, Vast.AI) typically undercut hyperscalers on GPU-only compute. Hyperscalers excel on bundled services and compliance features.
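The rate tables above can be encoded as a small lookup to find the cheapest provider for a given variant. The dictionary below is just this article's per-GPU H100 snapshot, not live pricing:

```python
# Per-GPU H100 hourly rates from the table above (article's snapshot)
h100_rates = {
    ("RunPod", "PCIe"): 1.99,
    ("RunPod", "SXM"): 2.69,
    ("Lambda", "PCIe"): 2.86,
    ("Lambda", "SXM"): 3.78,
    ("AWS", "SXM"): 6.88,             # p5 node, per GPU
    ("CoreWeave", "SXM"): 49.24 / 8,  # 8-pack price divided per GPU
}

def cheapest(rates, variant):
    """Return (provider, hourly rate) with the lowest rate for a variant."""
    candidates = {p: r for (p, v), r in rates.items() if v == variant}
    provider = min(candidates, key=candidates.get)
    return provider, candidates[provider]

print(cheapest(h100_rates, "SXM"))   # RunPod at $2.69 in this snapshot
print(cheapest(h100_rates, "PCIe"))  # RunPod at $1.99 in this snapshot
```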

Calculation Examples

Example 1: Short-term fine-tuning (1 week)

Workload: Fine-tune Llama 2 7B on proprietary dataset

Specifications:

  • GPU: 1x A100 (80GB)
  • Duration: 7 days continuous
  • Storage: 200GB (training data + output)
  • Egress: 10GB (checkpoint downloads)

Provider options:

RunPod A100:

  • Compute: 168 hours x $1.19/hr = $199.92
  • Storage: 200GB x $0.01/GB/month x (7/30) ≈ $0.47 (prorated)
  • Egress: Free
  • Total: ~$200.39

AWS A100:

  • Compute: 168 hours x $1.85/hr = $310.80
  • EBS storage: 200GB x $0.12/GB/month x (7/30) ≈ $5.60 (prorated)
  • Egress: 10GB x $0.09 = $0.90
  • Total: ~$317.30

RunPod saves ~$117 (37%) on this workload.
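A single helper can reproduce both provider estimates. Day-based proration of monthly storage and a 30-day month are assumptions of this sketch:

```python
def workload_cost(hours, gpu_rate, gpu_count=1,
                  storage_gb=0, storage_rate=0.0, duration_days=30,
                  egress_gb=0, egress_rate=0.0):
    """Total cost of a fixed-length workload; storage prorated by day."""
    compute = hours * gpu_rate * gpu_count
    storage = storage_gb * storage_rate * (duration_days / 30)  # assumes 30-day month
    egress = egress_gb * egress_rate
    return round(compute + storage + egress, 2)

# Example 1 shape: one GPU for a week, with this article's rates
runpod = workload_cost(168, 1.19, storage_gb=200, storage_rate=0.01, duration_days=7)
aws = workload_cost(168, 1.85, storage_gb=200, storage_rate=0.12, duration_days=7,
                    egress_gb=10, egress_rate=0.09)
print(runpod, aws)  # one-week totals under these assumptions
```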

Example 2: Production inference (30 days)

Workload: Serve Llama 2 13B to 100 concurrent users

Specifications:

  • GPU: 2x H100 SXM
  • Duration: 30 days, 24/7
  • Storage: 500GB (models + cache)
  • Egress: 2TB (user responses)

RunPod H100 SXM:

  • Compute: 720 hours x 2 GPUs x $2.69/hr = $3,873.60
  • Storage: 500GB x $0.01/GB/month = $5
  • Egress: Free
  • Total: $3,878.60

AWS H100 SXM (per GPU via p5 node, shared 8-GPU minimum):

  • Compute: 720 hours x 2 GPUs x $6.88/hr = $9,907.20
  • EBS storage: 500GB x $0.12/GB/month = $60
  • Egress: 2,000GB x $0.09 = $180
  • Total: ~$10,147

RunPod saves ~$6,268 (62%) in this scenario, driven by the hyperscaler premium.

Example 3: Batch processing (bulk workload)

Process 10TB dataset on 8x L40S cluster for 5 days

Specifications:

  • GPU: 8x L40S
  • Duration: 5 days (120 hours)
  • Storage: 10TB input + 15TB output
  • Egress: 15TB final results

CoreWeave 8xL40S:

  • Compute: 120 hours x $18/hr (8-GPU bundle) = $2,160
  • Storage: 25TB x $0.05/GB/month x (5/30) ≈ $208 (prorated)
  • Egress: 15TB x $0.12/GB = $1,800
  • Total: ~$4,168

RunPod 8x L40S (separate):

  • Compute: 120 hours x 8 GPUs x $0.79/hr = $758.40
  • Storage: 25TB x $0.01/GB/month x (5/30) ≈ $42 (prorated)
  • Egress: Free
  • Total: ~$800

Significant savings at scale: RunPod comes in at roughly a fifth of CoreWeave's cost here, and infrastructure optimization (shared cluster storage) compounds the savings.

Hidden Costs & Adjustments

Beyond obvious GPU rates, costs hide in fine print:

Setup and initialization: Some platforms charge $10-50 to provision capacity; these fees typically appear on the first invoice.

Minimum commitment blocks: Certain services (for example, some AWS enterprise agreements) require a $5,000 minimum monthly spend.

Currency fluctuations: International billing is subject to exchange rates; a $100 invoice can become $108 at unfavorable rates.

Tax and VAT: Regional taxes add 5-25% to final invoice depending on jurisdiction.

Networking: Bandwidth between regions multiplies costs. Direct Connect (AWS) costs $0.30-0.50/hour on top of compute.

Snapshot storage: Retaining VM snapshots costs $0.05/GB/month indefinitely.

Support costs: Premium support tiers add $1,000-3,000+ monthly on hyperscale deployments.

Optimization Tips

1. Right-size GPU selection: Confirm GPU capacity matches workload requirements. Oversizing by one tier often adds 50%+ cost without proportional benefit.

2. Batch processing windows: Group large data transfers into single sessions. Overnight batch operations reduce bandwidth spikes.

3. Regional deployment: Position workloads in low-cost regions when latency permits. Singapore costs 40% less than US-East on some providers.

4. Reserve strategically: Commit to 1-year reservations for predictable base loads. Keep 20% capacity on-demand for variable workloads.

5. Monitor egress religiously: Track data movement weekly. A gigabyte of unnecessary transfer per job seems trivial, but repeated across thousands of runs it scales to $1,000/month quickly.

6. Use spot instances for fault-tolerant work: Batch processing tolerates interruptions. Spot instances cost 50-70% less.

7. Implement caching: Cache inference outputs. Avoid reprocessing identical inputs.

8. Choose provider carefully per phase: Development on RunPod, production on AWS/GCP if required for compliance.
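Tip 6's spot economics can be roughed out as an expected cost. The `retry_overhead` parameter is a hypothetical knob for the fraction of work redone after interruptions; the 50-70% discount range comes from the tip above:

```python
def spot_expected_cost(on_demand_rate, hours, discount=0.6, retry_overhead=0.10):
    """Rough expected spot cost for fault-tolerant batch work.
    discount: 0.5-0.7 per the tip above; retry_overhead: assumed
    fraction of work redone after interruptions (hypothetical)."""
    spot_rate = on_demand_rate * (1 - discount)
    return spot_rate * hours * (1 + retry_overhead)

# 120-hour batch job on an H100 SXM at this article's $2.69/hr rate
on_demand = 2.69 * 120
spot = spot_expected_cost(2.69, 120)
print(f"on-demand ${on_demand:.2f} vs spot ~${spot:.2f}")
```

Even with 10% of the work redone, spot comes out far ahead for interruptible jobs; for latency-sensitive serving, the interruption risk usually outweighs the discount.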

FAQ

How accurate are cost estimates before deployment? Within 10-20% for straightforward compute. Egress and storage vary more substantially based on actual usage patterns.

Can I lock in prices for a year? Yes, through reserved instances. AWS and Google Cloud offer 1-3 year commitments. Providers may adjust reserved rates quarterly.

What happens if I exceed my budget? Hyperscalers charge overage at standard rates. RunPod may pause instances. Check provider policies to avoid surprises.

Are there any free tier offers? AWS, Google Cloud, and Azure offer limited free credits. Most GPU operations exceed free tier limits within days.

Should I use spot instances for production? Generally no. Spot instances (AWS) and preemptible instances (Google Cloud) can be interrupted without warning; only fault-tolerant architectures that checkpoint and resume can tolerate them in production.

Sources

  • AWS EC2 Instance Pricing Page
  • Google Cloud Compute Pricing Documentation
  • Microsoft Azure Virtual Machines Pricing
  • RunPod Official Pricing
  • Lambda Labs Pricing Page