Contents
- AWS vs Azure GPU Pricing: Overview
- AWS GPU Instances & Pricing
- Azure GPU Instances & Pricing
- GCP GPU Instances & Pricing
- On-Demand Pricing Comparison
- Spot & Preemptible Pricing
- Reserved Instance Discounts
- GPU Architecture Differences
- Real-World Cost Scenarios
- Multi-Cloud Strategy
- FAQ
- Related Resources
- Sources
AWS vs Azure GPU Pricing: Overview
AWS, Azure, and GCP dominate production GPU procurement, but their pricing strategies diverge significantly. AWS leads on H100 availability and spot-pricing depth; Azure competes on reserved capacity for large commitments; GCP offers preemptible pricing for fault-tolerant workloads. Understanding how GPU pricing differs across the hyperscalers (all figures as of March 2026) helps teams optimize cloud spend and avoid vendor lock-in.
The biggest cost factor is commitment level: on-demand vs. spot vs. reserved.
AWS GPU Instances & Pricing
Instance Families
AWS organizes GPUs into purpose-specific families:
P5 Family (Latest, Hopper)
- p5.48xlarge: 8x H100 SXM GPUs
- Status: Limited availability (2026)
- Focus: LLM training, large-scale ML
P4d Family (A100, Ampere)
- p4d.24xlarge: 8x A100 SXM GPUs (80GB)
- Wide availability across regions
- Primary instance for large-scale A100 training
G4 Family (T4 GPUs)
- g4dn.xlarge: 1x T4
- g4dn.2xlarge: 1x T4
- Cost-efficient for inference and small training
G5 Family (A10G GPUs)
- g5.xlarge: 1x A10G
- g5.2xlarge: 1x A10G
- Inference-optimized, 24GB VRAM per GPU
On-Demand Pricing
As of March 2026, AWS on-demand pricing (US East, N. Virginia):
| Instance | GPUs | GPU Type | Hourly Rate |
|---|---|---|---|
| p5.48xlarge | 8 | H100 SXM | $98.32 |
| p4d.24xlarge | 8 | A100 SXM | $21.96 |
| g5.2xlarge | 1 | A10G | $1.01 |
| g4dn.xlarge | 1 | T4 | $0.35 |
Per-GPU cost for p5.48xlarge (on-demand): $98.32 ÷ 8 = $12.29/hour per H100 SXM
1-year reserved pricing for p5.48xlarge: $55.04/hour ($6.88/GPU)
Per-GPU cost for p4d.24xlarge: $21.96 ÷ 8 = $2.745/hour per A100 SXM
AWS H100 on p5 instances is 2-3x more expensive per GPU than specialty providers (RunPod at $2.69/hour for H100 SXM) but includes 3,200 Gbps EFA networking, 2 TiB of RAM, and deep AWS managed services integration (IAM, VPC, CloudWatch, SageMaker, etc.).
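The per-GPU arithmetic above is simple enough to wrap in a helper. A minimal sketch, using the March 2026 on-demand rates quoted in this guide:

```python
def per_gpu_rate(node_hourly: float, gpu_count: int) -> float:
    """Hourly cost per GPU for a multi-GPU node."""
    return node_hourly / gpu_count

# Rates from the on-demand table above (US East, March 2026).
p5_per_gpu = per_gpu_rate(98.32, 8)    # H100 SXM: ~$12.29/hr
p4d_per_gpu = per_gpu_rate(21.96, 8)   # A100 SXM: ~$2.745/hr
```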
Spot Pricing
AWS spot prices fluctuate based on demand. March 2026 average spot prices (US East):
| Instance | Spot Rate | Discount |
|---|---|---|
| p5.48xlarge | $29.50 | ~70% off on-demand |
| p4d.24xlarge | $6.59 | 70% off on-demand |
| g5.2xlarge | $0.30 | 70% off on-demand |
Spot instances can be interrupted with 2-minute notice. Interruption frequency is ~1-5% annually depending on instance type.
Reserved Instances
AWS reserved instances (RIs) lock in rates for 1 or 3 years.
1-year reserved (p4d.24xlarge, 8x A100):
- Upfront: ~$96,000
- Effective hourly: $10.98 (50% discount)
3-year reserved (p4d.24xlarge, 8x A100):
- Upfront: ~$150,000
- Effective hourly: $5.93 (73% discount)
Monthly cost at 3-year rate: $5.93 × 730 hours = $4,329/month for 8x A100s.
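The reserved math above can be sketched in two small functions. This assumes an all-upfront RI (AWS also sells partial- and no-upfront variants at somewhat different effective rates); amortizing the quoted $96,000 upfront over one year of hours (~$10.96/hr) roughly reproduces the $10.98 effective rate above:

```python
def amortized_hourly(upfront: float, years: int, recurring_hourly: float = 0.0) -> float:
    """Effective hourly rate: upfront cost spread over the term,
    plus any recurring hourly charge (zero for all-upfront RIs)."""
    return upfront / (years * 8760) + recurring_hourly

def monthly_cost(hourly: float, hours_per_month: int = 730) -> float:
    """Monthly cost at a given effective hourly rate."""
    return hourly * hours_per_month

# 3-year p4d rate from above: $5.93/hr -> ~$4,329/month for 8x A100.
p4d_monthly = monthly_cost(5.93)
```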
Azure GPU Instances & Pricing
Instance Families
Azure organizes GPUs by workload:
ND A100 v4 Series (A100 training)
- Standard_ND96asr_v4: 8x A100 SXM (80GB)
- Focus: Distributed training, HPC
NC A100 v4 Series (A100, general ML)
- Standard_NC24ads_A100_v4: 1x A100 80GB
- Standard_NC96ads_A100_v4: 4x A100 80GB
NC H100 v5 Series (H100 training)
- Standard_NC40ads_H100_v5: 1x H100 80GB
- Standard_NC80adis_H100_v5: 2x H100 80GB
NV Family (T4, graphics/inference)
- Standard_NV6: 1x T4
- Standard_NV12: 2x T4
On-Demand Pricing
As of March 2026, Azure on-demand pricing (US East):
| Instance | GPUs | GPU Type | Hourly Rate |
|---|---|---|---|
| Standard_ND96asr_v4 | 8 | A100 SXM 80GB | $28.50 |
| Standard_NC96ads_A100_v4 | 4 | A100 80GB | $14.70 |
| Standard_NC80adis_H100_v5 | 2 | H100 80GB | $18.00 |
| Standard_NV6 | 1 | T4 | $0.90 |
Per-GPU cost for ND96asr_v4: $28.50 ÷ 8 = $3.56/hour per A100 SXM
Azure is more expensive than AWS on A100 ($3.56 vs. $2.75 per GPU on-demand) but cheaper on H100 ($11.06 vs. $12.29 per GPU).
Reserved Instances (Savings Plans)
Azure offers savings plans and reservations for committed use. Commitment terms:
1-year savings plan (ND96asr_v4, 8x A100):
- Rate: $15.21/hour for instance
- Discount: 47% off on-demand
3-year savings plan:
- Rate: $7.89/hour for instance
- Discount: 72% off on-demand
Monthly cost at 3-year rate: $7.89 × 730 hours = $5,759/month for 8x A100s.
Spot Pricing
Azure Spot Virtual Machines are priced similarly to AWS spot:
| Instance | Spot Rate | Discount |
|---|---|---|
| Standard_ND96asr_v4 | $8.55 | 70% off on-demand |
| Standard_NC80adis_H100_v5 | $5.40 | 70% off on-demand |
Spot termination risk is comparable to AWS (1-5% annually).
GCP GPU Instances & Pricing
Instance Families
GCP organizes GPUs into machine types:
A2 VMs (High Memory)
- a2-highgpu-8g: 8x A100 SXM GPUs
- a2-megagpu-16g: 16x A100 SXM GPUs
A3 VMs (Latest, Hopper)
- a3-highgpu-8g: 8x H100 GPUs
- a3-megagpu-8g: 8x H100 GPUs with higher-bandwidth networking
N1/N2 VMs (General-purpose with T4)
- n1-standard-4: Up to 4x T4 GPUs
On-Demand Pricing
As of March 2026, GCP on-demand pricing (us-central1):
| Instance | GPUs | GPU Type | Hourly Rate |
|---|---|---|---|
| a3-highgpu-8g | 8 | H100 | $88.49 |
| a2-highgpu-8g | 8 | A100 | $35.20 |
| a2-highgpu-1g | 1 | A100 | $3.67 |
| n1-standard-4 (1x T4) | 1 | T4 | $0.35 |
Per-GPU cost for a3-highgpu-8g: $88.49 ÷ 8 = $11.06/hour per H100
GCP H100 on-demand pricing ($11.06/GPU, a3-highgpu-8g) matches Azure ($11.06/GPU) and is significantly cheaper than AWS ($12.29/GPU).
Preemptible VMs
GCP preemptible pricing (A3 H100 instances not eligible):
| Instance | Preemptible Rate | Discount |
|---|---|---|
| a2-highgpu-8g | $10.56 | ~70% off on-demand |
| a2-highgpu-1g | $1.10 | ~70% off on-demand |
Note: A3 (H100) instances do not support preemptible pricing on GCP.
Preemptible instances are automatically terminated after 24 hours of runtime and can be interrupted at any time before that. The ~70% discount makes them attractive for fault-tolerant batch workloads.
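Checkpoint-and-resume is what makes preemptible capacity usable. A minimal sketch of the pattern (the file name, step count, and checkpoint frequency are illustrative; real jobs write to object storage such as GCS and typically hook the provider's preemption notice):

```python
import os
import pickle

CKPT = "checkpoint.pkl"  # illustrative; real jobs write to object storage (GCS/S3)

def save_checkpoint(state: dict, path: str = CKPT) -> None:
    # Write to a temp file first, then atomically rename, so a
    # preemption mid-write cannot leave a corrupt checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)

def load_checkpoint(path: str = CKPT) -> dict:
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0}  # no checkpoint yet: fresh start

state = load_checkpoint()
for step in range(state["step"], 10):  # 10 stands in for the real step count
    # ... one unit of training work would happen here ...
    state["step"] = step + 1
    save_checkpoint(state)  # checkpoint frequency is a tuning decision
```

After a preemption, rerunning the same script resumes from the last saved step instead of restarting from zero.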
Committed Use Discounts (CUDs)
GCP offers 1 and 3-year commitment discounts:
1-year CUD (a3-highgpu-8g):
- Rate: ~$59.29/hour
- Discount: ~33% off on-demand
3-year CUD (a3-highgpu-8g):
- Rate: ~$34.51/hour
- Discount: ~61% off on-demand
Monthly cost at 3-year rate: $34.51 × 730 hours = $25,192/month for 8x H100s.
GCP's 3-year CUD (~$34.51/hr) brings H100 costs down significantly, but the 1-year CUD (~$59.29/hr) is undercut by both AWS 1-year reserved ($55.04/hr) and CoreWeave on-demand ($49.24/hr), so shorter commitments favor those options.
On-Demand Pricing Comparison
A100 SXM (Data-parallel training, mid-scale)
| Provider | Instance | Per-GPU Cost | Full Node Cost |
|---|---|---|---|
| AWS | p4d.24xlarge (8x A100 SXM) | $2.75 | $21.96 |
| Azure | ND96asr_v4 (8x A100 SXM) | $3.56 | $28.50 |
| GCP | a2-highgpu-8g (8x A100) | $4.40 | $35.20 |
| CoreWeave | 8x A100 SXM | $2.70 | $21.60 |
Winner for A100: CoreWeave at $2.70/GPU ($21.60 total) narrowly beats AWS p4d ($2.75/GPU, $21.96 total). Azure and GCP are significantly more expensive per GPU.
H100 SXM (LLM training, large-scale)
| Provider | Instance | Per-GPU Cost | Full Node Cost |
|---|---|---|---|
| AWS | p5.48xlarge (8x H100 SXM) | $12.29 | $98.32 |
| Azure | ND H100 v5 (8x H100) | $11.06 | $88.49 |
| GCP | a3-highgpu-8g (8x H100) | $11.06 | $88.49 |
| CoreWeave | 8x H100 SXM | $6.16 | $49.24 |
Winner for H100 on-demand: CoreWeave is the cheapest at $6.16/GPU ($49.24 for 8x). Among hyperscalers, Azure and GCP are comparable (~$11/GPU), both cheaper than AWS on-demand ($12.29/GPU).
T4 GPU (Inference, development)
| Provider | Instance | Per-GPU Cost | Full Node Cost |
|---|---|---|---|
| AWS | g4dn.xlarge (T4) | $0.35 | $0.35 |
| Azure | NV6 (T4) | $0.90 | $0.90 |
| GCP | n1-standard-4 (T4) | $0.35 | $0.35 |
Winner for T4: AWS and GCP tied at $0.35. Azure is 2.5x more expensive.
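Comparisons like these reduce to picking the minimum of a rate table. A small sketch, with the per-GPU rates copied from the tables above:

```python
def cheapest(per_gpu_rates: dict) -> tuple:
    """Return the (provider, rate) pair with the lowest per-GPU hourly rate."""
    return min(per_gpu_rates.items(), key=lambda kv: kv[1])

# On-demand per-GPU rates from the comparison tables above (March 2026).
a100 = {"AWS": 2.75, "Azure": 3.56, "GCP": 4.40, "CoreWeave": 2.70}
h100 = {"AWS": 12.29, "Azure": 11.06, "GCP": 11.06, "CoreWeave": 6.16}
```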
Spot & Preemptible Pricing
Spot Pricing Comparison (8x H100 nodes)
| Provider | Node Type | On-Demand | Spot | Discount |
|---|---|---|---|---|
| AWS | p5.48xlarge (8x H100) | $98.32 | $29.50 | ~70% |
| Azure | ND H100 v5 (8x H100) | $88.49 | $26.55 | ~70% |
| GCP | a3-highgpu-8g (8x H100) | $88.49 | N/A | Not available |
Winner for deep discounts on H100: AWS and Azure spot pricing. GCP does not offer preemptible pricing on A3 H100 instances. For A100 workloads, GCP preemptible is available at ~70% off.
Risk & Interruption Rates
AWS spot: roughly 1-5% interruption annually, varying by instance type, region, and current capacity. Scarce types such as p4d have smaller spot pools, so their interruption behavior is harder to predict.
GCP preemptible: VMs are always terminated after 24 hours of runtime. Additional interruptions before that are possible but typically less frequent than AWS spot.
Azure spot: Similar to AWS, 1-5% annually.
For workloads that can handle interruptions (batch training with checkpointing), GCP preemptible offers ~70% cost savings vs. on-demand on eligible A100 instances.
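One way to weigh interruption risk against the discount is to price in the recompute. A rough model (the interruption count and lost hours below are illustrative assumptions, not measured figures):

```python
def effective_spot_rate(spot_rate: float, interrupts_per_1000h: float,
                        lost_hours_per_interrupt: float) -> float:
    """Spot rate adjusted for recompute: dollars paid per useful hour of work.

    Assumes each interruption wastes `lost_hours_per_interrupt` of paid time
    (work since the last checkpoint plus restart/setup overhead).
    """
    wasted_fraction = interrupts_per_1000h * lost_hours_per_interrupt / 1000
    return spot_rate / (1 - wasted_fraction)

# AWS p5 spot at $29.50/hr with, say, 5 interruptions per 1,000 hours and
# 2 hours lost each time: 1% waste, so the effective rate stays near $29.80/hr,
# still far below the $98.32 on-demand rate.
effective = effective_spot_rate(29.50, 5, 2)
```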
Reserved Instance Discounts
3-Year Commitment Pricing
All three hyperscalers offer 70-73% discounts for 3-year commitments on GPU instances.
| Provider | Instance | On-Demand | 3-Year Rate | Monthly Cost |
|---|---|---|---|---|
| AWS | p5.48xlarge (8x H100 SXM) | $98.32 | ~$39.33 | $28,711 |
| AWS | p4d.24xlarge (8x A100 SXM) | $21.96 | $6.59 | $4,810 |
| Azure | ND96asr_v4 (8x A100 SXM) | $28.50 | $7.89 | $5,759 |
| GCP | a3-highgpu-8g (8x H100) | $88.49 | ~$34.51 | $25,192 |
Winner for committed H100 use: GCP's 3-year CUD at ~$34.51/hr ($25,192/month), edging out AWS's 3-year reserved rate (~$39.33/hr, $28,711/month). AWS 1-year reserved ($55.04/hr, $40,179/month) costs more but requires a shorter commitment. CoreWeave on-demand at $49.24/hr ($35,945/month) is competitive without any long-term commitment.
Key insight: For H100 with long-term commitment, GCP 3-year CUD is cheapest among hyperscalers. For A100, Azure ND at $7.89/hr 3-year reserved is cheapest.
GPU Architecture Differences
NVIDIA Hopper (H100) vs Ampere (A100)
H100 advantages:
- 2x FP8 tensor throughput vs A100
- Transformer-optimized architecture
- 80GB HBM3 memory (same capacity as A100 80GB, but 3.35 TB/s bandwidth vs 2.0 TB/s)
- Better for inference and multi-billion parameter models
A100 advantages:
- Older, more stable drivers and software
- Lower per-GPU cost (GCP A100 = $4.40 vs. H100 = $11.06 on GCP)
- Sufficient for most models under 100B parameters
- Better supported in older ML frameworks
Recommendation: Use A100 if training models under 100B params. Use H100 for foundation models or inference at scale.
Memory Bandwidth
| GPU | Memory | Bandwidth | Use Case |
|---|---|---|---|
| T4 | 16GB GDDR6 | 320 GB/s | Inference, small training |
| A10G | 24GB GDDR6 | 600 GB/s | Inference, fine-tuning |
| A100 | 80GB HBM2e | 2.0 TB/s | General training |
| H100 SXM | 80GB HBM3 | 3.35 TB/s | LLM training, large batches |
T4 is suitable only for inference and models fitting in 16GB. A100 and H100 are for serious training workloads.
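A quick way to sanity-check which GPU tier a model needs is to estimate weight memory from the parameter count. A back-of-envelope sketch (it counts fp16/bf16 weights only and ignores activations, optimizer state, and KV cache, which add substantially more during training):

```python
import math

def model_vram_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate VRAM in GB for model weights alone (fp16/bf16 = 2 bytes/param)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

def min_gpus(params_billion: float, gpu_vram_gb: float = 80.0) -> int:
    """Lower bound on GPUs needed just to hold the fp16 weights."""
    return math.ceil(model_vram_gb(params_billion) / gpu_vram_gb)

# A 70B-parameter model needs ~140 GB for fp16 weights: at least two 80GB
# GPUs before any activation or optimizer memory is counted. A 7B model
# fits comfortably on one.
```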
Interconnect Quality
AWS p4d: 300 GB/s inter-GPU bandwidth (NVIDIA NVLink). Low all-reduce latency.
Azure ND: 300 GB/s inter-GPU bandwidth (similar to AWS). Comparable latency.
GCP a3: 400 GB/s inter-GPU bandwidth (newest NVLink). Best latency for distributed training.
For distributed training with 8+ GPUs, GCP's interconnect is marginally faster, but differences are small (1-5% wall-clock improvement in most cases).
Real-World Cost Scenarios
Scenario 1: One-Week LLM Fine-Tuning Project
Workload: Fine-tune a Llama 2 70B model on company data, requiring 100 H100 GPU-hours.
AWS on-demand (p5.48xlarge, 8x H100):
- 100 hours ÷ 8 GPUs = 12.5 hours of node time
- Cost: 12.5 × $98.32 = $1,229
GCP on-demand (a3-highgpu-8g, 8x H100):
- 100 hours ÷ 8 GPUs = 12.5 hours
- Cost: 12.5 × $88.49 = $1,106
Azure on-demand (ND H100 v5, 8x H100):
- 12.5 × $88.49 = $1,106
Verdict: GCP and Azure are identically priced ($1,106) for on-demand H100 fine-tuning; both are cheaper than AWS ($1,229). GCP does not offer preemptible pricing on A3 H100 instances. CoreWeave ($49.24/hr) would cost about $616 for the same workload.
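The node-hour arithmetic above generalizes to any of the scenarios in this section. A minimal sketch, using the rates quoted in this guide:

```python
def job_cost(gpu_hours: float, gpus_per_node: int, node_hourly: float) -> float:
    """Total cost of a job given its GPU-hours and the node it runs on."""
    node_hours = gpu_hours / gpus_per_node
    return node_hours * node_hourly

# Scenario 1: 100 H100 GPU-hours on 8-GPU nodes.
aws_cost = job_cost(100, 8, 98.32)   # ~$1,229 (p5.48xlarge on-demand)
gcp_cost = job_cost(100, 8, 88.49)   # ~$1,106 (a3-highgpu-8g on-demand)
```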
Scenario 2: 3-Month LLM Training Project
Workload: Train custom LLM from scratch, 10,000 H100 hours over 3 months.
AWS on-demand (p5.48xlarge, H100):
- 10,000 ÷ 8 = 1,250 node-hours
- Cost: 1,250 × $98.32 = $122,900
AWS 1-year reserved (commitment):
- 1-year rate: $55.04/hr
- Cost: 1,250 × $55.04 = $68,800
GCP on-demand (a3-highgpu-8g, H100):
- 10,000 ÷ 8 = 1,250 node-hours
- Cost: 1,250 × $88.49 = $110,613
GCP 1-year CUD (a3-highgpu-8g):
- 1-year rate: ~$59.29/hr (33% off)
- Cost: 1,250 × $59.29 = $74,113
Verdict: AWS 1-year reserved ($68,800) is the most cost-effective hyperscaler option. GCP on-demand ($110,613) and Azure on-demand ($110,613) are similar. CoreWeave ($49.24/hr × 1,250 = $61,550) is the cheapest option for this scale.
Scenario 3: Production Inference Service (Ongoing)
Workload: Serve 1M inference requests/day on 70B param model. Requires ~5 A100 GPUs continuously.
AWS on-demand (g5.2xlarge, A10G — a smaller GPU substituted for the A100):
- 5 × $1.01 (A10G on-demand) = $5.05/hour
- Monthly: $5.05 × 730 = $3,686.50
- Annual: $44,238
GCP on-demand (a2-highgpu-1g, 1 A100 = $3.67/hour, need 5):
- 5 × $3.67 = $18.35/hour
- Monthly: $18.35 × 730 = $13,396
- Annual: $160,746
GCP 3-year CUD (committed):
- 5 × $2.20/hour (3-year A100 rate) = $11.00/hour
- Monthly: $11.00 × 730 = $8,030
- Annual: $96,360
Verdict: AWS on-demand (A10G) is cheapest at $44,238/year for this inference use case. GCP 3-year commitment ($96,360) is more expensive than AWS A10G for this scenario but uses a more powerful A100 GPU. AWS spot on A10G instances would be even cheaper (~$13,000/year) but risks interruptions for production.
Multi-Cloud Strategy
Why Use Multiple Clouds?
1. Cost arbitrage: Spot pricing fluctuates. Spreading workloads across AWS and GCP reduces risk of all capacity being expensive.
2. Availability: If one cloud is capacity-constrained, the other may have cheaper rates. Especially important for H100 during high-demand periods.
3. Regional proximity: Teams in Europe may prefer Azure (strong European data center presence). APAC teams may prefer GCP.
4. Feature diversity: AWS offers managed services integration. GCP offers the deepest preemptible discounts. Azure offers best reserved rates.
Multi-Cloud Workload Distribution Example
Model training with uncertain timeline:
- 30% workload on AWS spot (for experimentation, prone to interruptions)
- 40% workload on GCP preemptible (deep discount, but requires checkpointing)
- 30% workload on AWS 3-year reserved (stable, predictable cost)
This mix reduces cost volatility and guarantees core capacity (reserved) while capturing discounts on flexible portions.
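The blended rate of such a mix is a weighted average. A sketch of the arithmetic (the rates below are placeholders pulled from earlier tables and deliberately mix GPU types, so the result illustrates the calculation, not a recommendation):

```python
def blended_hourly(mix: list) -> float:
    """Weighted-average hourly rate for a multi-cloud capacity mix.

    `mix` is a list of (share, hourly_rate) pairs; shares must sum to 1.0.
    """
    assert abs(sum(share for share, _ in mix) - 1.0) < 1e-9
    return sum(share * rate for share, rate in mix)

# Illustrative 8-GPU-node mix (placeholder rates, not a recommendation):
mix = [(0.30, 29.50),   # AWS spot (8x H100)
       (0.40, 10.56),   # GCP preemptible (8x A100)
       (0.30, 39.33)]   # AWS 3-year reserved (8x H100)
rate = blended_hourly(mix)  # ~$24.87/hr blended
```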
FAQ
Q: Which cloud has the cheapest H100? Among hyperscalers, GCP and Azure are comparable at $11.06/GPU ($88.49 for 8×H100), with AWS on-demand at $12.29/GPU ($98.32 for 8×H100). AWS 1-year reserved drops to $6.88/GPU ($55.04/hr). For specialty providers, CoreWeave at $6.16/GPU ($49.24/hr for 8x) is cheaper than hyperscalers. RunPod H100 SXM at $2.69/hr is the cheapest single-GPU option.
Q: Can I use spot pricing for production inference? Not recommended. Interruption rate is 1-5% annually, which translates to ~4-50 hours downtime per year. Most production systems can't tolerate this.
Q: What's the breakeven for reserved instances vs. spot? If spot interruption cost exceeds 30% of workload (due to restart overhead), reserved instances become cheaper. For most production workloads, reserved instances are justified.
Q: Does GCP preemptible work with distributed training? Yes, with checkpointing. Preemptible instances are terminated after at most 24 hours and can be reclaimed sooner. Code must save model checkpoints to object storage and resume from the latest one. GCP's Batch service can automate job retries, but the checkpoint/resume logic lives in your training code.
Q: Can I move workloads between AWS, Azure, and GCP? Yes, if workloads are containerized (Docker). All three support Kubernetes. Framework code (PyTorch, JAX, TensorFlow) is portable.
Q: Which cloud is best for multi-year commitments on H100? GCP's 3-year CUD (~$34.51/hr for 8x H100) is the cheapest hyperscaler commitment option, offering ~61% off on-demand. AWS 1-year reserved ($55.04/hr) is a strong alternative with shorter commitment. For A100 workloads, Azure ND A100 v4 3-year reserved is the most cost-effective hyperscaler option.
Q: Are there hidden costs (data egress, storage)? Data egress from GPU instances costs $0.01-0.12 per GB depending on destination. Storage (EBS, persistent disks) costs $0.10-0.30/GB/month. These add 10-20% overhead for intensive ML workloads. Budget accordingly.
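The overhead question above is easy to budget explicitly. A sketch of a monthly total that adds storage and egress to compute (the storage and egress rates below are illustrative values within the ranges quoted in the FAQ):

```python
def monthly_tco(compute_hourly: float, hours: float,
                storage_gb: float, storage_rate_per_gb: float,
                egress_gb: float, egress_rate_per_gb: float) -> float:
    """Monthly total cost: compute + persistent storage + data egress."""
    return (compute_hourly * hours
            + storage_gb * storage_rate_per_gb
            + egress_gb * egress_rate_per_gb)

# e.g. 8x A100 at the 3-year reserved rate ($5.93/hr, 730 hrs), 2 TB of
# storage at $0.10/GB/month, 500 GB of egress at $0.09/GB:
total = monthly_tco(5.93, 730, 2000, 0.10, 500, 0.09)  # ~$4,574/month
```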
Related Resources
- AWS GPU Pricing Guide
- Azure GPU Pricing Guide
- GCP GPU Pricing Guide
- CoreWeave GPU Pricing Alternative
- GPU Pricing Comparison
- Spot vs On-Demand GPU Pricing
Sources
- AWS EC2 Pricing. ec2instances.info/ (March 2026)
- AWS EC2 Reserved Instances. aws.amazon.com/ec2/pricing/reserved-instances/ (March 2026)
- Azure Virtual Machines Pricing. azure.microsoft.com/en-us/pricing/details/virtual-machines/windows/ (March 2026)
- Google Cloud Compute Engine Pricing. cloud.google.com/compute/all-pricing (March 2026)
- Google Cloud Committed Use Discounts. cloud.google.com/docs/cuds (March 2026)
- AWS EC2 Spot Instances History. aws.amazon.com/ec2/spot/pricing-history/ (March 2026)