Contents
- AWS vs Azure GPU Pricing: Overview
- AWS GPU Instances & Pricing
- Azure GPU Instances & Pricing
- GCP GPU Instances & Pricing
- On-Demand Pricing Comparison
- Spot & Preemptible Pricing
- Reserved Instance Discounts
- GPU Architecture Differences
- Real-World Cost Scenarios
- Multi-Cloud Strategy
- FAQ
- Related Resources
- Sources
AWS vs Azure GPU Pricing: Overview
AWS, Azure, and GCP dominate production GPU procurement, but their pricing strategies diverge significantly. AWS leads on H100 availability and spot-pricing depth; Azure competes on reserved capacity for large commitments; GCP offers preemptible pricing for fault-tolerant workloads. Understanding how GPU pricing differs across the hyperscalers (all figures as of March 2026) helps teams optimize cloud spend and avoid vendor lock-in.
The biggest cost factor is commitment level: on-demand vs. spot vs. reserved.
AWS GPU Instances & Pricing
Instance Families
AWS organizes GPUs into purpose-specific families:
P5 Family (Latest, Hopper)
- p5.48xlarge: 8x H100 SXM GPUs
- Status: Limited availability (2026)
- Focus: LLM training, large-scale ML
P4d Family (A100, Ampere)
- p4d.24xlarge: 8x A100 SXM GPUs (80GB)
- Wide availability across regions
- Primary instance for large-scale A100 training
G4 Family (T4 GPUs)
- g4dn.xlarge: 1x T4
- g4dn.2xlarge: 1x T4
- Cost-efficient for inference and small training
G5 Family (A10G GPUs)
- g5.xlarge: 1x A10G
- g5.2xlarge: 1x A10G
- Inference-optimized, 24GB VRAM per GPU
On-Demand Pricing
As of March 2026, AWS on-demand pricing (US East, N. Virginia):
| Instance | GPUs | GPU Type | Hourly Rate |
|---|---|---|---|
| p5.48xlarge | 8 | H100 SXM | $98.32 |
| p4d.24xlarge | 8 | A100 SXM | $21.96 |
| g5.2xlarge | 1 | A10G | $1.01 |
| g4dn.xlarge | 1 | T4 | $0.35 |
Per-GPU cost for p5.48xlarge (on-demand): $98.32 ÷ 8 = $12.29/hour per H100 SXM
1-year reserved pricing for p5.48xlarge: $55.04/hour ($6.88/GPU)
Per-GPU cost for p4d.24xlarge: $21.96 ÷ 8 = $2.745/hour per A100 SXM
AWS H100 on p5 instances is 2-3x more expensive per GPU than specialty providers (RunPod at $2.69/hour for H100 SXM) but includes 3,200 Gbps EFA networking, 2 TiB of RAM, and deep AWS managed services integration (IAM, VPC, CloudWatch, SageMaker, etc.).
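The per-GPU arithmetic above is simple enough to wrap in a helper. A minimal sketch, using the March 2026 on-demand rates quoted in this guide:

```python
def per_gpu_rate(node_hourly: float, gpu_count: int) -> float:
    """Hourly cost per GPU for a multi-GPU node."""
    return node_hourly / gpu_count

# Rates from the on-demand table above (US East, March 2026).
p5_per_gpu = per_gpu_rate(98.32, 8)    # H100 SXM: ~$12.29/hr
p4d_per_gpu = per_gpu_rate(21.96, 8)   # A100 SXM: ~$2.745/hr
```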
Spot Pricing
AWS spot prices fluctuate based on demand. March 2026 average spot prices (US East):
| Instance | Spot Rate | Discount |
|---|---|---|
| p5.48xlarge | $29.50 | ~70% off on-demand |
| p4d.24xlarge | $6.59 | 70% off on-demand |
| g5.2xlarge | $0.30 | 70% off on-demand |
Spot instances can be interrupted with 2-minute notice. Interruption frequency is ~1-5% annually depending on instance type.
Reserved Instances
AWS reserved instances (RIs) lock in rates for 1 or 3 years.
1-year reserved (p4d.24xlarge, 8x A100):
- Upfront: ~$96,000
- Effective hourly: $10.98 (50% discount)
3-year reserved (p4d.24xlarge, 8x A100):
- Upfront: ~$150,000
- Effective hourly: $5.93 (73% discount)
Monthly cost at 3-year rate: $5.93 × 730 hours = $4,329/month for 8x A100s.
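The reserved math above can be sketched in two small functions. This assumes an all-upfront RI (AWS also sells partial- and no-upfront variants at somewhat different effective rates); amortizing the quoted $96,000 upfront over one year of hours (~$10.96/hr) roughly reproduces the $10.98 effective rate above:

```python
def amortized_hourly(upfront: float, years: int, recurring_hourly: float = 0.0) -> float:
    """Effective hourly rate: upfront cost spread over the term,
    plus any recurring hourly charge (zero for all-upfront RIs)."""
    return upfront / (years * 8760) + recurring_hourly

def monthly_cost(hourly: float, hours_per_month: int = 730) -> float:
    """Monthly cost at a given effective hourly rate."""
    return hourly * hours_per_month

# 3-year p4d rate from above: $5.93/hr -> ~$4,329/month for 8x A100.
p4d_monthly = monthly_cost(5.93)
```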
Azure GPU Instances & Pricing
Instance Families
Azure organizes GPUs by workload:
ND A100 v4 Series (A100 training)
- Standard_ND96asr_v4: 8x A100 SXM (80GB)
- Focus: Distributed training, HPC
NC A100 v4 Series (A100, general ML)
- Standard_NC24ads_A100_v4: 1x A100 80GB
- Standard_NC96ads_A100_v4: 4x A100 80GB
NC H100 v5 Series (H100 training)
- Standard_NC40ads_H100_v5: 1x H100 80GB
- Standard_NC80adis_H100_v5: 2x H100 80GB
NV Family (T4, graphics/inference)
- Standard_NV6: 1x T4
- Standard_NV12: 2x T4
On-Demand Pricing
As of March 2026, Azure on-demand pricing (US East):
| Instance | GPUs | GPU Type | Hourly Rate |
|---|---|---|---|
| Standard_ND96asr_v4 | 8 | A100 SXM 80GB | $28.50 |
| Standard_NC96ads_A100_v4 | 4 | A100 80GB | $14.70 |
| Standard_NC80adis_H100_v5 | 2 | H100 80GB | $18.00 |
| Standard_NV6 | 1 | T4 | $0.90 |
Per-GPU cost for ND96asr_v4: $28.50 ÷ 8 = $3.56/hour per A100 SXM
Azure is more expensive than AWS on A100 ($3.56 vs. $2.75 per GPU on-demand) but cheaper on H100 ($11.06 vs. $12.29 per GPU).
Reserved Instances (Savings Plans)
Azure offers savings plans and reservations for committed use. Commitment terms:
1-year savings plan (ND96asr_v4, 8x A100):
- Rate: $15.21/hour for instance
- Discount: 47% off on-demand
3-year savings plan:
- Rate: $7.89/hour for instance
- Discount: 72% off on-demand
Monthly cost at 3-year rate: $7.89 × 730 hours = $5,759/month for 8x A100s.
Spot Pricing
Azure Spot Virtual Machines are priced similarly to AWS spot:
| Instance | Spot Rate | Discount |
|---|---|---|
| Standard_ND96asr_v4 | $8.55 | 70% off on-demand |
| Standard_NC80adis_H100_v5 | $5.40 | 70% off on-demand |
Spot termination risk is comparable to AWS (1-5% annually).
GCP GPU Instances & Pricing
Instance Families
GCP organizes GPUs into machine types:
A2 VMs (High Memory)
- a2-highgpu-8g: 8x A100 SXM GPUs
- a2-megagpu-16g: 16x A100 SXM GPUs
A3 VMs (Latest, Hopper)
- a3-highgpu-8g: 8x H100 GPUs
- a3-megagpu-8g: 8x H100 GPUs with higher-bandwidth networking
N1/N2 VMs (General-purpose with T4)
- n1-standard-4: Up to 4x T4 GPUs
On-Demand Pricing
As of March 2026, GCP on-demand pricing (us-central1):
| Instance | GPUs | GPU Type | Hourly Rate |
|---|---|---|---|
| a3-highgpu-8g | 8 | H100 | $88.49 |
| a2-highgpu-8g | 8 | A100 | $35.20 |
| a2-highgpu-1g | 1 | A100 | $3.67 |
| n1-standard-4 (1x T4) | 1 | T4 | $0.35 |
Per-GPU cost for a3-highgpu-8g: $88.49 ÷ 8 = $11.06/hour per H100
GCP H100 on-demand pricing ($11.06/GPU, a3-highgpu-8g) matches Azure ($11.06/GPU) and is significantly cheaper than AWS ($12.29/GPU).
Preemptible VMs
GCP preemptible pricing (A3 H100 instances not eligible):
| Instance | Preemptible Rate | Discount |
|---|---|---|
| a2-highgpu-8g | $10.56 | ~70% off on-demand |
| a2-highgpu-1g | $1.10 | ~70% off on-demand |
Note: A3 (H100) instances do not support preemptible pricing on GCP.
Preemptible instances are automatically terminated after 24 hours of runtime and can be interrupted at any time before that. The ~70% discount makes them attractive for fault-tolerant batch workloads.
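Checkpoint-and-resume is what makes preemptible capacity usable. A minimal sketch of the pattern (the file name, step count, and checkpoint frequency are illustrative; real jobs write to object storage such as GCS and typically hook the provider's preemption notice):

```python
import os
import pickle

CKPT = "checkpoint.pkl"  # illustrative; real jobs write to object storage (GCS/S3)

def save_checkpoint(state: dict, path: str = CKPT) -> None:
    # Write to a temp file first, then atomically rename, so a
    # preemption mid-write cannot leave a corrupt checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)

def load_checkpoint(path: str = CKPT) -> dict:
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0}  # no checkpoint yet: fresh start

state = load_checkpoint()
for step in range(state["step"], 10):  # 10 stands in for the real step count
    # ... one unit of training work would happen here ...
    state["step"] = step + 1
    save_checkpoint(state)  # checkpoint frequency is a tuning decision
```

After a preemption, rerunning the same script resumes from the last saved step instead of restarting from zero.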
Committed Use Discounts (CUDs)
GCP offers 1 and 3-year commitment discounts:
1-year CUD (a3-highgpu-8g):
- Rate: ~$59.29/hour
- Discount: ~33% off on-demand
3-year CUD (a3-highgpu-8g):
- Rate: ~$34.51/hour
- Discount: ~61% off on-demand
Monthly cost at 3-year rate: $34.51 × 730 hours = $25,192/month for 8x H100s.
GCP's 3-year CUD (~$34.51/hr) brings H100 costs down significantly, but the 1-year CUD (~$59.29/hr) is undercut by both AWS 1-year reserved ($55.04/hr) and CoreWeave on-demand ($49.24/hr), so shorter commitments favor those options.
On-Demand Pricing Comparison
A100 SXM (Data-parallel training, mid-scale)
| Provider | Instance | Per-GPU Cost | Full Node Cost |
|---|---|---|---|
| AWS | p4d.24xlarge (8x A100 SXM) | $2.75 | $21.96 |
| Azure | ND96asr_v4 (8x A100 SXM) | $3.56 | $28.50 |
| GCP | a2-highgpu-8g (8x A100) | $4.40 | $35.20 |
| CoreWeave | 8x A100 SXM | $2.70 | $21.60 |
Winner for A100: CoreWeave at $2.70/GPU ($21.60 total) narrowly beats AWS p4d ($2.75/GPU, $21.96 total). Azure and GCP are significantly more expensive per GPU.
H100 SXM (LLM training, large-scale)
| Provider | Instance | Per-GPU Cost | Full Node Cost |
|---|---|---|---|
| AWS | p5.48xlarge (8x H100 SXM) | $12.29 | $98.32 |
| Azure | ND H100 v5 (8x H100) | $11.06 | $88.49 |
| GCP | a3-highgpu-8g (8x H100) | $11.06 | $88.49 |
| CoreWeave | 8x H100 SXM | $6.16 | $49.24 |
Winner for H100 on-demand: CoreWeave is the cheapest at $6.16/GPU ($49.24 for 8x). Among hyperscalers, Azure and GCP are comparable (~$11/GPU), both cheaper than AWS on-demand ($12.29/GPU).
T4 GPU (Inference, development)
| Provider | Instance | Per-GPU Cost | Full Node Cost |
|---|---|---|---|
| AWS | g4dn.xlarge (T4) | $0.35 | $0.35 |
| Azure | NV6 (T4) | $0.90 | $0.90 |
| GCP | n1-standard-4 (T4) | $0.35 | $0.35 |
Winner for T4: AWS and GCP tied at $0.35. Azure is 2.5x more expensive.
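Comparisons like these reduce to picking the minimum of a rate table. A small sketch, with the per-GPU rates copied from the tables above:

```python
def cheapest(per_gpu_rates: dict) -> tuple:
    """Return the (provider, rate) pair with the lowest per-GPU hourly rate."""
    return min(per_gpu_rates.items(), key=lambda kv: kv[1])

# On-demand per-GPU rates from the comparison tables above (March 2026).
a100 = {"AWS": 2.75, "Azure": 3.56, "GCP": 4.40, "CoreWeave": 2.70}
h100 = {"AWS": 12.29, "Azure": 11.06, "GCP": 11.06, "CoreWeave": 6.16}
```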
Spot & Preemptible Pricing
Spot Pricing Comparison (8x H100 nodes)
| Provider | Node Type | On-Demand | Spot | Discount |
|---|---|---|---|---|
| AWS | p5.48xlarge (8x H100) | $98.32 | $29.50 | ~70% |
| Azure | ND H100 v5 (8x H100) | $88.49 | $26.55 | ~70% |
| GCP | a3-highgpu-8g (8x H100) | $88.49 | N/A | Not available |
Winner for deep discounts on H100: AWS and Azure spot pricing. GCP does not offer preemptible pricing on A3 H100 instances. For A100 workloads, GCP preemptible is available at ~70% off.
Risk & Interruption Rates
AWS spot: roughly 1-5% interruption annually, varying by instance type, region, and current capacity. Scarce types such as p4d have smaller spot pools, so their interruption behavior is harder to predict.
GCP preemptible: VMs are always terminated after 24 hours of runtime. Additional interruptions before that are possible but typically less frequent than AWS spot.
Azure spot: Similar to AWS, 1-5% annually.
For workloads that can handle interruptions (batch training with checkpointing), GCP preemptible offers ~70% cost savings vs. on-demand on eligible A100 instances.
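One way to weigh interruption risk against the discount is to price in the recompute. A rough model (the interruption count and lost hours below are illustrative assumptions, not measured figures):

```python
def effective_spot_rate(spot_rate: float, interrupts_per_1000h: float,
                        lost_hours_per_interrupt: float) -> float:
    """Spot rate adjusted for recompute: dollars paid per useful hour of work.

    Assumes each interruption wastes `lost_hours_per_interrupt` of paid time
    (work since the last checkpoint plus restart/setup overhead).
    """
    wasted_fraction = interrupts_per_1000h * lost_hours_per_interrupt / 1000
    return spot_rate / (1 - wasted_fraction)

# AWS p5 spot at $29.50/hr with, say, 5 interruptions per 1,000 hours and
# 2 hours lost each time: 1% waste, so the effective rate stays near $29.80/hr,
# still far below the $98.32 on-demand rate.
effective = effective_spot_rate(29.50, 5, 2)
```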
Reserved Instance Discounts
3-Year Commitment Pricing
All three hyperscalers offer 70-73% discounts for 3-year commitments on GPU instances.
| Provider | Instance | On-Demand | 3-Year Rate | Monthly Cost |
|---|---|---|---|---|
| AWS | p5.48xlarge (8x H100 SXM) | $98.32 | ~$39.33 | $28,711 |
| AWS | p4d.24xlarge (8x A100 SXM) | $21.96 | $6.59 | $4,810 |
| Azure | ND96asr_v4 (8x A100 SXM) | $28.50 | $7.89 | $5,759 |
| GCP | a3-highgpu-8g (8x H100) | $88.49 | ~$34.51 | $25,192 |
Winner for committed H100 use: GCP's 3-year CUD at ~$34.51/hr ($25,192/month), edging out AWS's 3-year reserved rate (~$39.33/hr, $28,711/month). AWS 1-year reserved ($55.04/hr, $40,179/month) costs more but requires a shorter commitment. CoreWeave on-demand at $49.24/hr ($35,945/month) is competitive without any long-term commitment.
Key insight: For H100 with long-term commitment, GCP 3-year CUD is cheapest among hyperscalers. For A100, Azure ND at $7.89/hr 3-year reserved is cheapest.
GPU Architecture Differences
NVIDIA Hopper (H100) vs Ampere (A100)
H100 advantages:
- 2x FP8 tensor throughput vs A100
- Transformer-optimized architecture
- 80GB HBM3 memory (same capacity as A100 80GB, but 3.35 TB/s bandwidth vs 2.0 TB/s)
- Better for inference and multi-billion parameter models
A100 advantages:
- Older, more stable drivers and software
- Lower per-GPU cost (GCP A100 = $4.40 vs. H100 = $11.06 on GCP)
- Sufficient for most models under 100B parameters
- Better supported in older ML frameworks
Recommendation: Use A100 if training models under 100B params. Use H100 for foundation models or inference at scale.
Memory Bandwidth
| GPU | Memory | Bandwidth | Use Case |
|---|---|---|---|
| T4 | 16GB GDDR6 | 320 GB/s | Inference, small training |
| A10G | 24GB GDDR6 | 600 GB/s | Inference, fine-tuning |
| A100 | 80GB HBM2e | 2.0 TB/s | General training |
| H100 SXM | 80GB HBM3 | 3.35 TB/s | LLM training, large batches |
T4 is suitable only for inference and models fitting in 16GB. A100 and H100 are for serious training workloads.
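A quick way to sanity-check which GPU tier a model needs is to estimate weight memory from the parameter count. A back-of-envelope sketch (it counts fp16/bf16 weights only and ignores activations, optimizer state, and KV cache, which add substantially more during training):

```python
import math

def model_vram_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate VRAM in GB for model weights alone (fp16/bf16 = 2 bytes/param)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

def min_gpus(params_billion: float, gpu_vram_gb: float = 80.0) -> int:
    """Lower bound on GPUs needed just to hold the fp16 weights."""
    return math.ceil(model_vram_gb(params_billion) / gpu_vram_gb)

# A 70B-parameter model needs ~140 GB for fp16 weights: at least two 80GB
# GPUs before any activation or optimizer memory is counted. A 7B model
# fits comfortably on one.
```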
Interconnect Quality
AWS p4d: 300 GB/s inter-GPU bandwidth (NVIDIA NVLink). Low all-reduce latency.
Azure ND: 300 GB/s inter-GPU bandwidth (similar to AWS). Comparable latency.
GCP a3: 400 GB/s inter-GPU bandwidth (newest NVLink). Best latency for distributed training.
For distributed training with 8+ GPUs, GCP's interconnect is marginally faster, but differences are small (1-5% wall-clock improvement in most cases).
Real-World Cost Scenarios
Scenario 1: One-Week LLM Fine-Tuning Project
Workload: Fine-tune a Llama 2 70B model on company data, requiring 100 H100 GPU-hours.
AWS on-demand (p5.48xlarge, 8x H100):
- 100 hours ÷ 8 GPUs = 12.5 hours of node time
- Cost: 12.5 × $98.32 = $1,229
GCP on-demand (a3-highgpu-8g, 8x H100):
- 100 hours ÷ 8 GPUs = 12.5 hours
- Cost: 12.5 × $88.49 = $1,106
Azure on-demand (ND H100 v5, 8x H100):
- 12.5 × $88.49 = $1,106
Verdict: GCP and Azure are identically priced ($1,106) for on-demand H100 fine-tuning; both are cheaper than AWS ($1,229). GCP does not offer preemptible pricing on A3 H100 instances. CoreWeave ($49.24/hr) would cost about $616 for the same workload.
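The node-hour arithmetic above generalizes to any of the scenarios in this section. A minimal sketch, using the rates quoted in this guide:

```python
def job_cost(gpu_hours: float, gpus_per_node: int, node_hourly: float) -> float:
    """Total cost of a job given its GPU-hours and the node it runs on."""
    node_hours = gpu_hours / gpus_per_node
    return node_hours * node_hourly

# Scenario 1: 100 H100 GPU-hours on 8-GPU nodes.
aws_cost = job_cost(100, 8, 98.32)   # ~$1,229 (p5.48xlarge on-demand)
gcp_cost = job_cost(100, 8, 88.49)   # ~$1,106 (a3-highgpu-8g on-demand)
```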
Scenario 2: 3-Month LLM Training Project
Workload: Train custom LLM from scratch, 10,000 H100 hours over 3 months.
AWS on-demand (p5.48xlarge, H100):
- 10,000 ÷ 8 = 1,250 node-hours
- Cost: 1,250 × $98.32 = $122,900
AWS 1-year reserved (commitment):
- 1-year rate: $55.04/hr
- Cost: 1,250 × $55.04 = $68,800
GCP on-demand (a3-highgpu-8g, H100):
- 10,000 ÷ 8 = 1,250 node-hours
- Cost: 1,250 × $88.49 = $110,613
GCP 1-year CUD (a3-highgpu-8g):
- 1-year rate: ~$59.29/hr (33% off)
- Cost: 1,250 × $59.29 = $74,113
Verdict: AWS 1-year reserved ($68,800) is the most cost-effective hyperscaler option. GCP on-demand ($110,613) and Azure on-demand ($110,613) are similar. CoreWeave ($49.24/hr × 1,250 = $61,550) is the cheapest option for this scale.
Scenario 3: Production Inference Service (Ongoing)
Workload: Serve 1M inference requests/day on 70B param model. Requires ~5 A100 GPUs continuously.
AWS on-demand (g5.2xlarge, A10G — a smaller GPU substituted for the A100):
- 5 × $1.01 (A10G on-demand) = $5.05/hour
- Monthly: $5.05 × 730 = $3,686.50
- Annual: $44,238
GCP on-demand (a2-highgpu-1g, 1 A100 = $3.67/hour, need 5):
- 5 × $3.67 = $18.35/hour
- Monthly: $18.35 × 730 = $13,396
- Annual: $160,746
GCP 3-year CUD (committed):
- 5 × $2.20/hour (3-year A100 rate) = $11.00/hour
- Monthly: $11.00 × 730 = $8,030
- Annual: $96,360
Verdict: AWS on-demand (A10G) is cheapest at $44,238/year for this inference use case. GCP 3-year commitment ($96,360) is more expensive than AWS A10G for this scenario but uses a more powerful A100 GPU. AWS spot on A10G instances would be even cheaper (~$13,000/year) but risks interruptions for production.
Multi-Cloud Strategy
Why Use Multiple Clouds?
1. Cost arbitrage: Spot pricing fluctuates. Spreading workloads across AWS and GCP reduces risk of all capacity being expensive.
2. Availability: If one cloud is capacity-constrained, the other may have cheaper rates. Especially important for H100 during high-demand periods.
3. Regional proximity: Teams in Europe may prefer Azure (strong European data center presence). APAC teams may prefer GCP.
4. Feature diversity: AWS offers managed services integration. GCP offers the deepest preemptible discounts. Azure offers best reserved rates.
Multi-Cloud Workload Distribution Example
Model training with uncertain timeline:
- 30% workload on AWS spot (for experimentation, prone to interruptions)
- 40% workload on GCP preemptible (deep discount, but requires checkpointing)
- 30% workload on AWS 3-year reserved (stable, predictable cost)
This mix reduces cost volatility and guarantees core capacity (reserved) while capturing discounts on flexible portions.
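The blended rate of such a mix is a weighted average. A sketch of the arithmetic (the rates below are placeholders pulled from earlier tables and deliberately mix GPU types, so the result illustrates the calculation, not a recommendation):

```python
def blended_hourly(mix: list) -> float:
    """Weighted-average hourly rate for a multi-cloud capacity mix.

    `mix` is a list of (share, hourly_rate) pairs; shares must sum to 1.0.
    """
    assert abs(sum(share for share, _ in mix) - 1.0) < 1e-9
    return sum(share * rate for share, rate in mix)

# Illustrative 8-GPU-node mix (placeholder rates, not a recommendation):
mix = [(0.30, 29.50),   # AWS spot (8x H100)
       (0.40, 10.56),   # GCP preemptible (8x A100)
       (0.30, 39.33)]   # AWS 3-year reserved (8x H100)
rate = blended_hourly(mix)  # ~$24.87/hr blended
```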
FAQ
Q: Which cloud has the cheapest H100? Among hyperscalers, GCP and Azure are comparable at $11.06/GPU ($88.49 for 8×H100), with AWS on-demand at $12.29/GPU ($98.32 for 8×H100). AWS 1-year reserved drops to $6.88/GPU ($55.04/hr). For specialty providers, CoreWeave at $6.16/GPU ($49.24/hr for 8x) is cheaper than hyperscalers. RunPod H100 SXM at $2.69/hr is the cheapest single-GPU option.
Q: Can I use spot pricing for production inference? Not recommended. Interruption rate is 1-5% annually, which translates to ~4-50 hours downtime per year. Most production systems can't tolerate this.
Q: What's the breakeven for reserved instances vs. spot? If spot interruption cost exceeds 30% of workload (due to restart overhead), reserved instances become cheaper. For most production workloads, reserved instances are justified.
Q: Does GCP preemptible work with distributed training? Yes, with checkpointing. Preemptible instances are terminated after at most 24 hours and can be reclaimed sooner. Code must save model checkpoints to object storage and resume from the latest one. GCP's Batch service can automate job retries, but the checkpoint/resume logic lives in your training code.
Q: Can I move workloads between AWS, Azure, and GCP? Yes, if workloads are containerized (Docker). All three support Kubernetes. Framework code (PyTorch, JAX, TensorFlow) is portable.
Q: Which cloud is best for multi-year commitments on H100? GCP's 3-year CUD (~$34.51/hr for 8x H100) is the cheapest hyperscaler commitment option, offering ~61% off on-demand. AWS 1-year reserved ($55.04/hr) is a strong alternative with shorter commitment. For A100 workloads, Azure ND A100 v4 3-year reserved is the most cost-effective hyperscaler option.
Q: Are there hidden costs (data egress, storage)? Data egress from GPU instances costs $0.01-0.12 per GB depending on destination. Storage (EBS, persistent disks) costs $0.10-0.30/GB/month. These add 10-20% overhead for intensive ML workloads. Budget accordingly.
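The overhead question above is easy to budget explicitly. A sketch of a monthly total that adds storage and egress to compute (the storage and egress rates below are illustrative values within the ranges quoted in the FAQ):

```python
def monthly_tco(compute_hourly: float, hours: float,
                storage_gb: float, storage_rate_per_gb: float,
                egress_gb: float, egress_rate_per_gb: float) -> float:
    """Monthly total cost: compute + persistent storage + data egress."""
    return (compute_hourly * hours
            + storage_gb * storage_rate_per_gb
            + egress_gb * egress_rate_per_gb)

# e.g. 8x A100 at the 3-year reserved rate ($5.93/hr, 730 hrs), 2 TB of
# storage at $0.10/GB/month, 500 GB of egress at $0.09/GB:
total = monthly_tco(5.93, 730, 2000, 0.10, 500, 0.09)  # ~$4,574/month
```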
Related Resources
- AWS GPU Pricing Guide
- Azure GPU Pricing Guide
- GCP GPU Pricing Guide
- CoreWeave GPU Pricing Alternative
- GPU Pricing Comparison
- Spot vs On-Demand GPU Pricing
Sources
- AWS EC2 Pricing. ec2instances.info/ (March 2026)
- AWS EC2 Reserved Instances. aws.amazon.com/ec2/pricing/reserved-instances/ (March 2026)
- Azure Virtual Machines Pricing. azure.microsoft.com/en-us/pricing/details/virtual-machines/windows/ (March 2026)
- Google Cloud Compute Engine Pricing. cloud.google.com/compute/all-pricing (March 2026)
- Google Cloud Committed Use Discounts. cloud.google.com/docs/cuds (March 2026)
- AWS EC2 Spot Instances History. aws.amazon.com/ec2/spot/pricing-history/ (March 2026)