Google Cloud GPU Pricing: Complete Guide for Every GPU (March 2026)

Deploybase · April 15, 2025 · GPU Pricing

Google Cloud GPU Pricing Overview

Google Cloud offers diverse GPU and TPU options optimized for different ML workloads. Pricing is complex, and specialized providers often undercut it on raw compute.

As of March 2026, Google Cloud remains premium for raw GPU compute. However, TPU pricing is exceptional for TensorFlow workloads. The platform excels when using TensorFlow, AutoML, and integrated services.

This guide covers GPU pricing, TPU alternatives, and cost optimization strategies.

Google Cloud GPU Families

  • A2 instances: NVIDIA A100 (40-80GB), 1-16 per instance
  • A3 instances: NVIDIA H100 (80GB), 8-16 per instance (new)
  • G2 instances: NVIDIA L4 (24GB), 1-8 per instance
  • N1/N2 instances with GPU: mixed CPU + GPU options

A2 dominates for professional work. A3 is emerging but supply is limited. G2 handles inference. N1/N2 offer flexibility.

Pricing Structure

Google Cloud bills by the second with a 1-minute minimum. Resource commitment discounts apply (1-year, 3-year).

Additional costs include:

  • Storage: $0.020-0.030/GB/month
  • Egress: $0.12/GB to internet
  • IP addresses: $3.60/month if static

Plan for these ancillary costs. GPU hourly rates don't tell the whole story.
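The ancillary charges above can be folded into a quick estimate. A minimal sketch in Python using the article's approximate rates; the `monthly_cost` helper is illustrative, not a Google Cloud API:

```python
# Rough monthly cost sketch: GPU time plus the ancillary charges listed
# above. Rates are the article's approximate figures, not quoted prices.

GPU_RATE = 3.67        # $/hr, a2-highgpu-1g on-demand (approximate)
STORAGE_RATE = 0.020   # $/GB/month, standard storage
EGRESS_RATE = 0.12     # $/GB egress to internet
STATIC_IP = 3.60       # $/month for a static IP

def monthly_cost(gpu_hours, storage_gb, egress_gb, static_ip=True):
    """Estimate a month's bill: compute + storage + egress + optional IP."""
    cost = gpu_hours * GPU_RATE
    cost += storage_gb * STORAGE_RATE
    cost += egress_gb * EGRESS_RATE
    if static_ip:
        cost += STATIC_IP
    return round(cost, 2)

# 100 GPU-hours, 500 GB stored, 50 GB downloaded in a month:
print(monthly_cost(100, 500, 50))  # compute is ~$367; extras add ~$20
```

For compute-heavy months the GPU line dominates, but storage and egress grow with the project and are easy to forget.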

GPU Instance Pricing by Type

A2 Instances (A100 GPUs)

a2-highgpu-1g (1x A100 40GB):

  • On-demand: ~$3.67/hr (GPU + machine type combined)
  • Memory: 85GB RAM
  • CPUs: 12 vCPUs

a2-highgpu-8g (8x A100 40GB):

  • On-demand: ~$4.40/hr per GPU
  • Total: ~$35.20/hr for 8 GPUs
  • Memory: 680GB RAM
  • CPUs: 96 vCPUs

8-GPU instances are expensive at $4.40/hr per GPU. Single-GPU a2-highgpu-1g at $3.67/hr is actually competitive for developers who only need one A100.

Cost comparison (single A100):

  • Google Cloud a2-highgpu-1g: $3.67/hr
  • AWS p4d.24xlarge: $3.83/hr per GPU (8x minimum purchase)
  • RunPod: $1.39/hr
  • Vast.AI: $0.80-1.50/hr

Google Cloud allows single-GPU A100 rental without committing to 8 GPUs, unlike AWS. Specialized providers like RunPod and Vast.AI are still cheaper.

A3 Instances (H100 GPUs)

a3-highgpu-8g (8x H100 80GB):

  • On-demand: ~$11.06/hr per GPU
  • Total: ~$88.49/hr for 8 GPUs
  • Memory: 1152GB RAM
  • CPUs: 104 vCPUs

A3 pricing is among the most expensive for H100. Supply can be limited and availability fluctuates. Instances may take longer to provision than A2.

Cost comparison (per H100 GPU):

  • Google Cloud A3: $11.06/hr per GPU ($88.49/hr for 8)
  • AWS p5.48xlarge: ~$6.88/hr per GPU
  • RunPod: $2.69/hr
  • Vast.AI: $1.47-$2.00/hr

Specialized providers remain substantially cheaper than hyperscalers for H100.

G2 Instances (L4 GPUs)

g2-standard-4 (1x L4):

  • On-demand: ~$0.35/hr
  • Memory: 16GB RAM
  • CPUs: 4 vCPUs

L4 is cost-effective for inference. 24GB VRAM handles most models. Pricing is excellent for small inference workloads.

g2-standard-96 (8x L4):

  • On-demand: ~$0.35/hr per GPU
  • Total: ~$2.80/hr
  • Memory: 384GB RAM
  • CPUs: 96 vCPUs

8-GPU L4 instances scale linearly with GPU count. Still far cheaper than A2/A3, but each L4 delivers much less throughput than an A100 or H100.

TPU Pricing

Google Cloud TPUs offer exceptional performance-per-dollar for TensorFlow workloads. This is where Google Cloud shines.

TPU v5e (Latest)

TPU v5e pod (8 cores):

  • On-demand: ~$2.50/hr
  • Memory: Shared TPU architecture
  • Optimal framework: TensorFlow

TPU v5e provides exceptional value at roughly $0.31/hr per core. For comparison:

  • A100 on GCP at $3.67-4.40/hr (single to 8-GPU instances)
  • H100 on GCP at $11.06/hr per GPU (a3-highgpu-8g)

TPU v5e beats both on cost.

TPU v4e

TPU v4e pod (8 cores):

  • On-demand: ~$4.00/hr
  • Older generation, higher price per performance

Not recommended. v5e is better.

TPU v4

TPU v4 pod (8 cores):

  • On-demand: ~$8.00/hr
  • Premium for older technology

Legacy hardware. Avoid unless already deployed.

Committed Use Discounts (CUDs)

Google Cloud offers 1-year and 3-year commitments:

1-Year Commitment

A100 single (a2-highgpu-1g):

  • On-demand: $3.67/hr
  • 1-year commitment: $2.75/hr (-25%)

Annual: $2.75 × 8,760 = $24,090

3-Year Commitment

A100 single (a2-highgpu-1g):

  • On-demand: $3.67/hr
  • 3-year commitment: $2.20/hr (-40%)

3-year: $2.20 × 8,760 × 3 = $57,816

Significant savings for committed workloads.
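One way to sanity-check a commitment: it bills every hour of the term, while on-demand bills only the hours actually used, so the breakeven utilization is simply the ratio of the two rates. A sketch using the article's a2-highgpu-1g figures (the helper function is illustrative):

```python
# Breakeven utilization for a committed-use discount: a commitment bills
# every hour of the term, on-demand bills only the hours you actually run.
# Rates are the article's approximate a2-highgpu-1g figures.

ON_DEMAND = 3.67      # $/hr
CUD_1YR = 2.75        # $/hr, 1-year commitment
HOURS_PER_YEAR = 8760

def breakeven_utilization(committed_rate, on_demand_rate):
    """Fraction of hours you must run for the commitment to pay off."""
    return committed_rate / on_demand_rate

util = breakeven_utilization(CUD_1YR, ON_DEMAND)
print(f"breakeven: {util:.0%}")  # commitment wins above ~75% utilization

# Annual cost at 60% utilization: on-demand comes out cheaper.
print(f"on-demand: ${0.60 * HOURS_PER_YEAR * ON_DEMAND:,.0f}")
print(f"committed: ${HOURS_PER_YEAR * CUD_1YR:,.0f}")
```

Below roughly 75% utilization, the 1-year commitment costs more than just paying on-demand for the hours used.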

TPU CUDs

TPU v5e pod (1-year):

  • On-demand: $2.50/hr
  • 1-year commitment: $1.75/hr (-30%)

TPU commitments carry steeper discounts than GPU commitments at the same term (30% vs 25% at 1 year).

Cost Optimization Strategies

1. Use TPU for TensorFlow

TPU v5e at $2.50/hr is exceptional value for TensorFlow. If developers can use TPU, do it. Cost per compute unit is lowest available.

2. Right-size for single workloads

Single A100 on a2-highgpu-1g ($3.67/hr) avoids paying for 8 GPUs when only one is needed. If developers don't need parallelism, buy minimal resources.

3. Commit for predictable workloads

3-year commitments save 40%. If training runs continuously, commit upfront. One-off work? Use hourly.

4. Use Preemptible VMs

Roughly 60-70% discount for interruptible instances. Good for development, testing, and checkpointed training. Not for production serving.
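Whether the preemptible discount survives interruptions depends on how much work gets redone after each one. A rough model, assuming a ~70% discount and purely illustrative interruption figures:

```python
# Sketch: does a ~70% preemptible discount survive restart overhead?
# Each interruption costs lost work since the last checkpoint.
# All figures below are illustrative assumptions, not measured values.

def preemptible_cost(work_hours, rate, discount, interruptions, lost_per_interrupt):
    """Total cost of a checkpointed job on preemptible capacity."""
    billed_hours = work_hours + interruptions * lost_per_interrupt
    return billed_hours * rate * (1 - discount)

ON_DEMAND = 3.67  # $/hr, a2-highgpu-1g (approximate)

# 24h training run, 3 interruptions, 0.5h of redone work each time:
spot = preemptible_cost(24, ON_DEMAND, 0.70, 3, 0.5)
full = 24 * ON_DEMAND
print(f"preemptible: ${spot:.2f}  on-demand: ${full:.2f}")
```

With modest checkpoint overhead the discount holds up easily; the math only turns against preemptible when interruptions are frequent and checkpoints are far apart.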

5. Keep data regional

Egress to internet is $0.12/GB. Download 1TB? $120 cost. Keep data in Google Cloud Storage.

6. Batch processing

Process multiple jobs sequentially on one instance. Reduces startup overhead. Cost per job drops.
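The effect can be sketched as simple amortization, assuming a hypothetical 15-minute startup overhead per instance:

```python
# Amortizing instance startup across sequential jobs: per-job cost falls
# toward rate * job_hours as the batch grows. Overhead figure is assumed.

def cost_per_job(num_jobs, job_hours, rate, startup_hours=0.25):
    """Billing for one startup plus num_jobs sequential jobs, per job."""
    total = (startup_hours + num_jobs * job_hours) * rate
    return total / num_jobs

RATE = 3.67  # $/hr, single A100 (article's approximate figure)
for n in (1, 4, 16):
    print(f"{n:>2} jobs: ${cost_per_job(n, 0.5, RATE):.2f}/job")
```

Running 16 half-hour jobs back to back pays the startup cost once instead of 16 times.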

7. Monitor and optimize

Google Cloud has excellent cost monitoring tools. Set up alerts. Kill unused resources.

Regional Pricing

Google Cloud pricing varies by region:

  • us-central1: ~$3.67/hr for A100 (a2-highgpu-1g)
  • us-east1: ~$3.67/hr for A100
  • europe-west1: ~$4.33/hr for A100 (+18%)
  • asia-southeast1: ~$5.32/hr for A100 (+45%)

US regions are cheapest. Asia-Pacific is expensive. For non-latency-critical work, stick with US regions.

Real-World Usage Examples

Single A100 inference (1000 requests):

  • Google Cloud (a2-highgpu-1g): 30 min × $3.67/hr = $1.84
  • AWS p4d (8x, per GPU): 30 min × $3.83/hr = $1.91
  • Vast.AI: 30 min × $1.00/hr = $0.50

Google Cloud allows single-GPU rental; AWS p4d requires 8 GPUs. Vast.AI is cheapest overall.

8-GPU training (24 hours):

  • Google Cloud A2 (a2-highgpu-8g): 24 × $35.20 = $844
  • AWS p4d.24xlarge: 24 × $30.60 = $734
  • Vast.AI (8x H100): 24 × $20 = $480

AWS and Vast.AI are cheaper for multi-GPU. Google Cloud's A2 costs more because its per-GPU rate is higher on the 8-GPU shape than on the single-GPU shape.

TensorFlow training (24 hours):

  • Google Cloud TPU v5e (8 cores): 24 × $2.50 = $60
  • Google Cloud A100 (single): 24 × $3.67 = $88
  • AWS p4d.24xlarge: 24 × $30.60 = $734

TPU dominates on cost for TensorFlow workloads. Single A100 on GCP is affordable. AWS 8-GPU instance far more expensive.

Google Cloud vs Competitors

Single A100 hourly:

  • Google Cloud (a2-highgpu-1g): $3.67
  • AWS p4d.24xlarge: $3.83/GPU (8x minimum)
  • Azure ND96asr_v4: $3.56/GPU (8x minimum)
  • RunPod: $1.39
  • Vast.AI: $0.80-1.50

Google Cloud allows single-GPU A100 rental without buying an 8-GPU node. Still more expensive than specialized providers.

8-GPU A100 instance hourly:

  • Google Cloud A2 (a2-highgpu-8g): $35.20
  • AWS p4d.24xlarge: $30.60
  • Azure ND96asr_v4: $28.50
  • Vast.AI (8x): ~$11.20

AWS and Azure are cheaper among the hyperscalers for multi-GPU. Vast.AI is significantly cheaper overall.

TPU v5e pod (8 cores):

  • Google Cloud: $2.50
  • Equivalent GPU: $20-30/hr
  • Google Cloud advantage: 8-12x cheaper

TPU is where Google Cloud dominates.

Data Transfer Costs

Google Cloud egress pricing is high:

  • To internet: $0.12/GB
  • Between regions: $0.02/GB
  • Within region: Free
  • To other Google services: Free

Downloading 10GB costs $1.20. Keeping data in Google Cloud is critical for cost management.
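A small lookup makes the tiers above concrete; the function and rate table are just a sketch of the prices listed here:

```python
# Egress cost by destination, using the tiers listed above (approximate).

EGRESS_RATES = {          # $/GB
    "internet": 0.12,
    "cross_region": 0.02,
    "same_region": 0.0,
    "google_service": 0.0,
}

def egress_cost(gb, destination):
    """Transfer cost in dollars for gb gigabytes to a destination class."""
    return gb * EGRESS_RATES[destination]

# Pulling a 1 TiB dataset out to the internet vs moving it across regions:
print(egress_cost(1024, "internet"))      # roughly $123
print(egress_cost(1024, "cross_region"))  # roughly $20
```

The gap between internet egress and in-cloud movement is why keeping data inside Google Cloud matters so much.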

Storage Pricing

Google Cloud Storage costs vary by class:

  • Standard: $0.020/GB/month (~$20/TB/month)
  • Nearline: $0.010/GB/month (~$10/TB/month)
  • Coldline: $0.004/GB/month (~$4/TB/month)

Use standard for active data. Nearline for monthly access. Coldline for backups.

Integration Benefits

Google Cloud's advantage beyond raw pricing:

  • Vertex AI: MLOps platform integrated with compute
  • BigQuery: Massive data warehouse for ML pipelines
  • AutoML: Pre-built ML models
  • Looker: BI and visualization

If the workflow uses these, Google Cloud's efficiency increases. Pure ML compute? Use Vast.AI.

FAQ

Is Google Cloud cheaper than AWS for GPU?

For single A100, Google Cloud allows flexible single-GPU rental at $3.67/hr while AWS requires buying an 8-GPU node. For multi-GPU A100, AWS is cheaper. For H100, GCP's a3-highgpu-8g costs ~$88.49/hr (8 GPUs) — more expensive than AWS p5 at ~$55/hr. TPU work is dramatically cheaper on Google Cloud.

Should I use TPU or GPU on Google Cloud?

Use TPU if your stack runs on TensorFlow; use GPU otherwise. TPU v5e cost is unbeatable for TensorFlow training.

Can I use Google Cloud GPUs for inference?

Yes, but it's expensive. G2 instances are cheapest. Single A100 is reasonable. Vast.AI is cheaper for production inference.

How do I minimize Google Cloud costs?

Commit for predictable workloads. Use TPU for TensorFlow. Right-size instances. Use Preemptible VMs for non-critical work. Keep data regional.

What's the best GPU for cost on Google Cloud?

Single A100 (a2-highgpu-1g) at $3.67/hr for single-GPU workloads where you'd otherwise pay for 8 GPUs on AWS/Azure. H100 on a3-highgpu-8g at ~$88.49/hr for 8-GPU training (note: expensive vs specialized providers). TPU v5e at $2.50/hr for TensorFlow. G2 (L4) instances for cost-efficient inference.

Can I switch between regions?

Yes, but data transfer between regions costs $0.02/GB. Keep workloads regional.

How do I estimate monthly costs?

(GPU hours × hourly rate) + storage + data transfer. Google Cloud's pricing calculator helps. Budget conservatively.

Sources

  • Google Cloud Compute Engine Pricing (as of March 2026)
  • Google Cloud TPU Pricing Documentation
  • Instance Type Specifications
  • Performance Benchmarks (March 2026)

Last updated: March 2026. Pricing reflects market rates as of March 22, 2026.