H100 on Paperspace: Pricing, Specs & How to Rent

Deploybase · May 22, 2025 · GPU Pricing

H100 Specs

  • 80GB HBM3 memory
  • 3.35 TB/s bandwidth (SXM), 2 TB/s (PCIe)
  • FP32: 67 TFLOPS
  • TF32: 989 TFLOPS (with sparsity)
  • FP8: 3,958 TFLOPS (with sparsity)

See the H100 specs guide for details.

Paperspace H100 Pricing

Paperspace offers H100 access through its cloud GPU platform. Exact pricing varies by machine configuration and region; standard H100 instances run between roughly $2.50 and $3.20 per hour depending on the variant and data center location.

Pricing breakdown by variant:

  • H100 PCIe: Typically $2.50-2.80/hour
  • H100 SXM: Typically $2.80-3.20/hour

Paperspace bundles storage, bandwidth, and compute into monthly plans. Users pay for active machine time, with hourly rates applied to any running instance. The platform offers both on-demand and pre-reserved capacity options.
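
For a back-of-the-envelope sense of what this hourly model costs in practice, here is a quick sketch; the rate and usage figures are illustrative assumptions, not Paperspace quotes:

```python
# Rough monthly cost estimate for an on-demand H100 (illustrative numbers only).
hourly_rate = 2.80        # $/hr, a mid-range figure from the breakdown above
hours_per_day = 8         # intermittent workload rather than 24/7
days_per_month = 30

monthly_cost = hourly_rate * hours_per_day * days_per_month
print(f"~${monthly_cost:,.0f}/month")   # -> ~$672/month
```

The same arithmetic at 24/7 usage (roughly 730 hours per month) lands in the $1,800-2,000 range discussed in the FAQ below.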

For comparative pricing analysis, explore RunPod GPU pricing and Lambda GPU pricing to understand the full market.

How to Rent

  1. Sign up at paperspace.com
  2. Select an H100 from the GPU dropdown
  3. Pick a region and machine configuration
  4. Launch the instance
  5. Connect via SSH or Jupyter

Instances come pre-installed with PyTorch, TensorFlow, and Jupyter, so they are ready for fine-tuning, inference, and training out of the box.
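
Once connected, a quick sanity check confirms the GPU is visible before launching a job. A minimal sketch, assuming the pre-installed PyTorch build:

```python
import torch

# Verify the H100 is visible to PyTorch before starting work.
assert torch.cuda.is_available(), "No CUDA device detected"
print(torch.cuda.get_device_name(0))                  # e.g. "NVIDIA H100 80GB HBM3"

props = torch.cuda.get_device_properties(0)
print(f"{props.total_memory / 1024**3:.0f} GB VRAM")  # ~80 GB on an H100
```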

Comparison with Alternatives

Multiple providers offer H100 access. Each platform has distinct pricing models and feature sets.

Provider     H100 PCIe        H100 SXM        Key Strength
Paperspace   $2.50-2.80/hr    $2.80-3.20/hr   User-friendly interface
RunPod       $1.99/hr         $2.69/hr        Lowest prices
Lambda       $2.86/hr         $3.78/hr        Production support
CoreWeave    $49.24/hr (8x)   N/A             Bulk pricing

Paperspace positions itself for users who prioritize ease of use over the lowest cost. The platform handles networking, storage, and scaling automatically, reducing operational complexity.

Performance Benchmarks

H100 performance varies by workload type. For LLM inference, a single H100 serves approximately 500-800 tokens per second depending on model size and batch size.
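
Throughput figures like these are straightforward to reproduce by timing your own decode loop. The sketch below assumes a hypothetical generate() callable that runs inference and returns the number of tokens it produced; adapt it to your serving stack:

```python
import time

def tokens_per_second(generate, *args, **kwargs) -> float:
    """Time one generation call and return its decode throughput.

    `generate` is any callable that runs inference and returns the
    number of tokens it emitted (hypothetical; adapt to your stack).
    """
    start = time.perf_counter()
    n_tokens = generate(*args, **kwargs)
    return n_tokens / (time.perf_counter() - start)
```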

Training benchmarks show:

  • 7B parameter models: 1,200-1,400 tokens/sec
  • 13B parameter models: 900-1,100 tokens/sec
  • 70B parameter models: 250-400 tokens/sec

These metrics depend heavily on precision (FP32, TF32, or FP8), batch size, and sequence length. Mixed precision training yields substantial speedups over full FP32 approaches.
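
As one concrete illustration, a minimal bf16 autocast training step in PyTorch might look like the following. This is a sketch with a toy model, not a tuned training loop; the H100 supports bf16 natively, so no loss scaling is needed:

```python
import torch
from torch import nn

model = nn.Linear(4096, 4096).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(32, 4096, device="cuda")
target = torch.randn(32, 4096, device="cuda")

# Matmuls inside the context run in bf16; parameters and gradients stay fp32.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    loss = nn.functional.mse_loss(model(x), target)

loss.backward()
optimizer.step()
optimizer.zero_grad()
```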

Memory efficiency varies by framework. PyTorch with gradient checkpointing reduces memory usage by 30-40% at the cost of modest compute overhead.
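
A minimal sketch of that trade-off with torch.utils.checkpoint, which recomputes activations during the backward pass instead of storing them (the toy MLP stack below is illustrative, standing in for a transformer):

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint_sequential

# Toy 8-block stack standing in for a transformer.
blocks = nn.Sequential(
    *[nn.Sequential(nn.Linear(4096, 4096), nn.GELU()) for _ in range(8)]
).cuda()
x = torch.randn(16, 4096, device="cuda", requires_grad=True)

# Only activations at the 4 segment boundaries are kept; the rest are
# recomputed during backward, trading extra compute for lower memory.
out = checkpoint_sequential(blocks, 4, x, use_reentrant=False)
out.sum().backward()
```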

FAQ

How much does the H100 cost on Paperspace per month? Running an H100 PCIe instance 24/7 (roughly 730 hours at $2.50-2.80/hr) costs approximately $1,800-2,000 per month. Most users run intermittent workloads, so actual costs come in lower.

Can I use a single H100 for multi-user inference? Yes, with careful batching and request queuing, one H100 handles 10-20 concurrent users depending on model size.
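
A toy sketch of the request-queuing idea follows; the names, batch cap, and timeout are illustrative, and a production deployment would typically use a dedicated serving framework instead:

```python
import queue

MAX_BATCH = 8      # illustrative cap on prompts per forward pass
WAIT_S = 0.05      # how long to wait for stragglers before flushing

incoming: queue.Queue = queue.Queue()

def next_batch() -> list:
    """Block for one request, then gather more until the batch fills or times out."""
    batch = [incoming.get()]
    try:
        while len(batch) < MAX_BATCH:
            batch.append(incoming.get(timeout=WAIT_S))
    except queue.Empty:
        pass  # timed out waiting: serve a partial batch
    return batch
```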

Does Paperspace offer reserved instances? Yes, discounts apply to pre-reserved monthly or annual commitments.

What operating systems does Paperspace support? Ubuntu Linux is standard. Windows is available but less common for ML workloads.

How is data transfer charged? Paperspace charges for outbound data transfer. Inbound typically runs free or at reduced rates.

Sources

  • NVIDIA H100 Tensor Core GPU Specifications (official NVIDIA documentation)
  • Paperspace Cloud GPU Pricing Documentation
  • MLPerf Benchmarks for GPU Accelerators