H100 on Lambda Labs: Pricing, Specs, and How to Rent

Deploybase · February 5, 2026 · GPU Pricing

Renting H100 on Lambda Labs

Lambda Labs: $2.86/hr (PCIe) or $3.78/hr (SXM). Includes 256GB RAM, 32 vCPU, managed networking. No per-second billing.

H100 GPU Specifications

  • H100 PCIe: 80GB HBM2e memory, 350W TDP, single PCIe 5.0 connection
  • H100 SXM: 80GB HBM3 memory, 700W TDP, NVLink-capable socket

Lambda Labs offers both variants across different regions. SXM variants are available in premium pricing tiers.

An H100's 80GB of memory fits models up to roughly 35-40B parameters in FP16. 70B-class language models require 8-bit or 4-bit quantization (or a second GPU), and quantized models exceeding 100B parameters are feasible only at aggressive 4-bit precision.
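
A quick way to sanity-check whether a model fits in 80GB of GPU memory. The 20% overhead factor for activations and KV cache is a rough assumption for illustration, not an NVIDIA figure:

```python
def model_memory_gb(params_billion: float, bits_per_param: int,
                    overhead: float = 1.2) -> float:
    """Approximate inference footprint: weights plus ~20% overhead for
    activations and KV cache (the overhead factor is a rough assumption)."""
    weight_gb = params_billion * bits_per_param / 8  # 1B params @ 8 bits = 1 GB
    return weight_gb * overhead

print(round(model_memory_gb(70, 16), 1))  # 168.0 -- FP16 70B needs multiple H100s
print(round(model_memory_gb(70, 4), 1))   # 42.0  -- 4-bit 70B fits in 80GB
```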

Lambda Labs Pricing Structure

  • H100 PCIe on Lambda Labs: $2.86 per hour ($68.64/day, ~$2,088/month for 24/7 operation)
  • H100 SXM on Lambda Labs: $3.78 per hour ($90.72/day, ~$2,759/month for 24/7 operation)

Monthly figures assume a 730-hour month (8,760 hours/year ÷ 12).

Includes: SSD storage, networking, support, PyTorch/TensorFlow pre-installed.

No per-second billing. Lambda charges hourly. This model suits sustained workloads over hours or longer.
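
As a sketch of what hourly granularity means for short jobs. Whether Lambda rounds partial hours up is an assumption here (the article states only that billing is hourly), so treat the rounding as illustrative:

```python
import math

H100_PCIE_RATE = 2.86  # $/hour, from Lambda's published pricing

def lambda_cost(hours_used: float, hourly_rate: float = H100_PCIE_RATE) -> float:
    """Estimate cost under hourly billing, assuming partial hours round up."""
    return math.ceil(hours_used) * hourly_rate

# A 12-minute experiment is billed as a full hour under this assumption:
print(lambda_cost(0.2))             # 2.86
# A 10-hour fine-tuning run:
print(round(lambda_cost(10), 2))    # 28.6
```

A per-minute biller like RunPod would charge that 12-minute job about $0.57 at the same rate; for sustained multi-hour workloads the rounding difference is negligible.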

Instance Configurations

Single H100 instances are available in all regions. Multi-GPU pods require contacting sales.

Standard H100 PCIe instance:

  • 1x H100 PCIe GPU
  • 256GB system memory
  • 24GB NVMe storage
  • 32 core CPU
  • 300Mbps network

This configuration handles single-model inference, fine-tuning, and experimentation.

How to Rent H100 on Lambda Labs

  1. Visit lambdalabs.com and create account
  2. Select H100 instance type (PCIe or SXM)
  3. Choose region (US, EU options available)
  4. Launch the instance from the dashboard (an AWS EC2-like interface)
  5. SSH into instance
  6. Install model and dependencies
  7. Run inference or training workload

Setup time: 5-15 minutes after launching instance.
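
The console steps above can also be driven programmatically through Lambda's Cloud API (see "Documented API and SDK support" below). The endpoint path, field names, and instance-type string in this sketch follow that API as best understood; verify them against Lambda's current API documentation before relying on them:

```python
import requests  # third-party; pip install requests

# Endpoint path, field names, and the instance-type string below are
# assumptions -- check Lambda's API reference for the current schema.
API_BASE = "https://cloud.lambdalabs.com/api/v1"

def build_launch_request(instance_type: str, region: str, ssh_key: str) -> dict:
    """Assemble the JSON body for a single-instance launch."""
    return {
        "region_name": region,
        "instance_type_name": instance_type,
        "ssh_key_names": [ssh_key],
        "quantity": 1,
    }

def launch_instance(api_key: str, payload: dict) -> requests.Response:
    # The API key is passed as the basic-auth username (an assumption).
    return requests.post(f"{API_BASE}/instance-operations/launch",
                         json=payload, auth=(api_key, ""))

payload = build_launch_request("gpu_1x_h100_pcie", "us-west-1", "my-laptop-key")
print(payload)
# To actually launch: launch_instance(your_api_key, payload)
```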

Lambda provides pre-configured environment options:

  • PyTorch with common libraries
  • TensorFlow environments
  • Jupyter notebook servers
  • Custom container support via Docker

Performance Characteristics

Lambda H100 instances achieve consistent performance without preemption. Instance availability is backed by a 99% uptime SLA.

Single-GPU inference benchmarks on Lambda H100:

  • LLaMA 2 70B: 60 tokens/second throughput
  • Mistral 7B: 300 tokens/second
  • Stable Diffusion: 2-3 images per minute

Training throughput:

  • Fine-tuning LLaMA 2 7B: 250 examples/second
  • BERT fine-tuning: 400 sequences/second

Network latency from US regions to AWS/other cloud providers: 10-50ms depending on destination.
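
Those throughput numbers translate directly into serving cost. A small helper (assuming full, sustained utilization of the instance) converts the hourly rate into dollars per million generated tokens:

```python
def cost_per_million_tokens(hourly_rate: float, tokens_per_second: float) -> float:
    """Dollars per million generated tokens at full, sustained utilization."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate / tokens_per_hour * 1_000_000

# Using the article's benchmark numbers at the $2.86/hr PCIe rate:
print(round(cost_per_million_tokens(2.86, 60), 2))   # LLaMA 2 70B -> 13.24
print(round(cost_per_million_tokens(2.86, 300), 2))  # Mistral 7B  -> 2.65
```

Real deployments rarely sustain 100% utilization, so treat these as lower bounds on per-token cost.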

Cost Comparison vs Other Providers

  • Lambda H100 PCIe: $2.86/hour
  • RunPod H100 PCIe: $1.99/hour
  • AWS H100 (p5.48xlarge, 8x H100): $55.04/hour ($6.88/GPU)
  • CoreWeave (single H100, implied): ~$6.16/hour (derived from 8x H100 pod pricing)

Lambda's $2.86/hour pricing sits between RunPod's lowest rate and CoreWeave's cluster pricing.

Lambda's advantages:

  • Managed infrastructure (no configuration required)
  • Reliable 99% uptime SLA
  • Responsive customer support
  • Pre-configured environments

RunPod advantages:

  • Lower hourly cost ($1.99/hr PCIe vs Lambda's $2.86/hr)
  • Per-minute billing flexibility
  • No minimum commitment

Regional Availability

Lambda H100 availability:

  • US West (San Francisco region)
  • US East (Virginia region)
  • EU West (Ireland region)
  • Asia-Pacific (Singapore; check current availability)

Availability fluctuates based on demand. During high-demand periods, availability might be limited. Check current inventory before planning critical workloads.

Support and SLAs

Lambda provides:

  • Email support (typical response 2-4 hours)
  • Community Slack channel for discussions
  • 99% instance uptime SLA with credits for downtime
  • Documented API and SDK support

This support level exceeds RunPod's but remains below AWS's production support tiers.

FAQ

How does Lambda H100 compare to RunPod on cost? For H100 PCIe: RunPod charges $1.99/hr vs Lambda's $2.86/hr — RunPod is cheaper. For H100 SXM: RunPod charges $2.69/hr vs Lambda's $3.78/hr — RunPod is still cheaper. Lambda's advantage is managed infrastructure and guaranteed uptime rather than price.

Can I scale to multiple H100s on Lambda? Standard interface offers single-GPU instances. Contact sales for multi-GPU requirements. Lambda typically custom-configures 2-8 GPU clusters with negotiated pricing.

What's the minimum rental period on Lambda? Hourly billing with no minimum. Stop the instance anytime and charges stop. Reserved capacity options offer 5-15% discounts for 3- and 12-month commitments.

Does Lambda offer spot/preemptible instances? Not as of February 2026. Lambda focuses on reliable, always-on instances. For spot instances, use RunPod or AWS.

How does storage work on Lambda H100 instances? 24GB NVMe SSD local storage included. Additional storage requires EBS-like managed volumes ($0.10/GB/month) or purchasing larger instance sizes. For large model weights, local storage can become limiting.

Long-Term Cost Analysis

Lambda H100 PCIe for 12 months continuous operation: $2.86/hour × 8,760 hours = $25,054/year ($2,088/month)

Purchasing H100 hardware costs $10,000-12,000 per GPU. Straight-line depreciation over 3 years works out to $3,333-4,000 per year (roughly $278-333/month). Adding electricity, cooling, and maintenance brings total ownership cost to roughly $1,500-2,000/month.

At $2.86/hour, renting matches that $18,000-24,000/year ownership cost at roughly 6,300-8,400 GPU-hours annually (about 70-95% utilization of a single card). Teams with sustained, near-continuous usage should consider ownership.
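
The break-even arithmetic, using the article's ~$1,500-2,000/month all-in ownership estimate:

```python
RENTAL_RATE = 2.86  # Lambda H100 PCIe, $/hour

def breakeven_hours(monthly_ownership_cost: float) -> float:
    """Annual GPU-hours at which renting costs as much as owning."""
    return monthly_ownership_cost * 12 / RENTAL_RATE

print(round(breakeven_hours(1500)), round(breakeven_hours(2000)))  # 6294 8392
```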

Reserved Capacity Options

Lambda offers reserved instances for committed usage:

  • 1-month commitment: No discount
  • 3-month commitment: 5% discount ($2.71/hour)
  • 12-month commitment: 15% discount ($2.43/hour)

Long-term commitments reduce the hourly rate substantially. A 12-month H100 PCIe commitment costs about $21,287 per year ($2.43/hour × 8,760 hours) versus $25,054 on-demand, saving roughly $3,767 annually.
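
The reserved-versus-on-demand arithmetic in one place, using the rates listed above:

```python
HOURS_PER_YEAR = 8760

def annual_cost(hourly_rate: float) -> float:
    """Cost of running one instance 24/7 for a year at the given rate."""
    return hourly_rate * HOURS_PER_YEAR

on_demand = annual_cost(2.86)  # H100 PCIe on-demand
reserved = annual_cost(2.43)   # 12-month committed rate
print(round(on_demand), round(reserved), round(on_demand - reserved))
# -> 25054 21287 3767
```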

Workload Suitability Assessment

Fine-tuning language models: H100 PCIe at $2.86/hour is cost-appropriate. A typical fine-tuning job (8-16 hours) costs $23-46, which is easily justified for teams deploying custom models.

Real-time inference: H100 provides strong performance. A deployment serving 100 concurrent users at 100 tokens/second throughput justifies $2.86/hour cost.

Batch processing and overnight jobs: H100 might be overkill. An RTX 4090 on RunPod ($0.34/hour) saves roughly 88% on cost if its throughput suffices.

Research and experimentation: Lambda's support and managed infrastructure are valuable for research teams unfamiliar with infrastructure management.

Integration with Research Frameworks

Lambda provides PyTorch and TensorFlow pre-configured environments. Launch instance, activate environment, start training immediately.

Jupyter notebooks supported. Research teams can work in familiar notebook interface without terminal-level system administration.

Common datasets (ImageNet, Common Crawl subsets) available locally. Reduces data transfer costs and startup time.

Team Collaboration Features

Multiple users can access shared instances. Team members SSH into allocated instances and collaborate on training jobs.

SSH key management through Lambda dashboard. SSH into instances using managed keys without password sharing.

Support team assists with instance configuration. This hands-on support differentiates Lambda from self-service providers.

Comparison to Self-Hosted Infrastructure

Self-hosted H100 cluster costs:

  • Hardware: $10,000-12,000 per GPU
  • Facility: $5,000-10,000 per month (rack space, cooling, power)
  • Maintenance: $2,000-5,000 per month
  • Administration: 1-2 FTE engineers at $150,000/year

Total monthly cost: roughly $20,000-40,000 once staffing is included, depending on cluster size and location.

Lambda at $2.86/hour scales better for small teams. A team running 10 GPU-hours daily pays about $29/day, or roughly $858/month.

Geographic Availability and Latency

Lambda US West provides lowest latency to US-based customers. Inter-region latency to AWS is 10-50ms.

EU availability provides GDPR compliance and local latency for European customers.

Asia availability is limited. Teams requiring Asia-Pacific compute should use RunPod or AWS for better regional coverage.

Performance Consistency

Lambda provides consistent H100 performance. No "noisy neighbor" issues. Dedicated vCPU allocation ensures consistent compute.

This consistency is valuable for research where results must be reproducible. Performance variability complicates hyperparameter tuning.

Advanced Networking

Lambda supports VPC peering with AWS infrastructure. This enables smooth integration with AWS services (S3, RDS, etc).

Direct network attachment enables 10Gbps+ throughput from instances to on-premises networks.

API-based instance control enables programmatic job scheduling and multi-instance orchestration.

Model Deployment Options

Run custom inference servers (Flask, FastAPI) on Lambda instances. These servers expose REST APIs for external consumption.

Lambda doesn't provide managed inference endpoints. You manage server lifecycle and scaling manually.

For production inference APIs requiring autoscaling, consider Lambda + AWS ECS or other container orchestration.

Sources

  • Lambda Labs pricing page (accessed February 2026)
  • Lambda Labs API documentation (accessed February 2026)
  • H100 technical specifications from NVIDIA (2026)
  • Performance measurements from DeployBase.AI testing (February 2026)