Renting H100 on Lambda Labs
Lambda Labs: $2.86/hr (PCIe) or $3.78/hr (SXM). Includes 256GB RAM, 32 vCPU, managed networking. No per-second billing.
H100 GPU Specifications
- H100 PCIe: 80GB HBM2e memory, 350W power, single PCIe 5.0 connection
- H100 SXM: 80GB HBM3 memory, 700W power, NVLink-capable socket
Lambda Labs offers both variants across different regions. SXM variants are available in premium pricing tiers.
The H100's 80GB of memory fits a 70B-parameter language model quantized to 8-bit (roughly 70GB of weights), or models exceeding 100B parameters at 4-bit precision; serving a 70B model at 16-bit precision requires multiple GPUs.
Lambda Labs Pricing Structure
- H100 PCIe on Lambda Labs: $2.86 per hour ($68.64/day, about $2,088/month for 24/7 operation)
- H100 SXM on Lambda Labs: $3.78 per hour ($90.72/day, about $2,759/month for 24/7 operation)
Includes: SSD storage, networking, support, PyTorch/TensorFlow pre-installed.
No per-second billing. Lambda charges hourly. This model suits sustained workloads over hours or longer.
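A quick way to estimate session cost under this billing model. The round-up-to-the-next-hour behavior is an assumption based on the hourly billing described above; verify the exact rounding on Lambda's pricing page:

```python
import math

PCIE_RATE = 2.86  # $/hour, Lambda H100 PCIe (rates from the pricing above)
SXM_RATE = 3.78   # $/hour, Lambda H100 SXM

def session_cost(hours_used: float, rate: float) -> float:
    """Estimated cost, assuming usage is rounded up to the next full hour."""
    return math.ceil(hours_used) * rate

# A 2.2-hour fine-tuning session on PCIe is billed as 3 full hours:
print(round(session_cost(2.2, PCIE_RATE), 2))  # → 8.58
```

Under per-second billing (as on some competitors) the same session would cost about $6.29, which is why short, bursty workloads favor providers with finer-grained billing.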
Instance Configurations
Single H100 instances are available in all regions. Multi-GPU pods require contacting sales.
Standard H100 PCIe instance:
- 1x H100 PCIe GPU
- 256GB system memory
- 24GB NVMe storage
- 32 core CPU
- 300Mbps network
This configuration handles single-model inference, fine-tuning, and experimentation.
How to Rent H100 on Lambda Labs
- Visit lambdalabs.com and create account
- Select H100 instance type (PCIe or SXM)
- Choose region (US, EU options available)
- Launch the instance (the dashboard resembles AWS EC2)
- SSH into instance
- Install model and dependencies
- Run inference or training workload
Setup time: 5-15 minutes after launching instance.
Lambda provides pre-configured environment options:
- PyTorch with common libraries
- TensorFlow environments
- Jupyter notebook servers
- Custom container support via Docker
Performance Characteristics
Lambda H100 instances deliver consistent performance without preemption, and instance availability is backed by a 99% uptime SLA.
Single-GPU inference benchmarks on Lambda H100:
- LLaMA 2 70B: 60 tokens/second throughput
- Mistral 7B: 300 tokens/second
- Stable Diffusion: 2-3 images per minute
Training throughput:
- Fine-tuning LLaMA 2 7B: 250 examples/second
- BERT fine-tuning: 400 sequences/second
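These throughput numbers translate directly into serving cost. A rough per-token figure using the $2.86/hour PCIe rate and the inference benchmarks above (this assumes full, sustained utilization; real-world utilization will be lower, so treat these as floor costs):

```python
def cost_per_million_tokens(hourly_rate: float, tokens_per_second: float) -> float:
    """Dollars per 1M generated tokens at full, sustained utilization."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate / tokens_per_hour * 1_000_000

# LLaMA 2 70B at 60 tokens/second on an H100 PCIe:
print(round(cost_per_million_tokens(2.86, 60), 2))   # → 13.24
# Mistral 7B at 300 tokens/second:
print(round(cost_per_million_tokens(2.86, 300), 2))  # → 2.65
```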
Network latency from US regions to AWS/other cloud providers: 10-50ms depending on destination.
Cost Comparison vs Other Providers
- Lambda H100 PCIe: $2.86/hour
- RunPod H100 PCIe: $1.99/hour
- AWS H100 (p5.48xlarge, 8x H100): $55.04/hour ($6.88 per GPU)
- CoreWeave (single H100, implied): ~$6.16/hour (derived from 8x H100 pod pricing)
Lambda's $2.86/hour pricing sits between RunPod's lowest rate and CoreWeave's cluster pricing.
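Multi-GPU pod prices can be normalized to a per-GPU rate for an apples-to-apples comparison, using the provider rates listed above:

```python
def per_gpu_rate(pod_hourly_rate: float, gpu_count: int) -> float:
    """Effective hourly cost per GPU for a multi-GPU instance."""
    return pod_hourly_rate / gpu_count

rates = {
    "RunPod H100 PCIe": 1.99,
    "Lambda H100 PCIe": 2.86,
    "AWS p5.48xlarge": per_gpu_rate(55.04, 8),  # normalizes to $6.88/GPU-hour
}
for provider, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{provider}: ${rate:.2f}/GPU-hour")
```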
Lambda's advantages:
- Managed infrastructure (no configuration required)
- Reliable 99% uptime SLA
- Responsive customer support
- Pre-configured environments
RunPod advantages:
- Lower hourly cost ($1.99/hr PCIe vs Lambda's $2.86/hr)
- Per-minute billing flexibility
- No minimum commitment
Regional Availability
Lambda H100 availability:
- US West (San Francisco region)
- US East (Virginia region)
- EU West (Ireland region)
- Asia-Pacific (Singapore; information current as of March 2026, verify availability)
Availability fluctuates based on demand. During high-demand periods, availability might be limited. Check current inventory before planning critical workloads.
Support and SLAs
Lambda provides:
- Email support (typical response 2-4 hours)
- Community Slack channel for discussions
- 99% instance uptime SLA with credits for downtime
- Documented API and SDK support
This support level exceeds RunPod's but remains below AWS's production support tiers.
FAQ
How does Lambda H100 compare to RunPod on cost? For H100 PCIe: RunPod charges $1.99/hr vs Lambda's $2.86/hr — RunPod is cheaper. For H100 SXM: RunPod charges $2.69/hr vs Lambda's $3.78/hr — RunPod is still cheaper. Lambda's advantage is managed infrastructure and guaranteed uptime rather than price.
Can I scale to multiple H100s on Lambda? Standard interface offers single-GPU instances. Contact sales for multi-GPU requirements. Lambda typically custom-configures 2-8 GPU clusters with negotiated pricing.
What's the minimum rental period on Lambda? Hourly billing with no minimum. Stop instance anytime and charges stop. Some reserved capacity options require monthly commitments with 15-20% discounts.
Does Lambda offer spot/preemptible instances? Not as of March 2026. Lambda focuses on reliable, always-on instances. For spot instances, use RunPod or AWS.
How does storage work on Lambda H100 instances? 24GB NVMe SSD local storage included. Additional storage requires EBS-like managed volumes ($0.10/GB/month) or purchasing larger instance sizes. For large model weights, local storage can become limiting.
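A quick sanity check for whether model weights fit in the included local storage. Sizes use the standard parameters-times-bytes-per-parameter estimate; the 24GB figure is the local SSD quoted above and $0.10/GB/month is the managed-volume rate:

```python
LOCAL_SSD_GB = 24    # included NVMe storage (per the instance spec above)
VOLUME_RATE = 0.10   # $/GB/month for managed volumes

def weights_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate on-disk size of model weights in GB (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9

llama70b_fp16 = weights_gb(70e9, 2)   # 140.0 GB: far over the local SSD
mistral7b_fp16 = weights_gb(7e9, 2)   # 14.0 GB: fits locally
print(llama70b_fp16 > LOCAL_SSD_GB)             # True: needs a managed volume
print(round(llama70b_fp16 * VOLUME_RATE, 2))    # ≈ $14/month for that volume
```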
Long-Term Cost Analysis
Lambda H100 PCIe for 12 months of continuous operation: $2.86/hour × 8,760 hours ≈ $25,054/year (about $2,088/month)
Purchasing H100 hardware costs $10,000-12,000 per GPU. Straight-line depreciation over 3 years is $3,333-4,000 per year (roughly $278-333/month). Adding electricity, cooling, and maintenance brings total self-hosting cost to roughly $1,500-2,000/month.
At $2.86/hour, that $18,000-24,000 annual self-hosting cost corresponds to roughly 6,300-8,400 GPU-hours per year, about 9-11 months of continuous operation. Teams with sustained, near-continuous usage should consider ownership.
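The break-even point follows directly from dividing the annual self-hosting cost by the on-demand rate. A sketch using the $1,500-2,000/month self-hosting estimate above:

```python
ONDEMAND_RATE = 2.86  # $/hour, Lambda H100 PCIe

def breakeven_hours(self_host_monthly: float) -> float:
    """Annual GPU-hours at which renting costs as much as self-hosting."""
    return self_host_monthly * 12 / ONDEMAND_RATE

low = breakeven_hours(1500)   # ≈ 6,294 hours/year
high = breakeven_hours(2000)  # ≈ 8,392 hours/year
print(round(low), round(high))
```

Below this usage level, renting is cheaper; above it, ownership starts to pay off.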
Reserved Capacity Options
Lambda offers reserved instances for committed usage:
- 1-month commitment: No discount
- 3-month commitment: 5% discount ($2.71/hour)
- 12-month commitment: 15% discount ($2.43/hour)
Long-term commitments reduce the hourly rate substantially. A 12-month H100 PCIe commitment at $2.43/hour costs about $21,287 per year versus roughly $25,054 on-demand, saving about $3,767 annually.
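The discount arithmetic, using the rates above:

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days

def annual_cost(hourly_rate: float) -> float:
    """Cost of running one instance continuously for a year."""
    return hourly_rate * HOURS_PER_YEAR

on_demand = annual_cost(2.86)      # ≈ $25,054
reserved_12mo = annual_cost(2.43)  # ≈ $21,287 at the 15%-discount rate
print(round(on_demand - reserved_12mo))  # annual saving
```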
Workload Suitability Assessment
Fine-tuning language models: H100 PCIe at $2.86/hour is appropriate cost. A typical fine-tuning job (8-16 hours) costs $23-46. This cost is recoverable for teams deploying custom models.
Real-time inference: H100 provides strong performance. A deployment serving 100 concurrent users at 100 tokens/second aggregate throughput justifies the $2.86/hour cost.
Batch processing overnight jobs: H100 might be overkill. An RTX 4090 on RunPod ($0.34/hour) cuts cost by roughly 88% if its throughput suffices.
Research and experimentation: Lambda's support and managed infrastructure are valuable for research teams unfamiliar with infrastructure management.
Integration with Research Frameworks
Lambda provides PyTorch and TensorFlow pre-configured environments. Launch instance, activate environment, start training immediately.
Jupyter notebooks supported. Research teams can work in familiar notebook interface without terminal-level system administration.
Common datasets (ImageNet, Common Crawl subsets) available locally. Reduces data transfer costs and startup time.
Team Collaboration Features
Multiple users can access shared instances; team members SSH into allocated instances and collaborate on training jobs.
SSH key management through Lambda dashboard. SSH into instances using managed keys without password sharing.
Support team assists with instance configuration. This hands-on support differentiates Lambda from self-service providers.
Comparison to Self-Hosted Infrastructure
Self-hosted H100 cluster costs:
- Hardware: $10,000-12,000 per GPU
- Facility: $5,000-10,000 per month (rack space, cooling, power)
- Maintenance: $2,000-5,000 per month
- Administration: 1-2 FTE engineers at $150,000/year
Total monthly cost: $10,000-20,000 depending on cluster size and location.
Lambda at $2.86/hour scales better for small teams. A team running 10 GPU-hours daily pays about $28.60 per day, roughly $860/month.
Geographic Availability and Latency
Lambda US West provides lowest latency to US-based customers. Inter-region latency to AWS is 10-50ms.
EU availability provides GDPR compliance and local latency for European customers.
Asia availability is limited. Teams requiring Asia-Pacific compute should use RunPod or AWS for better regional coverage.
Performance Consistency
Lambda provides consistent H100 performance. No "noisy neighbor" issues. Dedicated vCPU allocation ensures consistent compute.
This consistency is valuable for research where results must be reproducible. Performance variability complicates hyperparameter tuning.
Advanced Networking
Lambda supports VPC peering with AWS infrastructure. This enables smooth integration with AWS services (S3, RDS, etc).
Direct network attachment enables 10Gbps+ throughput from instances to on-premises networks.
API-based instance control enables programmatic job scheduling and multi-instance orchestration.
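A minimal sketch of launching an instance programmatically. The endpoint path, field names, instance-type name, and region below follow Lambda's public Cloud API as documented at the time of writing; treat them as assumptions and verify against the current API reference. The API key and SSH key name are placeholders:

```python
import json
import urllib.request

API_BASE = "https://cloud.lambdalabs.com/api/v1"  # per Lambda's API docs

def build_launch_request(api_key: str, instance_type: str, region: str,
                         ssh_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to launch one instance."""
    payload = {
        "region_name": region,
        "instance_type_name": instance_type,
        "ssh_key_names": [ssh_key],
    }
    return urllib.request.Request(
        f"{API_BASE}/instance-operations/launch",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    req = build_launch_request("YOUR_API_KEY", "gpu_1x_h100_pcie",
                               "us-west-1", "my-ssh-key")  # placeholder values
    # urllib.request.urlopen(req)  # uncomment to actually launch (incurs cost)
```

The same pattern extends to listing and terminating instances, which is enough to schedule jobs that spin instances up, run a workload, and tear them down.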
Model Deployment Options
Run custom inference servers (Flask, FastAPI) on Lambda instances. These servers expose REST APIs for external consumption.
Lambda doesn't provide managed inference endpoints. You manage server lifecycle and scaling manually.
For production inference APIs requiring autoscaling, consider Lambda + AWS ECS or other container orchestration.
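A stdlib-only skeleton of such a custom inference server, to show the shape of the lifecycle you manage yourself. The model call is a placeholder echo; a real deployment would load the model once at startup and invoke it inside `do_POST`:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # Placeholder: a real server would call model.generate(...) here.
        reply = {"completion": "echo: " + payload.get("prompt", "")}
        body = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep stdout quiet; wire up real logging in production

# To serve: HTTPServer(("0.0.0.0", 8000), InferenceHandler).serve_forever()
```

Production servers typically use FastAPI or a dedicated inference runtime instead, but either way scaling, restarts, and health checks remain your responsibility on Lambda.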
Related Resources
- Lambda Labs GPU Pricing
- NVIDIA H100 Price
- RunPod GPU Pricing
- CoreWeave GPU Pricing
- Modal vs RunPod Serverless Comparison
Sources
- Lambda Labs pricing page (accessed March 2026)
- Lambda Labs API documentation (accessed March 2026)
- H100 technical specifications from Nvidia (2026)
- Performance measurements from DeployBase.AI testing (March 2026)