L40S on Lambda Labs: Pricing, Specs & How to Rent

Deploybase · April 21, 2025 · GPU Pricing

Overview

The L40S would be a solid fit on Lambda Labs for inference and computer vision workloads, but Lambda Labs doesn't currently advertise L40S instances (as of March 2026); the company is prioritizing A100, H100, and GH200 capacity instead. The specs, alternatives, and deployment tradeoffs are still worth knowing if you're evaluating this GPU tier.

L40S GPU Specifications

The NVIDIA L40S handles inference and data center workloads well:

  • Memory: 48GB GDDR6
  • Memory Bandwidth: 864 GB/s
  • Compute Capability: SM89
  • Peak Tensor Performance: 366 TFLOPS TF32 / 733 TFLOPS FP16 (with sparsity)
  • Peak FP32 Performance: 91.6 TFLOPS
  • Power Consumption: 350W
  • Interconnect: PCIe 4.0 only (no NVLink)

The L40S excels at:

  • Image inference (ResNet, EfficientNet, Vision Transformers)
  • Video processing and encoding
  • Stable Diffusion and diffusion model inference
  • Object detection (YOLO, Faster R-CNN)
  • Natural language processing at 7B-13B scales
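Whether a given model fits in the L40S's 48GB is a quick back-of-the-envelope calculation. The sketch below is a rough heuristic, not a profiler; the 20% overhead margin for activations and KV cache is an illustrative assumption.

```python
def fits_in_vram(params_billion: float, bytes_per_param: int = 2,
                 vram_gb: float = 48.0, overhead_frac: float = 0.2) -> bool:
    """Rough check: do model weights plus a working-memory margin fit in VRAM?

    bytes_per_param=2 assumes FP16/BF16 weights; overhead_frac reserves
    headroom for activations and KV cache (an illustrative 20%, not measured).
    """
    weights_gb = params_billion * bytes_per_param  # 1e9 params x bytes / 1e9
    return weights_gb * (1 + overhead_frac) <= vram_gb

# A 13B model in FP16 needs ~26 GB of weights, so it fits in 48 GB;
# a 34B model (~68 GB of weights) does not.
print(fits_in_vram(13))  # True
print(fits_in_vram(34))  # False
```

This is why the 7B-13B range is the sweet spot for single-L40S NLP serving: the weights fit with room left over for batching.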

Lambda Labs L40S Pricing

Lambda Labs no longer advertises L40S pricing publicly, but comparable alternatives are available:

  • Lambda Labs A10 (similar tier): $0.86/hour
  • Lambda Labs RTX A6000 (48GB memory match): $0.92/hour

For L40S rental elsewhere:

  • RunPod: L40S available at ~$0.79/hour
  • CoreWeave: Available in 8-GPU cluster at ~$2.25/GPU/hour
  • AWS EC2: Available on g6e instances at ~$1.50-2.00/GPU/hour
  • Azure: Not currently advertised

Lambda Labs has pivoted to H100 and GH200 capacity, so the L40S has become less available on their platform.

How to Rent L40S on Lambda Labs

Option 1: Contact Lambda Labs Sales

L40S may be available through direct sales engagement:

  1. Visit Lambda Labs website
  2. Use the contact sales form
  3. Specify L40S requirements and budget
  4. Receive custom pricing based on volume and duration

Option 2: Use Alternative Providers

Since Lambda Labs doesn't advertise L40S, consider:

  • RunPod: Check GPU marketplace for L40 or similar
  • Paperspace: Offers professional GPU tiers
  • Modal: Cloud platform with diverse GPU options
  • Amazon SageMaker: Pay-as-you-go inference endpoints

Step 1: Account Setup

Create an account, verify your email, and add a payment method. This takes about 5 minutes.

Step 2: Select Hardware

Browse instances. No L40S? Filter for:

  • 48GB+ memory
  • Inference workloads
  • Under $1.00/hour
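These filter criteria are easy to express in code. The catalog below is hypothetical example data for illustration, not a live listing from any provider:

```python
# Hypothetical catalog entries; real listings come from each provider's console.
INSTANCES = [
    {"name": "RTX A6000", "vram_gb": 48, "price_hr": 0.92},
    {"name": "A100 80GB", "vram_gb": 80, "price_hr": 1.48},
    {"name": "L40",       "vram_gb": 48, "price_hr": 0.69},
    {"name": "A10",       "vram_gb": 24, "price_hr": 0.86},
]

def l40s_substitutes(instances, min_vram_gb=48, max_price_hr=1.00):
    """Keep instances matching the filter criteria above: 48GB+ and under $1/hr."""
    return [i for i in instances
            if i["vram_gb"] >= min_vram_gb and i["price_hr"] <= max_price_hr]

print([i["name"] for i in l40s_substitutes(INSTANCES)])  # ['RTX A6000', 'L40']
```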

Step 3: Deploy Instance

Select desired instance and configure:

  • Ubuntu 22.04 or 24.04 OS
  • CUDA 12.x runtime
  • Python 3.10+
  • SSH key for secure access
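Once the instance is up, a quick sanity check confirms the configuration above. This sketch uses only the standard library; `nvidia-smi` ships with the NVIDIA driver on both Ubuntu versions, and the function degrades gracefully if it isn't present:

```python
import shutil
import subprocess
import sys

def check_environment(min_python=(3, 10)):
    """Report whether the instance meets the configuration above.

    Returns a dict of check-name -> bool. nvidia-smi is only probed,
    so this also runs cleanly on machines without a GPU driver.
    """
    checks = {
        "python": sys.version_info[:2] >= min_python,
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
    }
    if checks["nvidia_smi"]:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True)
        checks["gpu_visible"] = out.returncode == 0 and bool(out.stdout.strip())
    return checks

print(check_environment())
```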

Step 4: Install ML Stack

Common stacks:

  • PyTorch or TensorFlow for training
  • vLLM for serving LLMs fast
  • TensorRT for optimized inference
  • Hugging Face transformers for model loading
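A small helper can confirm which of these packages are already importable before you start installing. The module names assumed for vLLM and TensorRT (`vllm`, `tensorrt`) are their usual import names:

```python
from importlib.util import find_spec

# Package -> import name; vllm and tensorrt names are assumed conventions.
STACK = {"torch": "torch", "vllm": "vllm",
         "tensorrt": "tensorrt", "transformers": "transformers"}

def missing_packages(stack=STACK):
    """Return the stack entries that are not importable in this environment."""
    return sorted(pkg for pkg, mod in stack.items() if find_spec(mod) is None)

# pip install whatever this prints, e.g.:
#   pip install torch vllm tensorrt transformers
print(missing_packages())
```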

Performance Benchmarks

L40S performance varies by workload:

Stable Diffusion Inference

  • Model: SDXL 1.0
  • Batch size: 4
  • Throughput: 8-12 images/minute
  • Latency: 5-8 seconds per image

Llama 2 7B Inference

  • Batch size: 32
  • Throughput: 800-1,200 tokens/second
  • Latency (p50): 15-20ms
  • Memory utilization: 85%
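Throughput numbers like these translate directly into serving cost. A minimal sketch, using the benchmark's midpoint (1,000 tokens/second) and the A6000-class hourly price quoted earlier:

```python
def cost_per_million_tokens(price_per_hour: float,
                            tokens_per_second: float) -> float:
    """Serving cost in dollars per million generated tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return price_per_hour / tokens_per_hour * 1_000_000

# 1,000 tok/s on a $0.92/hr GPU works out to about $0.26 per million tokens.
print(round(cost_per_million_tokens(0.92, 1000), 3))  # 0.256
```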

Vision Transformer Classification

  • ImageNet 1K classification
  • Batch size: 256
  • Throughput: 3,200 images/second
  • Latency: 80ms
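As a sanity check, throughput should equal batch size divided by per-batch latency, and these ViT numbers line up exactly:

```python
def throughput_from_latency(batch_size: int, latency_s: float) -> float:
    """Images per second implied by per-batch latency."""
    return batch_size / latency_s

# 256 images / 80 ms = 3,200 images/second, matching the figure above.
print(round(throughput_from_latency(256, 0.080)))  # 3200
```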

Lambda Labs vs Other Providers

Comparison for inference workloads with similar memory:

| Provider | GPU | Price/hr | Memory | Availability |
| --- | --- | --- | --- | --- |
| Lambda Labs | A6000 | $0.92 | 48GB | Available |
| Lambda Labs | A100 | $1.48 | 80GB | Available |
| RunPod | L40 | $0.69 | 48GB | High |
| CoreWeave | L40S bundle | $18 (8-pack) | 384GB | Available |

Lambda Labs A6000 offers the closest direct replacement. RunPod L40 provides lower cost for single GPU rental. For computer vision applications, L40S balances cost and performance effectively.

FAQ

Why did Lambda Labs discontinue L40S availability? As demand for large language models surged, Lambda Labs prioritized H100 and GH200 capacity. L40S remains a solid inference GPU, but market demand shifted to larger models requiring more memory.

Can I use L40S for model training? Yes, but suboptimally. L40S works for training models under 13B parameters. However, H100 or A100 GPUs provide better training performance. L40S is primarily designed for inference.

What's the total cost to run a Stable Diffusion service on L40S for one month? At $0.92/hour for a comparable A6000 (or $0.69/hour for a RunPod L40), 24/7 operation (730 hours) comes to roughly $504-672 per month.
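The arithmetic behind those figures, as a one-liner (730 hours approximates one month of continuous operation):

```python
def monthly_cost(price_per_hour: float, hours: int = 730) -> float:
    """Cost of running one GPU continuously for a month (~730 hours)."""
    return price_per_hour * hours

# RunPod L40 vs Lambda Labs A6000, 24/7 for one month:
print(round(monthly_cost(0.69)))  # 504
print(round(monthly_cost(0.92)))  # 672
```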

Is L40S compatible with CUDA and PyTorch? Yes. Full CUDA 12.x compatibility and PyTorch 2.x support exist. Code portability is excellent across inference platforms.

Which provider offers the best L40S alternative right now? RunPod's L40 ($0.69/hour) matches L40S memory at lower cost. For pure L40S, contact Lambda Labs sales or check smaller providers like Paperspace.
