Contents
- Overview
- L40S GPU Specifications
- Lambda Labs L40S Pricing
- How to Rent L40S on Lambda Labs
- Performance Benchmarks
- Lambda Labs vs Other Providers
- FAQ
- Related Resources
- Sources
Overview
The L40S on Lambda Labs would be a solid choice for inference and computer vision, but Lambda Labs doesn't currently advertise the L40S (as of March 2026); its lineup emphasizes the A100, H100, and GH200 instead. The specs, alternatives, and deployment tradeoffs are still worth knowing if you're evaluating this GPU tier.
L40S GPU Specifications
The NVIDIA L40S handles inference and data center workloads well:
- Memory: 48GB GDDR6
- Memory Bandwidth: 864 GB/s
- Compute Capability: SM89
- Peak Tensor Performance: 366 TFLOPS TF32, 733 TFLOPS FP16 (with sparsity)
- Peak FP32 Performance: 91.6 TFLOPS
- Power Consumption: 350W
- Interconnect: PCIe 4.0 only (no NVLink)
The L40S excels at:
- Image inference (ResNet, EfficientNet, Vision Transformers)
- Video processing and encoding
- Stable Diffusion and diffusion model inference
- Object detection (YOLO, Faster R-CNN)
- Natural language processing at 7B-13B scales
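For the language-model workloads above, a quick way to sanity-check whether a model fits in the L40S's 48GB is the rule of thumb that weights need roughly parameter-count × bytes-per-parameter, plus headroom for activations and KV cache. A minimal sketch (the 20% overhead factor is an illustrative assumption, not a measured figure):

```python
def fits_in_vram(params_billion: float, bytes_per_param: int,
                 vram_gb: float = 48.0, overhead: float = 0.2) -> bool:
    """Rough check: do the model weights, plus an assumed overhead for
    activations and KV cache, fit in the GPU's memory?"""
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9
    return weights_gb * (1 + overhead) <= vram_gb

# A 13B model in FP16 (2 bytes/param) is ~26 GB of weights
print(fits_in_vram(13, 2))  # fits on a 48GB L40S
print(fits_in_vram(70, 2))  # a 70B FP16 model does not
```

This is why the 7B-13B range is the sweet spot quoted above: larger models either need quantization (1 byte/param or less) or more memory.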
Lambda Labs L40S Pricing
Lambda Labs no longer advertises L40S pricing publicly. However, comparable alternatives are available:
- Lambda Labs A10 (similar tier): $0.86/hour
- Lambda Labs RTX A6000 (48GB, matches L40S memory): $0.92/hour
For L40S rental elsewhere:
- RunPod: L40S available at ~$0.79/hour
- CoreWeave: Available in 8-GPU cluster at ~$2.25/GPU/hour
- AWS EC2: Available on g6e instances at ~$1.50-2.00/GPU/hour
- Azure: Not currently advertised
As Lambda Labs pivoted its capacity toward H100 and GH200, the L40S became less available on its platform.
How to Rent L40S on Lambda Labs
Option 1: Contact Lambda Labs Sales
L40S may be available through direct sales engagement:
- Visit Lambda Labs website
- Click "Request H100" or contact sales
- Specify L40S requirements and budget
- Receive custom pricing based on volume and duration
Option 2: Use Alternative Providers
Since Lambda Labs doesn't advertise L40S, consider:
- RunPod: Check GPU marketplace for L40 or similar
- Paperspace: Offers professional GPU tiers
- Modal: Cloud platform with diverse GPU options
- Amazon SageMaker: Pay-as-you-go inference endpoints
Step 1: Account Setup
Create account, verify email, add payment method. Takes 5 minutes.
Step 2: Select Hardware
Browse instances. No L40S? Filter for:
- 48GB+ memory
- Inference workloads
- Under $1.00/hour
Step 3: Deploy Instance
Select desired instance and configure:
- Ubuntu 22.04 or 24.04 OS
- CUDA 12.x runtime
- Python 3.10+
- SSH key for secure access
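After the instance boots, it's worth confirming you actually got the GPU and driver you configured. `nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader` prints one CSV line per GPU; a small hedged parser (the sample line and its values are illustrative, not guaranteed L40S output):

```python
import subprocess
from typing import Optional

def query_gpus(raw: Optional[str] = None) -> list:
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader` output.
    Pass captured text as `raw`, or leave it None to run nvidia-smi."""
    if raw is None:
        raw = subprocess.check_output(
            ["nvidia-smi",
             "--query-gpu=name,memory.total,driver_version",
             "--format=csv,noheader"],
            text=True)
    gpus = []
    for line in raw.strip().splitlines():
        name, mem, driver = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "memory": mem, "driver": driver})
    return gpus

# Parsing a captured sample line (illustrative values):
sample = "NVIDIA L40S, 46068 MiB, 550.54.15\n"
print(query_gpus(sample))
```

If the reported name or memory total doesn't match what you rented, raise it with the provider before installing anything.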
Step 4: Install ML Stack
Common stacks:
- PyTorch or TensorFlow for training
- vLLM for serving LLMs fast
- TensorRT for optimized inference
- Hugging Face transformers for model loading
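Once vLLM is serving a model, it exposes an OpenAI-compatible HTTP API (by default on port 8000), so clients only need to send JSON. A minimal standard-library sketch; the model name and URL here are illustrative assumptions, not a fixed configuration:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat-completion payload for a vLLM server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(payload: dict,
         url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    """POST the payload to a running vLLM server and return the JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("meta-llama/Llama-2-7b-chat-hf", "Hello!")
print(payload)
# send(payload)  # requires a running vLLM server
```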
Performance Benchmarks
L40S performance varies by workload:
Stable Diffusion Inference
- Model: SDXL 1.0
- Batch size: 4
- Throughput: 8-12 images/minute
- Latency: 5-8 seconds per image
Llama 2 7B Inference
- Batch size: 32
- Throughput: 800-1,200 tokens/second
- Latency (p50): 15-20ms
- Memory utilization: 85%
Vision Transformer Classification
- ImageNet 1K classification
- Batch size: 256
- Throughput: 3,200 images/second
- Latency: 80ms
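The benchmark figures above translate directly into serving cost. A quick sketch converting throughput and an hourly rate into cost per unit of work, using the mid-range numbers quoted above and the $0.92/hour A6000 rate from the pricing section:

```python
def cost_per_unit(rate_per_hour: float, units_per_second: float) -> float:
    """Dollar cost per inference unit, given hourly GPU price and throughput."""
    units_per_hour = units_per_second * 3600
    return rate_per_hour / units_per_hour

# SDXL at ~10 images/minute on a $0.92/hr GPU
print(f"${cost_per_unit(0.92, 10 / 60):.4f} per image")
# Llama 2 7B at ~1,000 tokens/second
print(f"${cost_per_unit(0.92, 1000) * 1e6:.2f} per million tokens")
```

At these rates, per-image and per-token costs come out to fractions of a cent, which is why throughput (not hourly price alone) dominates inference economics.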
Lambda Labs vs Other Providers
Comparison for inference workloads with similar memory:
| Provider | GPU | Price/hr | Memory | Availability |
|---|---|---|---|---|
| Lambda Labs | A6000 | $0.92 | 48GB | Available |
| Lambda Labs | A100 | $1.48 | 80GB | Available |
| RunPod | L40 | $0.69 | 48GB | High |
| CoreWeave | L40S bundle | $18 (8-pack) | 384GB | Available |
Lambda Labs A6000 offers the closest direct replacement. RunPod L40 provides lower cost for single GPU rental. For computer vision applications, L40S balances cost and performance effectively.
FAQ
Why did Lambda Labs discontinue L40S availability? As demand for large language models surged, Lambda Labs prioritized H100 and GH200 capacity. L40S remains a solid inference GPU, but market demand shifted to larger models requiring more memory.
Can I use L40S for model training? Yes, but suboptimally. L40S works for training models under 13B parameters. However, H100 or A100 GPUs provide better training performance. L40S is primarily designed for inference.
What's the total cost to run a Stable Diffusion service on L40S-class hardware for one month? At $0.92/hour for a comparable A6000 (or $0.69/hour for a RunPod L40), running 24/7 for a month (730 hours) costs roughly $504-$672.
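The arithmetic behind that estimate, as a reusable sketch:

```python
def monthly_cost(rate_per_hour: float, hours: float = 730) -> float:
    """Cost of running an instance 24/7 for one month (~730 hours)."""
    return rate_per_hour * hours

print(f"RunPod L40:   ${monthly_cost(0.69):.2f}/month")
print(f"Lambda A6000: ${monthly_cost(0.92):.2f}/month")
```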
Is L40S compatible with CUDA and PyTorch? Yes. Full CUDA 12.x compatibility and PyTorch 2.x support exist. Code portability is excellent across inference platforms.
Which provider offers the best L40S alternative right now? RunPod's L40 ($0.69/hour) matches L40S memory at lower cost. For pure L40S, contact Lambda Labs sales or check smaller providers like Paperspace.
Related Resources
- L40S GPU Specifications
- Lambda Labs GPU Pricing Guide
- Best GPU Cloud for Computer Vision
- RunPod GPU Pricing Comparison
- Inference Optimization Guide