NVIDIA DGX B200 Cloud Pricing: Where to Rent & How Much It Costs

Deploybase · February 20, 2026 · GPU Pricing

NVIDIA DGX B200 Price: Overview

NVIDIA DGX B200 cloud pricing varies significantly across providers as of February 2026. This 8-GPU system is one of the most expensive AI computing configurations available, making price comparison critical for machine learning infrastructure decisions.

The DGX B200 combines eight NVIDIA B200 Tensor Core GPUs with 1,536 GB of total HBM3e memory (8 × 192 GB) and 14.4 TB/s of aggregate NVLink 5 inter-GPU bandwidth. Teams considering this system must evaluate whether rental or purchase makes financial sense for their workload patterns.

DGX B200 Hardware Specifications

The DGX B200 provides 72 petaFLOPS of FP8 training performance across its eight GPUs (144 petaFLOPS at FP4 for inference). Each B200 GPU features 192 GB of HBM3e memory, enabling training and inference of extremely large language models without requiring complicated memory optimization techniques.

The system delivers approximately 8.0 TB/s memory bandwidth per GPU (64 TB/s aggregate across all 8 GPUs), making it ideal for batch processing, long-context language model serving, and large-scale research workloads. The Blackwell architecture improves upon previous generations through enhanced sparsity support and increased compute density.

The DGX B200 includes 16,896 NVIDIA CUDA cores per GPU for general compute, plus dedicated tensor cores optimized for matrix multiplication. The NVLink 5 interconnect provides 1,800 GB/s bidirectional bandwidth per GPU to neighboring GPUs in the node.

On-Premises Purchase Pricing

NVIDIA DGX B200 on-premises systems cost between $275,000 and $300,000 USD as of February 2026. This price covers the complete system, including eight B200 GPUs, cooling infrastructure, power distribution, NVIDIA BlueField-3 DPU networking, and factory integration.

The price covers the 10U rackmount chassis, redundant hot-swap power supplies, and all necessary cabling. Buyers also receive one year of NVIDIA hardware support. The system requires three-phase 208V or 480V power input and significant cooling capacity.

Additional costs beyond the purchase price include:

  • Installation and site preparation: $5,000-15,000
  • Network infrastructure upgrades: $2,000-10,000
  • Cooling system modifications: $3,000-20,000
  • Facility modifications: $1,000-5,000
  • Support contracts beyond year one: $20,000-30,000 annually

The total cost of ownership for a three-year deployment period reaches $330,000-365,000 when including these auxiliary expenses.
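A quick sum over the ranges above bounds the three-year outlay. This is a sketch assuming one included support year plus two additional paid years; the quoted $330,000-365,000 figure sits toward the low end of these bounds.

```python
# Rough 3-year TCO bounds for an on-prem DGX B200, using the cost
# ranges quoted above. Support is billed annually beyond year one, so
# a 3-year deployment adds two extra support years. Illustrative only.
PURCHASE = (275_000, 300_000)
ONE_TIME = [
    (5_000, 15_000),   # installation and site preparation
    (2_000, 10_000),   # network infrastructure upgrades
    (3_000, 20_000),   # cooling system modifications
    (1_000, 5_000),    # facility modifications
]
SUPPORT_ANNUAL = (20_000, 30_000)
EXTRA_SUPPORT_YEARS = 2  # years 2 and 3

low = (PURCHASE[0] + sum(lo for lo, _ in ONE_TIME)
       + SUPPORT_ANNUAL[0] * EXTRA_SUPPORT_YEARS)
high = (PURCHASE[1] + sum(hi for _, hi in ONE_TIME)
        + SUPPORT_ANNUAL[1] * EXTRA_SUPPORT_YEARS)
print(f"3-year TCO bounds: ${low:,} - ${high:,}")
```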

Cloud Provider Pricing Analysis

CoreWeave

CoreWeave offers DGX B200 configurations at $68.80 per hour as of February 2026. This is the highest cloud pricing among the providers surveyed but includes managed infrastructure, automatic scaling, and network optimization. CoreWeave provides dedicated instances with no resource sharing.

A full month of continuous DGX B200 usage on CoreWeave costs approximately $50,224 (730 hours at $68.80/hour). For three-year continuous workloads, the total rental expense would be roughly $1.81 million before volume discounts.

CoreWeave includes bandwidth optimization and direct connections to major cloud regions. The pricing includes all power, cooling, and facility costs with no separate charges for these infrastructure elements.

RunPod

RunPod pricing for 8x B200 systems is $5.98 per B200 per hour, totaling $47.84 per hour for a full DGX configuration. This represents the lowest cloud pricing among major providers. Monthly continuous usage costs approximately $34,923 (730 hours at $47.84/hour).

RunPod offers spot pricing at approximately a 40 percent discount for fault-tolerant workloads. Spot instances cost approximately $28.70 per hour, reducing three-year continuous rental costs to roughly $754,000.

The RunPod platform includes integrated Jupyter notebooks, direct file storage access, and API management tools. However, pricing may include charges for bandwidth exceeding the free tier allocation.

Lambda Labs

Lambda Labs charges $6.08 per B200 GPU per hour, resulting in $48.64 per hour for an 8-GPU DGX B200 system. Monthly continuous cost totals approximately $35,507 (730 hours at $48.64/hour).

Lambda Labs includes a free outbound data-transfer allowance; usage exceeding that allocation costs $0.10 per GB. The platform provides multi-region availability and integrated monitoring dashboards.

Lambda requires prepayment for reserved capacity to lock in pricing. Annual prepayment provides approximately 25 percent discount, reducing hourly rates to approximately $36.48 for three-year commitments.

ROI Calculation Framework

Payback Period Methodology

Calculate the break-even point by dividing total acquisition cost by monthly rental spend at continuous utilization:

Payback Period (months) = (Purchase Cost + Infrastructure Costs) / (Hourly Rental Rate × 730 hours monthly)

CoreWeave example: ($290,000 + $8,000) / ($68.80 × 730) = 5.9 months

This is the duration of continuous rental at which cumulative rental spend reaches the total acquisition cost.
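The payback formula above can be wrapped in a small helper; a sketch using the CoreWeave figures from the example:

```python
def payback_months(purchase_cost: float, infra_cost: float,
                   hourly_rate: float, hours_per_month: float = 730) -> float:
    """Months of continuous rental until cumulative rental spend
    matches the up-front purchase plus infrastructure cost."""
    return (purchase_cost + infra_cost) / (hourly_rate * hours_per_month)

# CoreWeave example from the text: $290k system + $8k infrastructure
# against $68.80/hour rented continuously.
months = payback_months(290_000, 8_000, 68.80)
print(f"{months:.1f} months")  # ~5.9 months of continuous operation
```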

Multi-Year ROI Analysis

Five-year ROI comparison between renting and purchasing:

Continuous usage scenario (730 hours monthly):

  • Rental cost (CoreWeave): $68.80/hour × 730 × 60 = $3.01 million
  • Purchase cost: $290,000 + $8,000 + (support $20,000 × 5) = $398,000
  • Five-year savings from purchase: $2.62 million
  • ROI: approximately 660 percent

Intermittent usage scenario (100 hours monthly):

  • Rental cost: $68.80/hour × 100 × 60 = $412,800
  • Purchase cost: $398,000
  • Five-year difference: roughly $15,000 in favor of purchase
  • At this utilization the decision is essentially break-even; any lower usage favors rental

This analysis puts the five-year break-even point at roughly 100 hours of monthly utilization.
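The same arithmetic yields the monthly utilization at which rental spend over a given horizon matches full ownership cost. This sketch assumes the $398,000 basis built from the components above ($290,000 system, $8,000 infrastructure, five years of support at $20,000):

```python
def breakeven_hours_per_month(total_ownership_cost: float,
                              hourly_rate: float,
                              horizon_months: int) -> float:
    """Monthly utilization at which cumulative rental over the horizon
    equals the total cost of ownership."""
    return total_ownership_cost / (hourly_rate * horizon_months)

# Five-year horizon at the CoreWeave rate.
hours = breakeven_hours_per_month(398_000, 68.80, 60)
print(f"{hours:.0f} hours/month")
```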

Capacity Planning for Growth

Teams expecting utilization to grow should model multi-year scaling:

Year 1: 100 hours monthly rental = $82,560 annually
Year 2: 200 hours monthly rental = $165,120 annually
Year 3: 400 hours monthly rental = $330,240 annually
Years 4-5: 730 hours monthly rental = $602,688 annually

Five-year cumulative rental: approximately $1.78 million

Purchase at start of year 3 (when 400 hours monthly is projected):

  • Year 3-5 ownership cost: $290,000 + $8,000 + $40,000 support = $338,000
  • Years 1-2 rental plus ownership: approximately $586,000 total
  • Savings versus five years of rental: approximately $1.2 million

This projection demonstrates the value of timing the ownership decision against projected growth.
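The growth scenario above can be recomputed directly at the CoreWeave rate. This sketch compares all-rental against renting for two years and purchasing at the start of year 3, with support priced at the low end of the quoted range:

```python
RATE = 68.80  # CoreWeave hourly rate for the 8-GPU system
monthly_hours_by_year = [100, 200, 400, 730, 730]

yearly_rental = [RATE * h * 12 for h in monthly_hours_by_year]
cumulative_rental = sum(yearly_rental)

# Hybrid path: rent years 1-2, buy at the start of year 3.
# Ownership basis: $290k purchase + $8k infrastructure + two extra
# support years at $20k each (illustrative, matching the quoted ranges).
hybrid = sum(yearly_rental[:2]) + 290_000 + 8_000 + 2 * 20_000
print(f"all-rental: ${cumulative_rental:,.0f}, hybrid: ${hybrid:,.0f}")
```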

Financing Options Analysis

Equipment Financing

Many financial institutions offer 24-60 month equipment financing for capital purchases:

$290,000 DGX B200 with 36-month financing at 6 percent APR:

  • Monthly payment: approximately $8,822
  • Total paid: approximately $317,600
  • Interest cost: approximately $27,600

Financing preserves capital while building asset equity. Teams with strong cash flow often prefer financing over outright purchase.
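The monthly payment follows from the standard loan amortization formula; a sketch, not a lender quote:

```python
def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    """Standard amortized loan payment: P * r * (1+r)^n / ((1+r)^n - 1),
    where r is the monthly rate and n the number of payments."""
    r = annual_rate / 12
    growth = (1 + r) ** months
    return principal * r * growth / (growth - 1)

# $290,000 over 36 months at 6 percent APR.
pay = monthly_payment(290_000, 0.06, 36)
total = pay * 36
print(f"payment ${pay:,.0f}/month, total ${total:,.0f}, "
      f"interest ${total - 290_000:,.0f}")
```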

Lease-to-Own Arrangements

DGX B200 lease-to-own through certain providers:

36-month lease: $15,000/month = $540,000 total
Equipment ownership transfers after 36 months

Effective cost: $540,000 (higher than financing but includes service and support).

Teams valuing service support and avoiding maintenance responsibility prefer lease-to-own despite the higher cost.

GPU Cloud Commitments with Savings

CoreWeave 12-month commitment: $45,840/year ($3,820/month)
CoreWeave annual prepayment: 25 percent discount available

Annual prepayment: $34,380/year

This hybrid approach combines predictable costs with rental flexibility, suitable for teams expecting consistent but not continuous usage.

Competitor Analysis: AMD Instinct MI300X Cluster

Hardware Comparison

The AMD Instinct MI300X provides an alternative to the NVIDIA B200.

MI300X specifications:

  • 192 GB HBM3 per GPU
  • Approximately 2.6 petaFLOPS peak FP8 performance per GPU (with sparsity)
  • 5.3 TB/s memory bandwidth
  • 304 compute units per GPU

8x MI300X cluster system costs approximately $240,000-260,000, competitive with B200 pricing.

Performance Comparison

MI300X throughput comes within 5-15 percent of the B200, depending on workload optimization. Dense models show smaller gaps; sparse/MoE models show larger gaps favoring the B200.

For Llama 3 70B inference, the MI300X achieves approximately 900 tokens/second versus the B200's 950 tokens/second (a 5 percent difference).

Cost Comparison

8x MI300X on CoreWeave costs approximately $54.24/hour, 21 percent cheaper than B200 ($68.80/hour).

Annual continuous cost: approximately $475,100 (MI300X) versus $602,700 (B200)

For teams optimizing purely on cost, the MI300X provides a compelling alternative despite slightly lower performance.

Ecosystem Considerations

The B200 benefits from the mature CUDA ecosystem and widespread framework integration. The MI300X requires ROCm development, which is less widely supported than CUDA.

MI300X adoption is accelerating but remains substantially behind the B200 in community support and optimization tooling.

Rent vs Buy Economics

The break-even point between renting and purchasing depends on monthly usage hours and the selected cloud provider.

CoreWeave Breakeven Analysis

Purchase cost of $290,000 divided by $68.80/hour = 4,215 hours of cloud usage. This equals approximately 5.8 months of continuous operation. Teams requiring more than 5-6 months of annual DGX B200 capacity should purchase rather than rent from CoreWeave.

For teams requiring the system for two to three years, purchase becomes economically dominant. After 18 months of continuous operation, cumulative rental costs (approximately $904,000) are roughly triple the purchase price plus infrastructure costs.

RunPod Breakeven Analysis

Purchase cost of $290,000 divided by $47.84/hour = 6,062 hours of cloud usage. This equals approximately 8.3 months of continuous operation. Teams requiring consistent monthly capacity throughout the year should evaluate purchase.

For batch processing workloads with intermittent usage, RunPod rental remains economical. An organization using the DGX B200 for 100 hours monthly pays only $4,784, or $57,408 annually. Over three years, this totals $172,224, well below purchase and maintenance costs.

Lambda Labs Breakeven Analysis

Purchase cost of $290,000 divided by $48.64/hour = 5,962 hours of cloud usage. This equals approximately 8.2 months of continuous operation. With reserved capacity discounts, Lambda becomes competitive with RunPod for longer-term commitments.

Reserved capacity with annual prepayment brings the system rate to approximately $36.48 per hour (25 percent off $48.64). For 2,000 hours annually, the cost is roughly $73,000 per year, or $219,000 over three years. This approaches purchase parity for consistent year-round usage.
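The three break-even calculations above can be collapsed into one loop over the on-demand provider rates:

```python
# On-demand 8-GPU system rates quoted earlier in the article.
PROVIDERS = {
    "CoreWeave": 68.80,
    "RunPod": 47.84,
    "Lambda Labs": 48.64,
}
PURCHASE = 290_000  # base system price, excluding infrastructure extras

for name, rate in PROVIDERS.items():
    hours = PURCHASE / rate
    # 730 hours approximates one month of continuous operation.
    print(f"{name}: {hours:,.0f} rental hours "
          f"(~{hours / 730:.1f} months continuous)")
```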

Performance Benchmarks and Cost Analysis

The DGX B200 achieves approximately 120 tokens per second when serving a 70B parameter model with batch size 1. For batch size 8, throughput reaches approximately 950 tokens per second.

Running a 70B model continuously on RunPod B200 infrastructure costs approximately $0.000014 per token (about $14 per million tokens) at $47.84/hour and 950 tokens/second. Per-token cost falls in direct proportion to any additional throughput gained from larger batch sizes and continuous batching.

By comparison, DeepSeek V3 via API costs approximately $0.27 per million input tokens and $1.10 per million output tokens as of February 2026. At these prices, rented DGX B200 inference becomes cost-advantageous only at aggregate throughput well above the batch-8 figure: matching $1.10 per million tokens at $47.84/hour requires sustaining roughly 12,000 tokens per second.

Large-batch inference workloads therefore benefit most from DGX B200 rental. Processing 100 million tokens daily costs $27-110 through API services depending on input/output ratios; beating that on rented hardware depends on high sustained throughput, since at 950 tokens/second the same volume would cost roughly $1,400 and take more than a day.
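Per-token economics follow directly from the hourly rate and sustained throughput. This sketch also derives the throughput needed to match a $1.10-per-million-token API price:

```python
def cost_per_million_tokens(hourly_rate: float,
                            tokens_per_second: float) -> float:
    """Rental cost per million tokens at a given sustained throughput."""
    tokens_per_hour = tokens_per_second * 3_600
    return hourly_rate / tokens_per_hour * 1_000_000

# Batch-8 benchmark figure from the section above.
print(f"${cost_per_million_tokens(47.84, 950):.2f} per million tokens")

# Sustained throughput at which $47.84/hour matches a $1.10/M API price.
needed = 47.84 / 1.10 * 1_000_000 / 3_600
print(f"break-even throughput vs $1.10/M API: {needed:,.0f} tokens/second")
```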

Deployment Recommendations

Teams with the following characteristics should consider DGX B200 rental:

  • Batch processing jobs exceeding 100 million tokens weekly
  • Custom model training requiring weeks of compute time
  • Latency-sensitive applications requiring sub-100ms response times
  • Security requirements prohibiting external API calls
  • Need for fine-tuning on proprietary data

Teams with these characteristics should purchase a DGX B200:

  • Continuous inference serving exceeding roughly 2,000 GPU-hours monthly
  • Research and development teams requiring permanent infrastructure
  • Dedicated training pipelines operating 12+ months annually
  • Custom CUDA development requiring direct hardware control
  • Long-term cost optimization for growing usage patterns

Teams with intermittent or experimental workloads should consider smaller GPU configurations first. A single B200 GPU on RunPod costs $5.98 per hour, enabling testing at reduced commitment and cost.

Cost Optimization Strategies

Spot Instance Usage

RunPod spot instances provide 40 percent pricing reductions for fault-tolerant workloads. Batch processing pipelines, model training checkpoints, and distributed inference jobs benefit most from spot instances.

Reserved Capacity Commitments

Lambda Labs and CoreWeave offer reserved capacity discounts for annual or longer commitments. Negotiating these discounts directly with sales teams can reduce effective hourly rates by 20-35 percent.

Right-Sizing for The Workload

Smaller B200 configurations (single GPU or 2-GPU systems) may provide sufficient throughput for many workloads. A single B200 costs one-eighth the price of the full DGX B200 system while still offering 192 GB HBM3e memory and 8.0 TB/s bandwidth.

Multi-Provider Approach

Teams with varying workload characteristics can use multiple providers. Run experimental work on RunPod spot instances while reserving CoreWeave capacity for latency-sensitive production systems.

DGX B200 Alternative Configurations

Teams unable to justify DGX B200 costs should consider alternatives:

4x B200 Systems

A 4-GPU B200 system costs approximately $150,000-160,000 for purchase. Cloud pricing runs $24/hour on RunPod. This configuration serves inference workloads for 30-40B parameter models with reasonable performance.

H100 Systems

A single H100 GPU costs $2.69/hour on RunPod versus $5.98/hour for a B200. For inference on models below 70B parameters, H100 systems frequently deliver better cost per token despite lower raw performance.

A100 Systems

A100 GPUs cost $1.19/hour on RunPod and work well for models below 13B parameters. Fine-tuning and training smaller models remains economical on A100 infrastructure.

Selecting Between Providers

CoreWeave provides the most generous bandwidth allocations and dedicated networking options. Choose CoreWeave when network performance directly impacts the application throughput.

RunPod offers the lowest absolute pricing and spot instance discounts. Choose RunPod for batch workloads, experimentation, and budget-constrained scenarios.

Lambda Labs provides the best reserved pricing for predictable year-round usage. Choose Lambda for production systems requiring consistent capacity and long-term cost optimization.

FAQ

What is included in DGX B200 cloud pricing?

Cloud pricing includes all power, cooling, and facility costs. Individual providers may charge separately for outbound bandwidth exceeding free tiers. Storage and data transfer fees apply separately on all platforms.

Can I get discounts for longer commitment periods?

Yes. All major providers offer discounts for reserved capacity commitments ranging from 25-35 percent for annual prepayment. Contact sales teams for customized production pricing.

What is the typical utilization model for DGX B200 rental?

Most teams rent DGX B200 systems for 50-200 hours monthly. Continuous utilization is rare outside dedicated research institutions and large AI companies.

How does DGX B200 performance compare to multiple single B200 GPUs?

The DGX B200 provides an integrated high-speed NVLink interconnect enabling coordinated computation across all eight GPUs. Renting eight individual B200 GPUs typically lacks this full-bandwidth connectivity, making tensor-parallel and other tightly coupled distributed workloads far less efficient.

What are typical cooling and power requirements for on-premises DGX B200?

The system draws approximately 14.3 kW at maximum load, translating to roughly 49,000 BTU/hour of required cooling capacity. Facilities must provide three-phase 208V or 480V power input with 100-amp service minimum.

Is bandwidth a significant cost factor on cloud providers?

For inference-focused workloads, bandwidth costs are minimal. Training jobs transferring large datasets monthly may incur $500-2,000 in bandwidth charges depending on data volume and provider.

How does DGX B200 financing affect total cost of ownership?

36-month financing at 6 percent adds approximately $28,000 in interest to the $290,000 purchase price, reaching roughly $318,000 total. Combined with support ($100,000 over five years), five-year ownership cost reaches approximately $418,000. This compares favorably to continuous rental at roughly $3 million for identical utilization.

Should I consider older generation DGX A100 for cost savings?

The DGX A100 costs $80,000-100,000 new, roughly 65 percent cheaper than the B200. However, DGX A100 throughput reaches approximately 600 tokens/second for 70B models versus the B200's 950 tokens/second, and the older Ampere architecture lacks FP8 support and current-generation software optimization. Select the A100 only when its lower throughput is acceptable for the workload.

Sources

  • NVIDIA DGX B200 official specifications and pricing documentation
  • CoreWeave pricing as of February 2026
  • RunPod pricing as of February 2026
  • Lambda Labs pricing and documentation
  • DeepSeek V3 API pricing information
  • Industry analysis of GPU rental market dynamics