H100 Vast.AI: Marketplace Pricing, Peer-to-Peer GPU Rental, and Bidding Strategy

Deploybase · February 12, 2025 · GPU Pricing

H100 on Vast.AI

Vast.AI: $2.50-4.00/hr for peer-to-peer H100 rentals. Cheapest sometimes, but pick providers carefully (uptime matters). Average: $3.15/hr.

This covers marketplace dynamics, pricing, bidding, and provider selection.

Vast.AI Marketplace Overview

As of early 2025, Vast.AI differs fundamentally from traditional cloud providers. Instead of fixed pricing, GPU owners list hardware at self-determined rates. Renters accept those rates or bid below them, creating a market-driven pricing system.

H100 Pricing Patterns and Monthly Projections

Current H100 availability on Vast.AI shows:

| Tier | Price Range | Monthly (730 hrs) | Availability | Reliability |
| --- | --- | --- | --- | --- |
| Budget | $2.50-3.00/hr | $1,825-2,190 | 20-40 listings | Variable provider quality |
| Mid-Market | $3.00-3.50/hr | $2,190-2,555 | 60-100 listings | Mixed reliability |
| Premium | $3.50-4.00/hr | $2,555-2,920 | 40-60 listings | Higher uptime SLAs |
| Production | $4.00+/hr | $2,920+ | 5-15 listings | Dedicated support |

Pricing updates hourly based on actual renter activity. Check Vast.AI's price history dashboard for long-term trends before committing to multi-day workloads. Average market rate across all tiers: $3.15/hr, yielding monthly cost of ~$2,300.
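The monthly figures in the table follow directly from the hourly rates; a quick sketch (tier midpoints are illustrative):

```python
HOURS_PER_MONTH = 730

def monthly_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Project a monthly bill from an hourly marketplace rate."""
    return round(hourly_rate * hours, 2)

# Midpoints of the tiers listed above
for tier, rate in [("budget", 2.75), ("mid-market", 3.25), ("premium", 3.75)]:
    print(f"{tier}: ${monthly_cost(rate):,.2f}/month")

print(monthly_cost(3.15))  # market average -> 2299.5
```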

Performance Benchmarks on Vast.AI

H100 performance varies by provider hardware pairing and network quality:

| Metric | Budget Tier | Mid-Market Tier | Premium Tier |
| --- | --- | --- | --- |
| Inference Throughput (70B model) | 35-40 tokens/sec | 40-45 tokens/sec | 45-50 tokens/sec |
| Network Latency | 100-500ms | 50-150ms | <50ms |
| Training Throughput (fine-tuning) | 200-300 tokens/sec | 300-400 tokens/sec | 400-500 tokens/sec |
| Uptime (measured) | 85-92% | 92-97% | 97-99% |

Provider network quality and CPU pairing significantly affect delivered GPU throughput, with up to 40% variance between budget and premium tiers.

Provider Selection and Risk Assessment

Evaluating Provider Reliability

Vast.AI displays provider metrics critical to risk assessment:

  • GPU Uptime Score: Historical availability percentage (aim for >98%)
  • Internet Speed: Measured upload/download bandwidth to instance (aim for >100 Mbps)
  • Renter Reviews: Qualitative feedback on stability and responsiveness (4.5+ stars)
  • Hardware Specifications: GPU type, CPU pairing, storage options (verify the exact H100 variant listed, e.g. SXM vs. PCIe)
  • Rental Hours: Provider's experience with long-term renters (>100 hours minimum)

Filter for providers with 100+ rental hours and >97% uptime score. New providers with <20 hours history should be avoided unless pricing discount exceeds 30%. Read recent reviews (last 10) for patterns: single complaints are normal; repeated issues indicate systemic problems.
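The screening rules above can be expressed as a filter over listing metadata. A minimal sketch; the field names are illustrative, not Vast.AI's actual API schema:

```python
def is_acceptable(provider: dict) -> bool:
    """Apply the screening thresholds above: 100+ rental hours and >97% uptime.
    New providers (<20 hours) pass only with a >30% discount vs. the market rate."""
    market_rate = 3.25  # assumed mid-market H100 rate, $/hr
    if provider["rental_hours"] >= 100 and provider["uptime"] > 0.97:
        return True
    if provider["rental_hours"] < 20:
        return provider["price"] < market_rate * 0.70
    return False

listings = [
    {"id": "a", "rental_hours": 450, "uptime": 0.985, "price": 3.40},
    {"id": "b", "rental_hours": 12,  "uptime": 0.99,  "price": 2.10},
    {"id": "c", "rental_hours": 60,  "uptime": 0.95,  "price": 2.90},
]
print([p["id"] for p in listings if is_acceptable(p)])  # -> ['a', 'b']
```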

Geolocation, Latency, and Network Quality

Vast.AI shows provider location (typically US, Europe, or Asia-Pacific). Selection criteria vary by workload:

  • Interactive inference: Prefer same-region provider with <50ms latency to avoid round-trip delays
  • Batch training: Geolocation irrelevant; prioritize pricing and uptime score
  • Data transfer: Providers with 500+ Mbps upload enable fast dataset transfers; verify in provider specs

Test network quality before committing to multi-day jobs: run iperf3 -c provider-ip from your local machine to measure the bandwidth actually available (this requires an iperf3 server running on the instance).

Bidding Strategy and Cost Optimization

Fixed-Price Versus Bid-Based Rental

Vast.AI offers two rental modes:

  1. Interruptible: Cheaper, but can be terminated with 4-hour notice when the provider reclaims the GPU
  2. On-Demand Fixed: Higher cost, but guaranteed until you terminate the instance

For training jobs with checkpointing, interruptible instances at 30-40% discount versus on-demand save substantial costs. For production inference, fixed-price instances ensure reliability.
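The trade-off can be quantified. A rough sketch, assuming a 35% interruptible discount and ~5% of wall-clock time lost to re-queuing and replaying work since the last checkpoint (both figures are assumptions, not Vast.AI guarantees):

```python
def effective_rate(base_rate: float, discount: float, interruption_overhead: float) -> float:
    """Interruptible price after discount, inflated by the fraction of
    wall-clock time lost to interruptions and checkpoint replay."""
    return base_rate * (1 - discount) / (1 - interruption_overhead)

on_demand = 3.25  # $/hr, assumed mid-market fixed price
interruptible = effective_rate(on_demand, discount=0.35, interruption_overhead=0.05)
print(round(interruptible, 2))  # -> 2.22, still well below the fixed price
```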

Dynamic Bidding Approach

Instead of accepting listed prices, bid below the asking rate. Bids are matched when providers have excess capacity:

  1. Note median H100 asking price (typically $3.20-3.50/hr)
  2. Set bid limit at 80-85% of median
  3. Wait during off-peak hours (2-6 AM UTC typically show best fill rates)
  4. Monitor bid acceptance rate; if consistently rejected, increase to 90% of median

This approach can reduce effective H100 cost to $2.70-3.00/hr for flexible-timing workloads.
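The steps above can be sketched as a simple bid schedule: start at 80% of the median ask and step toward 90% as bids are rejected. Illustrative only; actual acceptance depends on provider capacity at the moment you bid:

```python
def next_bid(median_ask: float, rejections: int) -> float:
    """Start at 80% of the median ask and raise the bid by 2.5 points
    per rejection, capping at 90% per the strategy above."""
    floor, ceiling = 0.80, 0.90
    fraction = min(floor + 0.025 * rejections, ceiling)
    return round(median_ask * fraction, 2)

print([next_bid(3.35, r) for r in range(5)])
```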

Detailed Setup Walkthrough

Instance Launch Procedure

  1. Browse available H100 listings filtered by price, location, uptime score
  2. Select "Rent" or place custom bid:
    • For on-demand fixed: Click "Rent Now" at listed price
    • For interruptible: Can bid lower; fill rate depends on provider excess capacity
  3. Choose storage configuration (provider may offer 50-500GB templates)
  4. Generate SSH keys or use existing public key
    • First time: Download private key securely; do not share
    • Subsequent: Use existing key for same provider
  5. Select 24-hour or 168-hour billing cycle
  6. Instance launches within 2-10 minutes (check status in dashboard)
  7. Connect via SSH using the provided IP address and port once the instance shows Running

Typical Instance Launch Timeline

| Step | Duration | Status |
| --- | --- | --- |
| Payment processing | 30 seconds | Queued |
| Provider instance allocation | 1-3 minutes | Launching |
| Container/OS initialization | 2-5 minutes | Starting |
| SSH accessibility | 5-10 minutes total | Running |

Total typical time: 8 minutes from launch click to SSH access.

SSH Configuration and Remote Access

Vast.AI assigns public IPs with random high-numbered SSH ports. Store connection details in ~/.ssh/config for quick access:

Host vastai-h100
  HostName provider-ip.address
  Port 12345
  User root
  IdentityFile ~/.ssh/vastai_key
  ServerAliveInterval 60
  ServerAliveCountMax 3

Connect with the alias:

ssh vastai-h100

To reach a Jupyter or monitoring UI running on the instance, tunnel port 8888:

ssh -L 8888:localhost:8888 vastai-h100

Ports are randomized (typically >10000) to avoid conflicts on multi-tenant hosts. Update ~/.ssh/config with the new IP and port after each instance launch, since both change every time.
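Rewriting the config by hand gets tedious across many launches. A minimal sketch of automating it (update_ssh_host is an illustrative helper, not a Vast.AI tool; it assumes the Host block already exists):

```python
def update_ssh_host(config_text: str, host: str, hostname: str, port: int) -> str:
    """Rewrite the HostName/Port lines of one Host block in an ssh config."""
    lines, in_block = [], False
    for line in config_text.splitlines():
        if line.startswith("Host "):
            in_block = line.split()[1] == host
        if in_block and line.strip().startswith("HostName"):
            line = f"  HostName {hostname}"
        elif in_block and line.strip().startswith("Port"):
            line = f"  Port {port}"
        lines.append(line)
    return "\n".join(lines)

cfg = "Host vastai-h100\n  HostName old.ip\n  Port 11111\n  User root"
print(update_ssh_host(cfg, "vastai-h100", "198.51.100.7", 40022))
```

In practice you would read ~/.ssh/config, apply the rewrite, and write it back after each launch.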

Data Transfer Optimization

Upload training data before long-running jobs using rsync for resume capability:

rsync -av --progress --partial \
  /local/training_data/ \
  root@provider-ip:~/data/training_data/


Vast.AI's network performance varies by provider: check bandwidth metrics in the provider listing before transferring multi-GB datasets. Budget Tier providers typically offer 50-100 Mbps; Premium Tier providers offer 500+ Mbps. This variance can cause a 5-10x difference in data loading time.
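A back-of-the-envelope estimate of how those bandwidth tiers translate into transfer time (ignoring rsync and TCP overhead, so real transfers run somewhat slower):

```python
def transfer_hours(dataset_gb: float, mbps: float) -> float:
    """Rough wall-clock estimate for a dataset upload at a given line rate."""
    megabits = dataset_gb * 8 * 1000  # GB -> megabits
    return round(megabits / mbps / 3600, 2)

print(transfer_hours(100, 50))   # budget tier -> 4.44 hours
print(transfer_hours(100, 500))  # premium tier -> 0.44 hours
```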

Workload Optimization for Marketplace Constraints

Checkpoint-Based Training

Always enable checkpointing for Vast.AI jobs to tolerate instance interruptions. Save model weights every 500 training steps:

# Inside the training loop: persist state so an interrupted job can resume
if step % 500 == 0:
    torch.save({
        'epoch': epoch,
        'step': step,
        'model_state': model.state_dict(),
        'optimizer_state': optimizer.state_dict(),
    }, f'checkpoint_step_{step}.pt')

Batch Inference with Timeout Handling

For inference workloads, implement timeout mechanisms and provider failover:

import signal
import sys

def timeout_handler(signum, frame):
    # Flush partial results and exit cleanly before the instance disappears
    print("Instance termination imminent, saving results...")
    save_results()  # user-defined: persist completed batches to durable storage
    sys.exit(0)

signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(3600)  # raise SIGALRM after 1 hour

Cost Optimization on Vast.AI

Bidding Strategy for Lower Effective Costs

Rather than accepting listed prices, submit strategic bids 20-30% below asking rates:

# Consult Vast.AI's price history dashboard for off-peak windows and
# price trends before choosing a bid level.
typical_h100_asking = 3.25  # mid-market average, $/hr
bid_price = typical_h100_asking * 0.75  # bid 25% below asking -> $2.44/hr

Effective cost with strategic bidding: $2.45/hr average (down from $3.25 asking), reducing monthly cost to ~$1,790.

Provider Portfolio Diversification

Maintain relationships with 3-4 premium providers to minimize interruption risk:

providers = [
    {"name": "provider-a", "price": 3.40, "uptime": 0.98},
    {"name": "provider-b", "price": 3.55, "uptime": 0.99},
    {"name": "provider-c", "price": 3.35, "uptime": 0.97},
]

selected = min(providers, key=lambda p: p["price"] / p["uptime"])  # lowest cost per delivered hour

Spreading workload across providers ensures fallback capacity if one experiences unavailability.

Comparing Vast.AI to Dedicated Providers

Vast.AI's average H100 cost ($3.00-3.50/hr) sits above RunPod's H100 SXM at $2.69/hr on-demand and below Lambda's H100 SXM at $3.78/hr, and undercuts both with strategic bidding ($2.44-2.70/hr effective). Key differences:

| Metric | Vast.AI | RunPod | Lambda | CoreWeave |
| --- | --- | --- | --- | --- |
| Typical H100 Cost | $3.25 | $2.69 | $3.78 (SXM) | $6.16 |
| Reliability | Variable (85-99%) | Guaranteed | Guaranteed | Guaranteed |
| Support | Peer-to-peer | Technical | Technical | Technical |
| Setup Time | 5-15 min | 2-5 min | 3-8 min | 10-15 min |

  • Reliability: Dedicated providers guarantee availability; Vast.AI requires provider risk management (cost: 2-5% lost time to interruptions)
  • Support: RunPod and Lambda offer technical support; Vast.AI provides peer-to-peer only
  • Tooling: Dedicated providers include optimized templates and integrations; Vast.AI offers bare Linux

Vast.AI excels for cost-sensitive workloads that tolerate provider variability. For sustained production inference requiring 99%+ uptime, see CoreWeave's H100 cluster pricing for managed alternatives.

Cost Per Million Tokens (Inference Benchmarking)

For inference workloads, effective cost depends on model throughput:

  • 70B-parameter model on H100: ~50 tokens/second (batch size 8)
  • Assumed utilization: 75% (accounting for idle time)
  • Effective throughput: 37.5 tokens/second
  • Cost at $3.00/hr: $0.022 per 1K tokens

This compares favorably to managed inference services charging $0.05-0.15 per 1K tokens.
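The arithmetic above, as a reusable helper:

```python
def cost_per_1k_tokens(hourly_rate: float, tokens_per_sec: float, utilization: float) -> float:
    """Effective $/1K tokens at a given raw throughput and utilization."""
    tokens_per_hour = tokens_per_sec * utilization * 3600
    return round(hourly_rate / tokens_per_hour * 1000, 3)

# 70B model, 50 tok/s at batch 8, 75% utilization, $3.00/hr rental
print(cost_per_1k_tokens(3.00, 50, 0.75))  # -> 0.022
```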

FAQ

How do I avoid unreliable Vast.AI providers?

Filter for >98% uptime, >100 hours historical rentals, and >4.5 star reviews. Start with small test jobs before committing multi-day workloads. Always enable checkpointing so interruptions don't cause complete job loss.

What's the difference between interruptible and on-demand on Vast.AI?

Interruptible instances (cheaper by 30-40%) can be terminated with 4-hour notice when provider reclaims GPU. On-demand fixed-price instances guarantee access until you release them. Use interruptible for resumable batch work, fixed for production serving.

Can I use Vast.AI H100 for production inference APIs?

Yes, but with caveats. Select premium providers (>99% uptime, dedicated support), use on-demand fixed pricing, and implement provider failover to secondary instance. Production-critical inference should prioritize dedicated providers like Lambda Labs.

How much can I save by bidding strategically on Vast.AI vs accepting listed prices?

Strategic bidding during off-peak hours (2-6 AM UTC) at 75% of asking price yields 60-70% acceptance rate. Savings: asking price typically $3.25/hr, bid $2.44/hr, effective savings 25% ($0.81/hr). For month-long workload: $0.81 × 730 hours = $591/month savings. More aggressive bidding (65% of asking) increases savings to 35% but reduces acceptance probability.

What's the minimum provider uptime score acceptable for production training on Vast.AI?

Minimum 96% uptime acceptable for resumable training with hourly checkpointing. At 96% uptime, expect ~29 hours downtime monthly. With checkpoint frequency, maximum loss is 1 hour of training. Below 94% uptime, expect >43 hours downtime monthly (unacceptable). Premium providers (>98% uptime) are recommended for production workloads despite 10-15% price premium.
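The downtime figures follow directly from the uptime score:

```python
def monthly_downtime_hours(uptime: float, hours: float = 730) -> float:
    """Expected monthly downtime implied by a provider's uptime score."""
    return round((1 - uptime) * hours, 1)

print(monthly_downtime_hours(0.96))  # -> 29.2 hours
print(monthly_downtime_hours(0.94))  # -> 43.8 hours
```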
