GPU Cloud Price Tracker
Track GPU prices weekly. Spot pricing moves 10-30% weekly. Missing that means missed savings opportunities.
Weekly is best: daily is noise, monthly misses trends.
This guide shows the tracking template and metrics that matter.
Price Tracking Template
Create a spreadsheet tracking these elements weekly:
Header Row:
- Date (week starting)
- Provider
- GPU Model
- Memory Size
- Spot Price ($/hour)
- On-Demand Price ($/hour)
- 4-Week Average
- 4-Week Trend (up/down/stable)
- Availability (% uptime estimate)
- Notes
Example Data (as of March 2026):
| Date | Provider | GPU | Memory | Spot | On-Demand | 4-Week Avg | Trend | Notes |
|---|---|---|---|---|---|---|---|---|
| 3/22/26 | RunPod | A100 SXM | 80GB | $0.59 | $1.39 | $1.10 | stable | Consistent pricing |
| 3/22/26 | Lambda | A100 PCIe | 40GB | $1.48 | $2.10 | $1.50 | up | Slight increase week-over-week |
| 3/22/26 | RunPod | H100 SXM | 80GB | $2.69 | $3.65 | $2.72 | stable | High demand, stable rate |
| 3/22/26 | CoreWeave | A100 SXM | 80GB | $1.33 | $1.75 | $1.35 | down | Competitive pricing emerging |
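
For teams that prefer a CSV file to a spreadsheet, the template can be sketched roughly as follows. The column names and the `write_rows` helper are assumptions chosen to mirror the header row above, not a required schema. The sample row reuses the RunPod H100 entry from the example table and leaves availability blank because the table does not list it.

```python
import csv
import io

# Assumed column names that mirror the header row of the tracking template.
FIELDS = [
    "date", "provider", "gpu_model", "memory_gb", "spot_usd_hr",
    "on_demand_usd_hr", "avg_4wk", "trend_4wk", "availability_pct", "notes",
]

def write_rows(stream, rows, write_header=False):
    """Write tracker rows as CSV; fields missing from a row are left blank."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, restval="")
    if write_header:
        writer.writeheader()
    writer.writerows(rows)

# One row taken from the example table (availability is not listed there).
row = {
    "date": "2026-03-22", "provider": "RunPod", "gpu_model": "H100 SXM",
    "memory_gb": 80, "spot_usd_hr": 2.69, "on_demand_usd_hr": 3.65,
    "avg_4wk": 2.72, "trend_4wk": "stable", "notes": "High demand, stable rate",
}

buffer = io.StringIO()
write_rows(buffer, [row], write_header=True)
print(buffer.getvalue())
```

To append to a real file instead of an in-memory buffer, pass a handle opened with `open(path, "a", newline="")` and write the header only when the file is first created.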
Core Metrics to Monitor
Absolute Price: The raw cost per hour. Track both spot and on-demand separately since they target different use cases.
Cost Per Million Tokens: For inference workloads, this derived metric matters more than raw GPU cost. Calculate it as (hourly price ÷ 3600) × 1,000,000 ÷ tokens generated per second on that GPU. This enables apples-to-apples comparison across GPU types and providers.
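Availability: Track promised uptime and actual availability. If effective price is taken as price ÷ availability, a GPU that is 40% cheaper but available only 60% of the time is no cheaper than a reliable option (0.60 ÷ 0.60 = 1.0), and it is worse once rescheduling overhead is counted.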
Price Trend Direction: Moving averages reveal whether prices are rising, falling, or stable. Use 4-week and 12-week moving averages. A GPU trending downward might become significantly cheaper in weeks if the trend continues.
Capacity Premium: Track whether certain providers charge premiums for guaranteed capacity during peak hours versus off-peak pricing. Some providers offer 15-30% discounts for off-peak usage.
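
The moving-average trend check can be sketched as below. This is a minimal illustration, not a prescribed method: the function names are hypothetical, the 2% `tolerance` band for calling a price "stable" is an assumption, and the sample prices are made up.

```python
from statistics import mean

def moving_average(prices, window):
    """Mean of the most recent `window` prices, or None with too little history."""
    if len(prices) < window:
        return None
    return mean(prices[-window:])

def trend_direction(prices, window=4, tolerance=0.02):
    """Label the trend by comparing the latest price with its moving average.

    `tolerance` is an assumed dead band: moves within +/-2% count as stable.
    """
    average = moving_average(prices, window)
    if average is None:
        return "insufficient data"
    change = (prices[-1] - average) / average
    if change > tolerance:
        return "up"
    if change < -tolerance:
        return "down"
    return "stable"

# Hypothetical weekly spot prices in $/hour, oldest first.
weekly_spot = [1.42, 1.40, 1.36, 1.30]
print(moving_average(weekly_spot, 4), trend_direction(weekly_spot))
```

Calling `trend_direction(prices, window=12)` on the same series gives the 12-week view once enough history has accumulated.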
Tracking Across Providers
Systematically monitor all major providers offering the target GPUs:
- RunPod GPU pricing - Primary marketplace provider
- Lambda GPU pricing - Enterprise-focused offering
- CoreWeave GPU pricing - Infrastructure-native approach
- AWS GPU pricing - Production cloud integration
- Google Cloud GPU pricing - Alternative production option
- Vast AI pricing - Peer-to-peer marketplace
- NVIDIA reference pricing - Baseline comparison
Compare prices within GPU families (A100, H100, H200, B200) and across families. Price ratios reveal misalignments. If an H100 costs only 20% more than an A100, the H100 likely offers better value due to 2x performance.
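
The price-ratio reasoning above reduces to a small calculation. In the sketch below, `relative_cost` is a hypothetical helper and the prices are illustrative. The 2x speedup is the figure quoted above, and it should be replaced with a throughput measured on the actual workload.

```python
def relative_cost(price_a, price_b, speedup_b_over_a):
    """Cost of finishing the same work on GPU B relative to GPU A.

    A result below 1.0 means GPU B is cheaper per unit of work.
    """
    if price_a <= 0 or speedup_b_over_a <= 0:
        raise ValueError("price_a and speedup_b_over_a must be positive")
    return (price_b / price_a) / speedup_b_over_a

# Illustrative numbers: H100 priced 20% above an A100, assumed 2x faster.
ratio = relative_cost(price_a=1.00, price_b=1.20, speedup_b_over_a=2.0)
print(f"H100 cost per unit of work is {ratio:.0%} of the A100's")
```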
Implementing Automated Tracking
Manual spreadsheet tracking works for 5-10 GPU/provider combinations. Beyond that, automation becomes essential.
API-Based Approach:
Several providers expose pricing through an API, though some require an API key and endpoints change over time. Write a simple Python script that queries provider APIs weekly:
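import csv
import json
from datetime import datetime, timezone

import requests

# Endpoints below are illustrative. Confirm each provider's current
# pricing API (and any required API key) before relying on them.
providers = {
    'runpod': 'https://api.runpod.io/pricing',
    'lambda': 'https://cloud.lambdalabs.com/api/v1/instance-types',
    'coreweave': 'https://api.coreweave.com/pricing',
}

def fetch_prices():
    """Fetch the raw pricing payload from each provider."""
    data = []
    for provider, endpoint in providers.items():
        try:
            response = requests.get(endpoint, timeout=30)
            response.raise_for_status()
            prices = response.json()
        except (requests.RequestException, ValueError) as exc:
            # Skip a failing provider instead of losing the whole run.
            print(f'Skipping {provider}: {exc}')
            continue
        data.append({
            'date': datetime.now(timezone.utc).isoformat(),
            'provider': provider,
            'prices': prices,
        })
    return data

def store_historical_data(data):
    """Append one CSV row per provider, storing the payload as JSON."""
    with open('gpu_prices_history.csv', 'a', newline='') as f:
        writer = csv.writer(f)
        for row in data:
            writer.writerow([row['date'], row['provider'], json.dumps(row['prices'])])

if __name__ == '__main__':
    prices = fetch_prices()
    store_historical_data(prices)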
Schedule this script to run weekly via cron or cloud scheduler. Accumulate data in a database or CSV for trend analysis.
Spreadsheet Automation:
Google Sheets supports IMPORTDATA() functions that fetch pricing from web endpoints returning CSV or TSV. Some providers publish pricing in this format. Combine with QUERY() functions to filter and analyze.
Visualization Approach:
Graph price trends monthly. Use tools like Tableau, Looker, or even simple Excel charts. Visual trends reveal patterns faster than numerical tables.
Identifying Cost-Saving Opportunities
Analyze the tracking data for:
Provider and GPU Arbitrage: If teams can run workloads on more than one provider or GPU type, does moving from A100 to H100 or H200 actually save money? Track derived costs (per token, per training step) rather than raw hourly rates.
Time-Shifted Execution: Some providers offer significant off-peak discounts. If the workload has flexibility, shifting batch jobs to off-peak windows (weekends, nights in other time zones) yields 20-40% savings.
Commitment Discounts: Providers typically offer 20-30% discounts for annual commitments. Only commit after tracking prices for 3-4 weeks. Committing to high prices wastes the discount benefit.
Capacity Planning: If prices trend downward consistently, temporary over-provisioning while waiting for better rates sometimes costs less than spot instance interruptions and rescheduling overhead.
Pricing Anomalies to Investigate
Flag unusual patterns:
- Sudden price spikes: Usually indicate high demand or supply constraints. Temporary spikes rarely warrant changing providers; sustained increases do.
- Diverging provider prices: If a GPU costs 50% more on one provider than others, investigate capacity, SLA differences, or regional variation.
- Memory variants with unusual pricing: Sometimes 80GB and 40GB versions have unexpected price gaps. Understand if the larger memory justifies cost.
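
Two of these checks can be automated with a few lines of Python. The sketch below makes several assumptions: the median is used as the reference for "other providers", the 4-week baseline and 2-week confirmation window are arbitrary choices, and the function names and sample prices are hypothetical.

```python
from statistics import mean, median

def diverging_providers(prices_by_provider, threshold=0.50):
    """Providers priced more than `threshold` (50%) above the median price."""
    reference = median(prices_by_provider.values())
    return sorted(
        name for name, price in prices_by_provider.items()
        if price > reference * (1 + threshold)
    )

def sustained_increase(prices, baseline_weeks=4, recent_weeks=2, threshold=0.10):
    """True when every recent week sits above the earlier baseline by `threshold`.

    A single spike followed by a return to normal does not qualify.
    """
    if len(prices) < baseline_weeks + recent_weeks:
        return False
    baseline = mean(prices[-(baseline_weeks + recent_weeks):-recent_weeks])
    return all(p > baseline * (1 + threshold) for p in prices[-recent_weeks:])

# Hypothetical $/hour quotes for one GPU model.
quotes = {"provider_a": 1.30, "provider_b": 1.35, "provider_c": 2.20}
print(diverging_providers(quotes))
print(sustained_increase([1.00, 1.00, 1.00, 1.00, 1.30, 1.00]))  # one-week spike
print(sustained_increase([1.00, 1.00, 1.00, 1.00, 1.25, 1.30]))  # two weeks high
```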
Tracking Multiple Workload Types
Different workloads show different cost sensitivities:
Inference (cost-per-token critical): Track normalized token costs across providers. Test with the actual models; theoretical maximum throughput rarely materializes.
Training (throughput and memory critical): Monitor total cost per training run (hourly rate × GPU count × hours). Batch size, learning rate schedules, and hardware all affect total run time, so the hourly rate alone does not determine cost.
Development/Experimentation: Track free tier utilization and per-minute costs during testing. Identify whether free tiers justify staying on specific platforms.
Relevant Pricing Context
To understand pricing ratios, review LLM API pricing comparison for inference platform alternatives. Compare against OpenAI API pricing and Anthropic API pricing to understand whether managed services or self-hosted inference makes sense financially.
For hardware context, understanding NVIDIA H100 pricing and NVIDIA A100 pricing helps establish baseline expectations for GPU rental costs.
Building Forecasting Models
After 12 weeks of tracking, analyze seasonality and trends:
- Do prices dip at specific times (month-end, quarter-end)?
- Is there weekly variation (weekday vs. weekend)?
- Do specific events (new GPU release, hyperscaler announcements) cause price shifts?
Simple exponential smoothing or moving average models help forecast future prices with 70-80% accuracy for stable providers.
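
A minimal sketch of simple exponential smoothing, with a one-step-ahead backtest to measure how far forecasts miss on your own data, might look like this. The smoothing factor `alpha=0.5`, the function names, and the sample prices are assumptions. The backtest reports mean absolute percentage error, which is one possible way to define accuracy.

```python
def smoothed_forecast(prices, alpha=0.5):
    """Next-period forecast from simple exponential smoothing.

    `alpha` near 1 follows recent prices closely; near 0 it reacts slowly.
    """
    if not prices:
        raise ValueError("need at least one price")
    level = prices[0]
    for price in prices[1:]:
        level = alpha * price + (1 - alpha) * level
    return level

def backtest_mape(prices, alpha=0.5):
    """Mean absolute percentage error of one-step-ahead forecasts."""
    if len(prices) < 2:
        raise ValueError("need at least two prices to backtest")
    errors = [
        abs(smoothed_forecast(prices[:i], alpha) - prices[i]) / prices[i]
        for i in range(1, len(prices))
    ]
    return sum(errors) / len(errors)

# Hypothetical weekly prices in $/hour, oldest first.
history = [2.0, 2.2, 2.1]
print(round(smoothed_forecast(history), 4), round(backtest_mape(history), 4))
```

Running the backtest over a few values of `alpha` is a simple way to pick the factor that best fits each provider's history.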
Seasonal Patterns (March 2026 observations):
Q1 typically shows stability. Q2 sees gradual increases. Q3 shows peak pricing (summer crypto season). Q4 shows volatility (holiday computing spike, new GPU launches).
Plan major infrastructure decisions in Q1-Q2 when prices are relatively stable. Avoid multi-month commitments during Q3-Q4.
Provider-Specific Patterns:
- RunPod: Prices relatively stable, adjusting monthly based on hardware costs
- CoreWeave: More aggressive price competition, weekly micro-adjustments
- Lambda: Committed pricing with less variability
- Vast AI: Marketplace volatility, 5-15% daily variations normal
Understanding provider behavior helps predict when prices might stabilize.
Advanced Pricing Insights
Hardware Deprecation Curves:
New GPUs (B200, H200) command 50% premiums initially. As new generations release, prior-generation prices drop 20-30%. Timing infrastructure purchases around this cycle yields significant savings.
Availability as Pricing Signal:
When GPU availability is low (<50% of providers have stock), prices increase 10-20%. When abundant, prices drop 5-15%. Monitor availability to predict price movement.
Bulk Pricing Transparency:
Providers show list pricing but reserve bulk discounts for negotiation. Understanding what others pay helps establish a negotiating baseline. A 15-20% discount is a typical minimum for $100K+ annual commitments.
Time-of-Day Pricing Patterns:
Some providers show hourly variation within the day, with a 15-20% premium during peak US business hours. Early morning or overseas hours see 10-15% discounts, so scheduling batch work in those windows saves 10-15%.
Dashboard and Alert Setup
Google Sheets Automation:
Using the IMPORTDATA() and QUERY() functions, create a live pricing dashboard:
=IMPORTDATA("provider_api_endpoint_url")
Add conditional formatting to highlight price changes exceeding thresholds (e.g., red if price increases >5%, green if decreases >5%).
Slack Alerts:
Integrate with Slack using webhooks. Alert when:
- Specific GPU pricing increases >10%
- Specific GPU availability drops <30%
- Provider announces price changes
Automated alerts prevent missing important price movements.
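
A price-increase alert along these lines can be sketched with the standard library only. The threshold check is plain arithmetic. `post_to_slack` assumes a Slack incoming webhook, which accepts a JSON body with a `text` field. The function names and sample prices are hypothetical, and the webhook URL must come from your own Slack workspace.

```python
import json
import urllib.request

def price_alerts(previous, current, threshold=0.10):
    """Messages for GPUs whose price rose by more than `threshold` (10%)."""
    alerts = []
    for gpu, new_price in current.items():
        old_price = previous.get(gpu)
        if not old_price:
            continue  # no earlier price to compare against
        change = (new_price - old_price) / old_price
        if change > threshold:
            alerts.append(
                f"{gpu}: ${old_price:.2f} -> ${new_price:.2f} (+{change:.0%})"
            )
    return alerts

def post_to_slack(webhook_url, messages):
    """Send the messages to a Slack incoming webhook; returns the HTTP status."""
    payload = json.dumps({"text": "\n".join(messages)}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status

# Hypothetical prices in $/hour for two consecutive weeks.
last_week = {"H100 SXM": 2.50, "A100 SXM": 1.30}
this_week = {"H100 SXM": 2.80, "A100 SXM": 1.32}
messages = price_alerts(last_week, this_week)
print(messages)
```

In a scheduled job, call `post_to_slack(webhook_url, messages)` only when `messages` is non-empty, so quiet weeks produce no notification.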
Email Summaries:
Weekly digest email with:
- Price changes from prior week
- Ranked providers by cost
- 4-week trend analysis
- Recommended actions (commit now, wait, switch providers)
Digest format ensures visibility without constant monitoring.
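
The body of such a digest could be assembled roughly as follows. This is a sketch under assumptions: `weekly_digest` is a hypothetical helper, the provider names and prices are placeholders, and it covers only the price-change and ranking items. Sending the email is left to whatever mail tooling is already in place.

```python
def weekly_digest(last_week, this_week):
    """Rank providers by current price and show the week-over-week change."""
    lines = []
    ranked = sorted(this_week.items(), key=lambda item: item[1])
    for provider, price in ranked:
        previous = last_week.get(provider)
        if previous:
            change = f"{(price - previous) / previous:+.1%}"
        else:
            change = "new"  # no price recorded last week
        lines.append(f"{provider}: ${price:.2f}/hr ({change})")
    return "\n".join(lines)

# Placeholder $/hour prices for one GPU model.
last_week = {"ProviderA": 2.00, "ProviderB": 2.50}
this_week = {"ProviderA": 2.10, "ProviderB": 2.40, "ProviderC": 2.60}
digest = weekly_digest(last_week, this_week)
print(digest)
```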
Comparative Analysis Frameworks
Cost Per Operation Tracking:
Different GPUs have different efficiencies. Track cost per specific operation:
Cost per million tokens (inference): ($hourly_rate ÷ 3600) × 1M ÷ (tokens_per_second)
This normalizes across GPU types and enables true apples-to-apples comparison.
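
The formula translates directly into code. In this sketch the hourly rates are spot prices from the example table, while the throughput figures are hypothetical placeholders. Substitute tokens-per-second numbers measured with your own model and batch settings.

```python
def cost_per_million_tokens(hourly_rate, tokens_per_second):
    """Dollars per 1M generated tokens: (rate / 3600) * 1e6 / throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return hourly_rate / 3600 * 1_000_000 / tokens_per_second

# Spot rates from the example table; throughputs are assumed, not measured.
candidates = {
    "RunPod H100 SXM": (2.69, 1500),
    "CoreWeave A100 SXM": (1.33, 700),
}
for name, (rate, throughput) in candidates.items():
    cost = cost_per_million_tokens(rate, throughput)
    print(f"{name}: ${cost:.3f} per 1M tokens")
```

With these placeholder throughputs the more expensive H100 comes out slightly cheaper per token, which is the kind of result that raw hourly rates hide.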
Total Cost of Ownership:
Include non-hourly costs:
- Data transfer ($0.01-0.30 per GB egress)
- Setup/configuration time ($50-500 per deployment)
- Monitoring overhead ($100-500 monthly for reliable production operation)
- Switching costs (if considering provider change)
Real cost is often 10-20% higher than quoted GPU hourly rates.
Risk-Adjusted Cost:
Factor in interruption probability and financial impact of downtime.
Effective hourly cost = (hourly rate) + (interruption_probability_per_hour × cost_of_downtime_per_interruption)
For critical workloads, premium providers' higher rates are justified by lower interruption costs.
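
The risk adjustment can be sketched as a one-line function once the interruption probability is expressed per hour of runtime, which is the assumption made here. The spot and on-demand rates come from the RunPod A100 row in the example table. The interruption rates and the $12 cost per interruption are hypothetical.

```python
def risk_adjusted_hourly_cost(hourly_rate, interruptions_per_hour, cost_per_interruption):
    """Hourly rate plus the expected hourly cost of interruptions."""
    return hourly_rate + interruptions_per_hour * cost_per_interruption

# Assumed: spot is interrupted about once every 20 hours, on-demand once
# every 1000 hours, and each interruption costs $12 in lost work.
spot = risk_adjusted_hourly_cost(0.59, 0.05, 12.0)
on_demand = risk_adjusted_hourly_cost(1.39, 0.001, 12.0)
print(f"spot: ${spot:.3f}/hr, on-demand: ${on_demand:.3f}/hr")
```

Under these assumptions spot stays cheaper, but the gap is much narrower than the list prices suggest, and a higher cost per interruption would reverse the ranking.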
FAQ
How often should I actually update my price tracker?
Weekly is optimal for most teams. Daily updates introduce noise. Monthly updates miss short-term optimization opportunities. For very large deployments spending $100K+ monthly, twice-weekly updates might be justified.
Should I track every GPU model or just focus on the ones I use?
Start with the 3-4 GPU models matching your workload requirements. Once the tracking system stabilizes, expand to related models. Tracking A100, H100, and H200 together reveals when the price gap between them is small enough to make a newer GPU the better value.
What if a provider changes pricing mid-week?
Use the published price at weekly check-in time. If major shifts happen mid-week, note them in the "Notes" column for context. On-demand list prices typically change on a monthly cadence rather than continuously, so a weekly snapshot rarely misses them.
How do I handle regional price variation?
Track region separately in your template. Some providers charge 20-30% premiums for regions with lower capacity. If you can run workloads in multiple regions, this variation becomes critical for cost optimization.
Should I include reserved instance pricing in my tracking?
Yes, but separately from spot pricing. Calculate the effective hourly cost (total commitment ÷ hours committed) and track both numbers. Reserved instances should be cheaper per hour but lack flexibility.
Related Resources
- GPU Pricing Guide - Comprehensive pricing dashboard
- LLM API Pricing Comparison - Inference platform pricing context
- Best GPU Cloud in 2026 - Ranked provider comparison
- CoreWeave GPU Pricing - Provider-specific tracking
- Vast AI GPU Pricing - Marketplace pricing patterns
Sources
- DeployBase.AI GPU pricing database (as of March 2026)
- RunPod, Lambda Labs, CoreWeave official pricing (as of March 2026)
- AWS, Google Cloud, NVIDIA pricing documentation (as of March 2026)
- Infrastructure cost tracking case studies from 2026
- Community tools and monitoring platforms