AMD MI355X Cloud Pricing: Where to Rent and How Much It Costs

Deploybase · January 19, 2026 · GPU Pricing

Update (March 2026): The AMD MI355X is now generally available from Vultr and Oracle. Vultr offers 8x MI355X (2,304GB) pods at $18.32/hr and $20.72/hr. Oracle offers 8x MI355X (2,304GB) bare metal at $68.80/hr.

AMD MI355X Specifications

The AMD MI355X is a high-memory accelerator in AMD's Instinct lineup, available through Vultr and Oracle as of March 2026. It features 288GB HBM3e memory in 8-GPU configurations (2,304GB total per pod).
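
A rough way to sanity-check that headline memory figure is to compare a model's footprint against the pod's aggregate HBM. Below is a minimal sketch; the model size, quantization, and KV-cache budget are illustrative assumptions, not benchmarks.

```python
# Rough fit check: do a quantized model's weights plus KV cache fit in one
# 8x MI355X pod? Model figures are illustrative assumptions.
GPU_MEM_GB = 288
GPUS_PER_POD = 8
pod_mem_gb = GPU_MEM_GB * GPUS_PER_POD   # 2,304GB aggregate HBM3e

params_billions = 405    # assumed model size (e.g. a 405B-parameter LLM)
bytes_per_param = 1      # FP8/INT8 weights: ~1GB per billion parameters
kv_cache_gb = 300        # assumed serving-time KV-cache budget

weights_gb = params_billions * bytes_per_param
total_gb = weights_gb + kv_cache_gb
print(f"{weights_gb}GB weights + {kv_cache_gb}GB KV cache = {total_gb}GB "
      f"vs {pod_mem_gb}GB pod -> fits: {total_gb <= pod_mem_gb}")
```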

Memory bandwidth is 2.4 TB/s, solid for token generation. The card works for both inference and training, though cost-effectiveness differs between the two workloads.
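
Why bandwidth matters for token generation: at small batch sizes, decoding is memory-bound, so each generated token has to stream the model weights from HBM. Here is a back-of-envelope roofline sketch, with an assumed model size and quantization:

```python
# Decode roofline: bandwidth / weight bytes bounds single-batch throughput.
BANDWIDTH_BYTES_PER_S = 2.4e12   # MI355X: 2.4 TB/s, per the spec above
PARAMS = 70e9                    # assumed 70B-parameter model
BYTES_PER_PARAM = 1              # FP8 weights

tokens_per_s = BANDWIDTH_BYTES_PER_S / (PARAMS * BYTES_PER_PARAM)
print(f"Upper bound: ~{tokens_per_s:.0f} tokens/s per GPU at batch size 1")
```

Real throughput lands below this bound and rises with batching; treat it as an orientation figure, not a quote.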

Cloud Pricing Breakdown

Current Market Rates (as of March 2026):

Provider   Config           VRAM      Price/hr
Vultr      8x MI355X        2,304GB   $18.32
Vultr      8x MI355X        2,304GB   $20.72
Oracle     8x MI355X (BM)   2,304GB   $68.80

Vultr offers the most accessible pricing at $18.32-$20.72/hr for 8-GPU pods. Oracle's bare metal option at $68.80/hr offers maximum hardware isolation, but at a significant premium.
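
For budgeting, those hourly rates translate to monthly figures as follows; this sketch assumes ~730 billable hours per month and no committed-use discounts:

```python
# Monthly cost at full utilization for the pods in the table above.
HOURS_PER_MONTH = 730
rates = {"Vultr (low)": 18.32, "Vultr (high)": 20.72, "Oracle bare metal": 68.80}

for provider, rate in rates.items():
    print(f"{provider}: ${rate * HOURS_PER_MONTH:,.0f}/month")
```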

Provider Availability

CoreWeave: Strong MI355X inventory in US-East and Europe regions. Support for multi-GPU clusters up to 8x MI355X per instance. Reserved instances available with immediate provisioning.

Lambda Labs: Limited availability, primarily in US regions. Instances typically deploy within 2-4 hours. Better for development and testing than production-scale inference.

Vast.AI: Community-driven provider with intermittent MI355X availability. Spot pricing offers best rates but less reliability. Suitable for fault-tolerant batch workloads only.

OVH Cloud: European-focused provider with emerging MI355X support. Pricing is competitive at $7.10/hour, but instance configuration options are limited.

Compare these against NVIDIA H100 rental options, which remain more readily available and have established tooling ecosystems.

NVIDIA Comparison

The NVIDIA H100 SXM still dominates cloud GPU markets despite AMD's advances. H100 pricing ranges from $2.95 to $4.20 per GPU-hour depending on the provider, making NVIDIA more cost-effective for many inference workloads.

The MI355X has less memory bandwidth than the H100 (2.4 TB/s vs. 3.35 TB/s), but its lower cost can make per-token pricing competitive for specific models, particularly at INT8 or FP8 quantization.

Cost analysis requires careful attention to several factors; a per-token cost sketch follows the list:

  • Model quantization requirements (MI355X excels with INT8 and FP8)
  • Token throughput targets (validate via benchmark runs)
  • Total memory requirements (the MI355X's 208GB advantage over the H100's 80GB matters for large models)
  • Regional provider density (availability affects pricing negotiations)
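
The sketch below turns an hourly pod price and a measured throughput into dollars per million tokens. The throughput figures are placeholder assumptions; substitute your own benchmark numbers before drawing conclusions.

```python
# $/1M tokens from hourly price and sustained throughput.
def cost_per_million_tokens(price_per_hour: float, tokens_per_second: float) -> float:
    return price_per_hour / (tokens_per_second * 3600) * 1_000_000

# 8x MI355X pod at $18.32/hr (table above) vs. a hypothetical 8x H100 node
# at $23.60/hr ($2.95 per GPU-hour x 8, per the range quoted above).
mi355x = cost_per_million_tokens(18.32, tokens_per_second=9_000)
h100 = cost_per_million_tokens(23.60, tokens_per_second=10_000)
print(f"MI355X: ${mi355x:.3f}/1M tokens   H100: ${h100:.3f}/1M tokens")
```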

See the complete GPU pricing guide for detailed comparative analysis across all providers.

Cost Optimization

Spot pricing on Vast.AI hits $4.80/hour. Works for batch jobs, not production.

Reserved instances: CoreWeave's annual plan drops to $4.25/hour. Good for 500+ hours monthly.
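
A quick break-even sketch for the reservation decision, assuming the annual plan bills every hour of the term while on-demand bills only for hours used; it lands near the 500-hour rule of thumb above:

```python
# Break-even utilization: reserved ($4.25/hr, billed always) vs. on-demand
# ($7.20/hr, billed per hour used), using the CoreWeave figures above.
RESERVED_RATE = 4.25
ON_DEMAND_RATE = 7.20
HOURS_PER_MONTH = 730

break_even = RESERVED_RATE / ON_DEMAND_RATE * HOURS_PER_MONTH
print(f"Reservation wins above ~{break_even:.0f} hours/month "
      f"({RESERVED_RATE / ON_DEMAND_RATE:.0%} utilization)")
```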

Multi-GPU clusters carry volume discounts. Two MI355X GPUs often cost less per token than one.

Negotiate at $50K+ monthly spend. Expect 20-35% discounts from posted rates.

European instances cost 15% less (OVH, Equinix EU). Use if latency isn't critical.

FAQ

Q: Which cloud provider offers the best MI355X pricing? CoreWeave currently leads in both pricing ($7.20/hour on-demand) and availability, with the most flexible configuration options. For cost-conscious development, spot pricing on Vast.AI reaches $4.80/hour but lacks production guarantees.

Q: How does MI355X memory bandwidth compare to H100? H100 delivers 3.35 TB/s while MI355X provides 2.4 TB/s. Despite lower bandwidth, MI355X's superior cache design and INT8 performance make it competitive for quantized model inference.

Q: Are MI355X instances suitable for production inference? Yes, but primarily through CoreWeave or Lambda Labs. OVH and Vast.AI involve higher operational risk for critical services due to inventory volatility.

Q: What discount applies to annual MI355X commitments? Most providers offer 30-45% discounts for annual reservations. CoreWeave's annual plan reaches $4.25/hour, representing a 41% reduction from on-demand rates.

Q: Can I run multiple models simultaneously on one MI355X? Yes. The GPU's 288GB of memory accommodates multiple models. Tensor parallelism or pipeline parallelism allows distributed inference across multiple MI355X units for larger deployments.
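
For the multi-GPU case, here is a minimal serving sketch using tensor parallelism in vLLM; this assumes a ROCm-enabled vLLM build, and the model name is illustrative:

```python
# Tensor-parallel inference across an 8-GPU pod with vLLM (ROCm build assumed).
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-70B-Instruct", tensor_parallel_size=8)
params = SamplingParams(max_tokens=64)
outputs = llm.generate(["Summarize HBM3e in one sentence."], params)
print(outputs[0].outputs[0].text)
```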

Sources

  • CoreWeave GPU Pricing Dashboard (March 2026)
  • Lambda Labs Rate Cards (March 2026)
  • Vast.AI Pricing Index (March 2026)
  • AMD MI355X Technical Specifications
  • Industry GPU Rental Market Analysis