Contents
- A100 Paperspace: Gradient Notebooks for Interactive ML Development
- Paperspace A100 Pricing
- Detailed Setup and Availability
- Gradient Notebook Environment
- Workload Optimization for Paperspace A100
- Comparing Paperspace A100 to Alternatives
- Dataset Management and Data Transfer
- A100 Inference on Paperspace
- FAQ
- Sources
A100 Paperspace: Gradient Notebooks for Interactive ML Development
Paperspace (now a DigitalOcean subsidiary) provides A100 GPU access primarily through Gradient Notebooks and managed machines. Pricing starts at $3.09/hr for the A100 40GB and $3.18/hr for the A100 80GB, with noticeably better availability than its H100 offerings. Roughly 30-60 A100 instances are typically available globally, making Paperspace viable for interactive development and short-term experiments.
This guide covers Paperspace's A100 offerings, Gradient environment, availability management, and when Paperspace suits ML workflows.
Paperspace A100 Pricing
Paperspace's pricing emphasizes straightforward hourly rates with optional monthly discounts.
A100 Pricing Tiers and Analysis
| Plan | Hourly | Monthly (730 hrs) | Storage | Effective Cost | Best Use |
|---|---|---|---|---|---|
| A100 40GB On-Demand | $3.09 | $2,256 | 20GB | $3.09/hr | Temporary experiments |
| A100 80GB On-Demand | $3.18 | $2,321 | 20GB | $3.18/hr | Large model workloads |
| A100 Monthly | Committed | Lower effective rate | 100GB | ~$2.60-2.80/hr | Month-long projects |
| Gradient Notebooks | $0.51/hr | $372/month | 10GB | $0.51/hr (limited GPU) | Interactive development |
Gradient Notebook pricing (~$0.51/hr for A100 access) looks cheaper than on-demand instances, but it carries significant overhead: limited GPU utilization (kernel execution is shared with other notebooks) and slower I/O. Full-instance pricing reflects dedicated single-GPU access.
A100 Performance on Paperspace
| Workload | Throughput |
|---|---|
| 7B Model Inference | 50-60 tokens/sec |
| 13B Model Inference | 30-40 tokens/sec |
| Training (LoRA 7B) | 380-420 tokens/sec |
Performance matches other providers (RunPod, Lambda) since the underlying hardware is identical.
Detailed Setup and Availability
Launching Paperspace A100: Step-by-Step
- Access Paperspace console at https://www.paperspace.com/console
- Click "Create Notebook" or "Create Machine"
- Filter by GPU: Select "A100" (40GB or 80GB variant)
- Choose machine type: A100 (Full instance) or Gradient Notebook (shared)
- Select template: PyTorch, TensorFlow, or Custom
- Configure:
- Notebook name / Machine name
- Billing: Hourly or Monthly
- Machine: "A100 (40GB)" selected
- Click "Start" and wait 2-5 minutes for provisioning
- Access Jupyter or SSH once running
A100 Availability Patterns
Paperspace maintains approximately 30-60 A100 instances globally (versus 5-10 H100s), so A100 availability is significantly better:
Availability by Time
| Window | Status | Availability | Quality |
|---|---|---|---|
| Peak (9-17 UTC) | Constrained | 50-70% | Standard |
| Off-peak (18-8 UTC) | Available | 90%+ | Variable (including shared instances) |
| Weekends | Mixed | 80%+ | Improved availability |
Off-peak wait times for an A100 rarely exceed 2-3 hours. Peak hours sometimes show full capacity, requiring hourly spot checks.
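One rough way to budget for peak-hour spot checking: if each hourly check independently finds capacity with probability p (about 0.5-0.7 at peak per the table above), the expected number of checks before success is 1/p. This geometric model is an assumption, not measured behavior:

```python
def expected_wait_hours(p_available: float, check_interval_h: float = 1.0) -> float:
    """Expected wait until a capacity check succeeds, assuming independent
    checks that each succeed with probability p (geometric model)."""
    if not 0 < p_available <= 1:
        raise ValueError("p_available must be in (0, 1]")
    return check_interval_h / p_available

# Peak-hour availability of 50-70% implies roughly 1.4-2 hourly checks:
print(round(expected_wait_hours(0.5), 2))  # 2.0
print(round(expected_wait_hours(0.7), 2))  # 1.43
```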
Cost Optimization for Variable Availability
When no A100 is available, Paperspace offers the A6000 as a fallback (an older GPU at roughly 40% lower cost). Workflow:
```python
# Sketch: check_paperspace_availability() and select_gpu() are hypothetical
# helpers standing in for your own availability check and launch logic.
availability = check_paperspace_availability()
if availability["a100_available"]:
    select_gpu("A100")   # $3.09-3.18/hr
else:
    select_gpu("A6000")  # ~$1.89/hr, roughly 80% of A100 performance on many workloads
```
Using the A6000 as a fallback lowers Paperspace's effective cost when the A100 is unavailable: the A6000 at ~$1.89/hr combined with the A100 80GB at $3.18/hr produces a blended rate that depends on the availability mix.
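A small helper makes the blended rate concrete, using the document's rates ($3.18/hr for the A100 80GB, ~$1.89/hr for the A6000); adjust the defaults for your account:

```python
def blended_hourly_rate(a100_fraction, a100_rate=3.18, fallback_rate=1.89):
    """Blended hourly cost when A100 capacity is available only part of the
    time and work falls back to an A6000 for the remainder."""
    if not 0 <= a100_fraction <= 1:
        raise ValueError("a100_fraction must be in [0, 1]")
    return a100_fraction * a100_rate + (1 - a100_fraction) * fallback_rate

# 70% of hours on the A100, 30% on the A6000 fallback:
print(round(blended_hourly_rate(0.7), 2))  # 2.79
```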
Gradient Notebook Cost-Benefit Analysis
Gradient Notebooks at $0.51/hr appear attractive but carry hidden costs:
| Factor | Notebook Impact | Cost |
|---|---|---|
| Limited GPU time-slicing | 30-40% lower effective throughput | +$0.15-0.20/hr |
| Shared kernel overhead | Slower batch processing | +$0.10/hr |
| Storage constraints (10GB) | Frequent cache clearing | +$0.05/hr |
| Effective actual cost | ($0.51 + overhead) | $0.71-0.86/hr |
When shared overhead is accounted for, the Notebook's "cheap" $0.51/hr becomes roughly $0.71-0.86/hr in effective cost, narrowing the gap to full-instance pricing. A full instance remains preferable for production training despite the higher hourly rate: better throughput justifies the cost.
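The effective-cost figures above can be cross-checked by dividing the sticker rate by relative throughput: a 30-40% throughput penalty turns $0.51/hr into roughly $0.73-0.85 per hour of equivalent dedicated-GPU work.

```python
def effective_hourly_cost(sticker_rate, throughput_penalty):
    """Hourly cost per unit of dedicated-GPU work, given a fractional
    throughput penalty (e.g. 0.35 means 35% slower than a full instance)."""
    return sticker_rate / (1 - throughput_penalty)

print(round(effective_hourly_cost(0.51, 0.30), 2))  # 0.73
print(round(effective_hourly_cost(0.51, 0.40), 2))  # 0.85
```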
Optimal Workflow: Hybrid Development and Production
Recommended workflow for cost optimization:
- Experiment phase (2-4 hours): Use Paperspace Notebook ($0.51/hr) for quick prototyping
- Validation phase (4-12 hours): Use A100 monthly plan when available
- Fallback: If A100 unavailable, switch to RunPod A100 Spot ($0.60/hr average)
- Production deployment: Use Lambda A100 reserved for 99%+ uptime
Total blended cost across workflow: ~$1.00/hr average, competitive with all other providers.
Gradient Notebook Environment
Interactive Development Interface
Paperspace's Gradient provides a Jupyter-like IDE accessible via browser with pre-installed ML libraries.
Key Features:
- Pre-installed PyTorch, TensorFlow, JAX, scikit-learn
- Terminal for custom package installation
- File browser for dataset management
- Git integration for version control
- Collaborative notebook sharing with team members
```python
import torch
from transformers import pipeline

# Run the pipeline on the GPU when one is visible, else fall back to CPU
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    device=0 if torch.cuda.is_available() else -1,
)

print(classifier("This is a great product!"))
```
Storage and Persistence
Gradient provides:
- 20GB persistent storage (hourly instances)
- 100GB persistent storage (monthly instances)
- Access to upload/download external files
- S3 integration for larger datasets
Workload Optimization for Paperspace A100
Notebook-Based Workflows
Paperspace excels for:
- Interactive model prototyping and hyperparameter tuning
- Dataset exploration and visualization
- Quick inference testing on new models
- ML research and experimentation
Avoid deploying production services directly on Paperspace. Instead, develop and test notebooks on Paperspace, then migrate to dedicated infrastructure.
Session Management
Paperspace notebook sessions terminate after 6 hours of inactivity. For longer work sessions, periodically save outputs and checkpoint state:
```python
import time
import torch

# Assumes `model` and `num_epochs` come from your training setup
save_interval = 900  # seconds (15 minutes)
last_save = time.time()

for epoch in range(num_epochs):
    # ... training code ...
    if time.time() - last_save > save_interval:
        torch.save(model.state_dict(), '/storage/checkpoint.pt')
        last_save = time.time()
        print(f"Checkpoint saved at epoch {epoch}")
```
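When a session is reclaimed, training can pick up from the last checkpoint. A minimal resume sketch, assuming checkpoints are saved as a dict holding model state, optimizer state, and the epoch number (a slight extension of the snippet above, not Paperspace-specific):

```python
import os
import torch

CKPT_PATH = "/storage/checkpoint.pt"  # /storage persists across sessions

def save_checkpoint(model, optimizer, epoch, path=CKPT_PATH):
    """Write model + optimizer state plus the completed epoch number."""
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "epoch": epoch}, path)

def load_or_init(model, optimizer, path=CKPT_PATH):
    """Resume from a checkpoint if one exists; otherwise start at epoch 0."""
    if os.path.exists(path):
        ckpt = torch.load(path, map_location="cpu")
        model.load_state_dict(ckpt["model"])
        optimizer.load_state_dict(ckpt["optimizer"])
        return ckpt["epoch"] + 1  # resume at the next epoch
    return 0
```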
Comparing Paperspace A100 to Alternatives
Paperspace vs RunPod
| Criteria | Paperspace | RunPod |
|---|---|---|
| Hourly Rate | $3.09-3.18 | $1.19 |
| Availability | Good | Excellent |
| Multi-GPU Clusters | No | Limited |
| Notebooks | Excellent | Community |
| Support | Chat support | Community |
RunPod costs significantly less at $1.19/hr vs Paperspace's $3.09-3.18/hr, and offers better spot pricing. Paperspace excels for interactive notebook-based development.
Paperspace vs Lambda
Lambda A100 at $1.48 on-demand costs similar to Paperspace but offers dedicated capacity and multi-GPU clusters. Choose Paperspace for interactive development; choose Lambda for production clusters.
For cost-sensitive batch workloads, see RunPod spot pricing and Vast.ai peer-to-peer marketplace. For Kubernetes-native production, check CoreWeave clusters.
Dataset Management and Data Transfer
Uploading Training Data
For datasets under 50GB, upload through Paperspace's web interface. Larger datasets require alternative approaches:
```shell
rsync -avz --progress /local/dataset/ \
    username@machine.paperspace.com:/storage/dataset/
```
S3 Integration for Large Datasets
For datasets exceeding 100GB, store on S3 and access within notebooks:
```python
import boto3
import pandas as pd
import smart_open

# Option 1: download an archive to local disk first
s3 = boto3.client('s3')
s3.download_file('my-bucket', 'large_dataset.tar.gz', '/tmp/data.tar.gz')

# Option 2: stream a CSV directly from S3 without a full download
with smart_open.open('s3://my-bucket/data.csv') as f:
    data = pd.read_csv(f)
```
A100 Inference on Paperspace
Throughput Expectations
A100 instances on Paperspace deliver competitive inference throughput (figures vary with precision, batch size, and serving stack):
- 7B-parameter model: 100-120 tokens/second (batch size 1)
- 13B-parameter model: 60-75 tokens/second (batch size 1)
- Batch inference (size 8): 3-4x throughput improvement
Cost per token at $3.18/hr: roughly $0.0000177 per token, or about $0.018 per 1,000 tokens (assuming 50 tokens/second).
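A quick sanity check on the per-token economics (the rates and throughputs are the document's figures, not guarantees):

```python
def cost_per_1k_tokens(hourly_rate, tokens_per_second):
    """Dollar cost per 1,000 generated tokens at a given hourly GPU rate."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate / tokens_per_hour * 1000

print(round(cost_per_1k_tokens(3.18, 50), 4))   # 0.0177
print(round(cost_per_1k_tokens(3.18, 100), 4))  # 0.0088
```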
Model Serving Best Practices
For production inference, validate the model in a Paperspace notebook, then export the weights to persistent storage for deployment on a dedicated provider:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b")

# Quick sanity check before export
inputs = tokenizer("Hello, my name is", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_length=100)

# Save weights to /storage so they persist for download and transfer
torch.save(model.state_dict(), '/storage/llama-7b-final.pt')
```
FAQ
Should I use Paperspace A100 for production ML services?
No. Paperspace's strength is interactive development and experimentation. For production inference, migrate to RunPod ($1.19/hr) or Lambda ($1.48/hr) for guaranteed uptime SLAs.
How does Paperspace A100 monthly pricing compare to on-demand?
A100 monthly plans provide a discount over hourly on-demand rates of $3.09/hr (40GB) and $3.18/hr (80GB). Monthly plans suit projects lasting 3-4 weeks; shorter work favors on-demand. RunPod remains significantly cheaper at $1.19/hr for cost-sensitive workloads.
Can I export Paperspace notebooks and run them elsewhere?
Yes. Download notebooks as .ipynb files from Paperspace. The code is typically portable: Python/PyTorch notebooks run identically on RunPod or Lambda with minimal modification (removing Gradient-specific packages). This portability makes Paperspace excellent for development before production deployment.
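For example, a downloaded notebook can be converted to a plain script, or executed headlessly, with Jupyter's standard `nbconvert` tool (the filename here is illustrative):

```shell
# Convert a downloaded notebook to a runnable Python script
jupyter nbconvert --to script my_experiment.ipynb   # writes my_experiment.py

# Or execute the notebook end-to-end on the new machine
jupyter nbconvert --to notebook --execute my_experiment.ipynb
```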
When should I choose Paperspace A100 versus RunPod or Lambda?
Choose Paperspace for: (1) Interactive notebook development with immediate feedback, (2) Team collaboration (Gradient notebook sharing), (3) Quick experiments when availability good. Choose RunPod/Lambda for: (1) Production inference requiring 99%+ uptime, (2) Sustained training (12+ hours), (3) Cost-optimized batch processing. Paperspace excels for development; dedicated providers excel for production.
What cost optimization strategies apply to Paperspace A100?
(1) Use monthly plans for savings vs hourly on-demand rates ($3.09-3.18/hr), (2) Choose on-demand for <40-hour projects, (3) Use Gradient Notebooks ($0.51/hr) for development but upgrade to full instance for production training, (4) Schedule off-peak usage (0-4 UTC) for better availability and less queue time.
Sources
- Paperspace Pricing: https://www.paperspace.com/pricing
- Paperspace Gradient Documentation: https://docs.paperspace.com/gradient/
- DigitalOcean GPU Products: https://www.digitalocean.com/products/gpu/
- NVIDIA A100 Data Sheet: https://www.nvidia.com/en-us/data-center/a100/