Pricing & Bidding
Pricing is the mechanical heart of the marketplace: two rental modes, a real-time bid mechanism, and market-driven pricing that responds hour by hour to GPU generation cycles and available supply.
On-demand vs interruptible
Every Vast instance is offered in one of two modes:
On-demand
The client pays the listed price; the instance runs until the client stops it (or hardware fails). The host commits to not interrupting it.
- Higher price.
- Predictable — you keep the instance.
- The right mode for: training runs you don't want interrupted, production-ish inference, anything where restart cost is high.
Interruptible (also called spot / preemptible)
The client places a bid. The provider runs the highest-bidding job at any given moment. If a higher bidder appears, the current job is preempted (killed) and the higher bidder takes the instance.
- Often around half the on-demand price for the same hardware, and sometimes less.
- Unpredictable — your job might get killed.
- The right mode for: workloads with checkpointing, big batch jobs that can resume, anything where you can absorb a kill.
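What "absorbing a kill" means in practice can be sketched as a loop that saves its progress periodically and resumes from the last save. This is a minimal illustration, not Vast's API: it assumes the termination notice arrives as a POSIX `SIGTERM` (the text only says "termination signal"), and the checkpoint file name, `run`, and the step counter standing in for real training state are all hypothetical.

```python
import json
import os
import signal

stop_requested = False

def _request_stop(signum, frame):
    # Only set a flag here; the loop reacts at a safe point.
    global stop_requested
    stop_requested = True

def install_handler():
    # Assumption: preemption is delivered as SIGTERM.
    signal.signal(signal.SIGTERM, _request_stop)

def save_checkpoint(path, state):
    # Write to a temp file, then rename, so a kill mid-write
    # never leaves a truncated checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)

def load_checkpoint(path):
    # Resume from the last save, or start fresh.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0}

def run(path, total_steps, checkpoint_every=100):
    state = load_checkpoint(path)
    while state["step"] < total_steps and not stop_requested:
        state["step"] += 1  # placeholder for one real training step
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)  # final save, also reached on preemption
    return state["step"]
```

A job would call `install_handler()` once at startup and then `run("checkpoint.json", total_steps=...)`. After a preemption, launching the same command on a new instance picks up from the saved step, provided the checkpoint lives on storage that survives the instance.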
The bid system is what makes Vast's pricing genuinely market-driven. Demand surges raise the equilibrium bid; supply gluts lower it. Prices move fast.
The bid mechanic
The mechanic is a continuous auction run separately for each instance. The provider sets a minimum bid (the floor), clients submit bids at or above it, and the highest bid wins.
If you're a client on an interruptible instance and you've bid $1.50/hour, and another client bids $1.80/hour:
- The provider's system preempts your job.
- Your job receives a termination signal (typically 30-60 seconds of advance notice, depending on configuration).
- The provider's instance switches to running the higher bidder's container.
- You can re-bid higher if you want to reclaim the instance.
This is how providers price-discover: by accepting whatever the market will pay, with the floor as a safety net to prevent giveaway prices. Competition among clients for the most attractive instances drives prices up; competition among providers for clients drives prices down.
In practice, the equilibrium bid often clusters within a tight band, say 1.5-2x the floor, because once a running job is paying that much, new bidders see that they must pay a similar amount to take the instance.
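The preemption rule described in this section can be modelled in a few lines. This is a toy sketch of the logic only, not Vast's implementation: the `Bid` and `InterruptibleInstance` names are hypothetical, and it assumes a bid exactly at the floor is accepted and that ties go to the job already running.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Bid:
    client: str
    price: float  # dollars per hour

class InterruptibleInstance:
    def __init__(self, floor: float):
        self.floor = floor                    # provider's minimum bid
        self.running: Optional[Bid] = None    # bid currently holding the instance

    def submit(self, bid: Bid) -> Tuple[str, Optional[str]]:
        """Return (status, name of the preempted client or None)."""
        if bid.price < self.floor:
            return ("rejected", None)
        if self.running is None:
            self.running = bid
            return ("started", None)
        if bid.price > self.running.price:
            loser = self.running
            self.running = bid                # higher bidder takes the instance
            return ("started", loser.client)
        return ("waiting", None)              # not high enough to preempt

# The scenario from the text: you bid $1.50, a rival bids $1.80,
# then you re-bid higher to reclaim the instance.
instance = InterruptibleInstance(floor=1.00)
events = [
    instance.submit(Bid("you", 1.50)),
    instance.submit(Bid("rival", 1.80)),
    instance.submit(Bid("you", 2.00)),
]
```

Each successful takeover raises the price a newcomer has to beat, which is one way to see why bids settle into a band above the floor instead of sitting on it.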
What pricing looks like
Approximate prices (illustrative; they vary by hour, region, and provider):
| Card | Vast on-demand | Vast interruptible | AWS/Azure on-demand (list) | Vast on-demand discount vs. list |
|---|---|---|---|---|
| RTX 3090 | $0.20-0.30/hr | $0.10-0.15/hr | N/A (not offered) | — |
| RTX 4090 | $0.30-0.45/hr | $0.15-0.25/hr | N/A (not offered) | — |
| A100 80GB | $1.00-1.40/hr | $0.40-0.70/hr | $4-5/hr | ~70% |
| H100 80GB | $2.00-3.00/hr | $1.20-2.00/hr | $8-12/hr | ~70% |
| H200 | $2.50-3.50/hr | $1.50-2.50/hr | $10-14/hr | ~70% |
Three observations:
- Consumer cards (3090, 4090) have no hyperscaler equivalent at all. GPU marketplaces such as Vast are the main place to rent them.
- Datacenter cards (A100, H100, H200) on Vast are consistently 60-80% off list hyperscaler pricing. Reserved hyperscaler pricing narrows that gap somewhat (to 40-60% off) but rarely closes it.
- Pricing relative to other marketplaces (RunPod, TensorDock) is closer — within 10-30% — because marketplaces share the same supply-side cost basis.
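The discount figures can be checked directly against the illustrative ranges in the table. The sketch below compares Vast on-demand with hyperscaler list prices; the function name `discount_band` is hypothetical, and the inputs are the table's approximate ranges, not live quotes.

```python
# (low, high) $/hr ranges copied from the table: (Vast on-demand, hyperscaler list)
PRICES = {
    "A100 80GB": ((1.00, 1.40), (4.0, 5.0)),
    "H100 80GB": ((2.00, 3.00), (8.0, 12.0)),
    "H200": ((2.50, 3.50), (10.0, 14.0)),
}

def discount_band(vast, hyperscaler):
    """Return (smallest, midpoint, largest) discount as fractions."""
    vast_lo, vast_hi = vast
    hyp_lo, hyp_hi = hyperscaler
    smallest = 1 - vast_hi / hyp_lo   # priciest Vast vs cheapest hyperscaler
    midpoint = 1 - (vast_lo + vast_hi) / (hyp_lo + hyp_hi)
    largest = 1 - vast_lo / hyp_hi    # cheapest Vast vs priciest hyperscaler
    return smallest, midpoint, largest

for card, (vast, hyperscaler) in PRICES.items():
    lo, mid, hi = discount_band(vast, hyperscaler)
    print(f"{card}: {lo:.0%} to {hi:.0%} off, midpoint {mid:.0%}")
```

For the A100 row this works out to a discount between 65% and 80%, with a midpoint near 73%, which is where the "~70%" column and the 60-80% range come from.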
Why Vast is cheap
Three reasons explain the 60-80% gap to hyperscalers:
1. Lower cost basis
A small operator buying H100s in 2023 might have paid $25-30k each. A hyperscaler is buying in volume at lower per-unit prices but also has datacenter cap-ex, networking infrastructure, redundancy systems, and significant labor overhead. Net cost-per-GPU-hour at a small operator is competitive with — sometimes lower than — a hyperscaler's.
2. Lower margin requirements
Hyperscaler GPU pricing reflects strategic margins on top of cost. The price isn't trying to be the minimum the cloud could charge; it's trying to maximize willingness-to-pay from a customer base with limited alternatives. Vast's marketplace doesn't allow that — competition among providers drives margins toward the marginal cost of operation.
3. No bundled services
Hyperscaler GPU prices include the implicit price of integration with their ecosystem (S3, networking, IAM, identity, compliance, support). Vast doesn't bundle any of that, so the price is closer to the raw compute cost.
The gap is structural, not temporary. It will narrow somewhat as hyperscalers respond to competition (Azure has reduced some GPU pricing in 2025), but the fundamental cost-of-operation gap is real and durable.
Price volatility
Vast prices move. Some patterns:
- New GPU generation launches. When the H100 launched, A100 prices on Vast dropped meaningfully as providers shifted to selling the newer cards. When the B200 ramps, expect H100 prices to do the same.
- Demand surges. Big open-source model releases (Llama, Mixtral, DeepSeek) drive bursts of fine-tuning demand, and H100 prices spike.
- Seasonal patterns. Conference deadlines (NeurIPS / ICML / ICLR) drive academic demand surges.
- Crypto market interactions. When crypto prices rise and certain GPU-friendly coins are profitable to mine, supply on Vast shrinks as providers redirect hardware. Prices rise.
- Reserved-deal news. When CoreWeave or Crusoe announces a big multi-year deal, the residual public market tightens — some demand that would have gone to those clouds flows to Vast.
For a buyer, the volatility is a feature: you can shop opportunistically and find good deals. For a planner trying to budget six months out, the volatility is a bug: it's hard to forecast Vast's contribution to monthly spend.
Buyer strategy
Practical strategies clients use to get the best deal:
- Use interruptible for resumable workloads. If your training loop checkpoints every N steps, you can absorb preemptions and run at roughly half the cost of on-demand.
- Filter on DLPerf score, not just price. The cheapest instance with mediocre DLPerf often costs more in total than a moderately priced instance with high DLPerf, because the slower instance takes longer to finish the same job.
- Bid above the floor by a healthy margin. Setting your bid at the floor means you're first to be preempted. Pad by 20-50% if you want to reduce churn.
- Pre-build your Docker image. Image pull time eats your billed minutes. Bake the image and push it to a registry; instance launches in seconds, not minutes.
- Watch the market for a few days before committing to longer runs. Prices fluctuate; you can sometimes catch dips.
- Diversify across providers. Don't rely on a single host for your work. If you have multiple short jobs, run them on different providers to spread the risk.
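Two of these strategies reduce to simple arithmetic: comparing instances on cost per job instead of cost per hour, and padding a bid above the floor. The sketch below assumes job throughput scales linearly with DLPerf, which is a simplification; the function names, the "work units" measure, and the example offers are all hypothetical.

```python
def job_cost(price_per_hour, dlperf, work_units):
    """Total cost of a job, assuming throughput is proportional to DLPerf."""
    hours = work_units / dlperf
    return hours * price_per_hour

def padded_bid(floor, pad=0.30):
    """Bid above the floor by a fractional margin (the text suggests 20-50%)."""
    return round(floor * (1 + pad), 2)

# Two hypothetical offers for the same job of 400 work units.
cheap_slow = job_cost(price_per_hour=0.30, dlperf=20, work_units=400)
pricier_fast = job_cost(price_per_hour=0.40, dlperf=40, work_units=400)

# Hypothetical floor of $1.00/hr, padded by 30%.
bid = padded_bid(1.00, pad=0.30)
```

In this made-up case the $0.30/hr instance needs 20 hours and costs about $6, while the $0.40/hr instance finishes in 10 hours for about $4, so the higher hourly price is the cheaper job.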
Provider strategy
From the provider side:
- Price competitively at first to build a reputation. New hosts have no rating; pricing slightly below market gets first renters and starts the rating cycle.
- Set the interruptible floor carefully. Too low and you'll attract bidders who barely cover your costs; too high and you lose to lower-priced competitors.
- Maintain reliability. Hosts with high uptime command price premiums. The investment in UPS, monitoring, and proactive maintenance pays back in higher hourly rates.
- Monitor DLPerf and tune. Poor benchmarks reflect thermal throttling, PCIe issues, or other fixable problems. Diagnose and fix them.
- Diversify your renter base. Avoid locking into long-term reserved deals where you can; the spot market often pays better in periods of high demand.
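One way to reason about the interruptible floor is to start from a break-even hourly cost and add a margin. The model below is a rough sketch under stated assumptions, not a formula Vast prescribes: it counts only hardware amortization and electricity, charges power only for rented hours, and every input in the example (purchase price, amortization period, utilization, power draw, electricity rate, margin) is hypothetical.

```python
HOURS_PER_YEAR = 8760

def break_even_rate(hardware_cost, amortization_years, utilization,
                    power_kw, electricity_per_kwh):
    """Hourly cost to recover, spread over the hours actually rented."""
    rented_hours = amortization_years * HOURS_PER_YEAR * utilization
    hardware_per_hour = hardware_cost / rented_hours
    power_per_hour = power_kw * electricity_per_kwh
    return hardware_per_hour + power_per_hour

def interruptible_floor(break_even, margin=0.10):
    """Floor set a fixed fractional margin above break-even."""
    return break_even * (1 + margin)

# Hypothetical H100: $25,000 purchase, 3-year amortization, rented 70% of
# the time, drawing 0.7 kW at $0.12 per kWh.
cost = break_even_rate(25_000, 3, 0.70, 0.7, 0.12)
floor = interruptible_floor(cost, margin=0.10)
```

With these inputs the break-even comes to roughly $1.44 per rented hour. A floor well below that figure attracts bids that do not cover costs, and a floor far above it loses renters to cheaper hosts.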
Takeaway
Vast's pricing is real market pricing. The bid mechanic and the on-demand/interruptible split give the platform a degree of price discovery that's unusual in compute markets. The 60-80% discount to hyperscalers is structural, not promotional, and is the single biggest reason customers come.
The next chapter covers the hardware tiers in detail — which GPUs you'll actually find on Vast and the quality variation across them.