Section C · Economics & operations

Hardware & Tiers

What GPUs you'll actually find on Vast. The marketplace spans consumer cards up through datacenter-grade, with significant quality variation within each model class. Reading the variation is half the skill of using Vast well.

The hardware range

Vast lists essentially every GPU that's profitable to rent at marketplace prices. As of 2026, the typical search shows:

  • Consumer NVIDIA: RTX 3060, 3070, 3080, 3090, 4070, 4080, 4090, 5090 (as the new gen arrives). Variable VRAM (8GB up to 32GB).
  • Workstation NVIDIA: RTX A4000, A5000, A6000, RTX 6000 Ada, RTX PRO 6000 Blackwell.
  • Datacenter NVIDIA: V100, A100 (40GB and 80GB), L40S, H100 (80GB), H200, B100/B200 in increasing quantity.
  • AMD (limited): Some MI300X instances are appearing as AMD's ROCm stack matures.

No Apple Silicon, no Google TPU, no Cerebras / Graphcore exotics. The market is overwhelmingly NVIDIA CUDA. Even the AMD instances are an early experiment that hasn't reached significant scale on Vast.

Consumer cards

The consumer-card category is what makes Vast unique among GPU clouds. Hyperscalers don't sell consumer cards — they only offer datacenter-tier SKUs. Vast's supply structurally includes consumer hardware because hobbyists and small operators built fleets around it.

RTX 3090 (24GB)

The workhorse consumer card on Vast. 24GB of GDDR6X VRAM makes it usable for fine-tuning sub-13B models with quantization. Cheap ($0.20-0.30/hour). Massive supply. Most popular Vast SKU by hours rented.

RTX 4090 (24GB)

Faster than the 3090 for many workloads (notable improvements in attention-heavy operations). Same 24GB VRAM. Comes at a small price premium over the 3090. Good supply.

RTX 5090 (32GB, when listed)

NVIDIA's 50-series consumer flagship has been available since early 2025. 32GB of VRAM at consumer-card prices changes the math for some workloads: a single 5090 fits some models that previously required dual-card setups on 3090/4090. Still ramping on Vast as providers acquire stock.

What consumer cards can't do

Most have no NVLink at all; the 3090 supports only a two-card NVLink bridge, and not every 3090 host has one fitted. They don't have ECC memory. Inter-GPU traffic otherwise runs over PCIe, which is irrelevant for single-card workloads but poor for multi-GPU model parallelism. They're tuned for gaming workloads, which means thermal management under sustained compute load varies by chassis.

For single-card training, fine-tuning, and inference of small-to-medium models, they're excellent value. For multi-card training of large models, look at datacenter cards.

Datacenter cards

A100 80GB

The previous-generation datacenter workhorse. 80GB of HBM2e VRAM (a 40GB variant also exists). NVLink for multi-card setups. Vast's largest datacenter-card listing population, often available at $1-1.40/hour vs $4-5/hour at hyperscalers. The price/performance sweet spot for many workloads.

H100 80GB

The 2023-2025 frontier card. HBM3 VRAM. Significantly faster than A100 for transformer workloads thanks to the Transformer Engine and FP8 support. NVLink 4. Vast supply is real but thinner than A100. $2-3/hour typically.

H200

H100 with more and faster VRAM (141GB HBM3e). Same compute throughput; the larger, higher-bandwidth memory enables larger models or larger batch sizes. Came to market in 2024. Vast supply is growing.

B100 / B200 (Blackwell)

NVIDIA's 2024-2026 flagship. Significant generational leap, particularly for FP4/FP6 inference and very large models. CoreWeave and Crusoe got the first big allocations; Vast supply is sparse and growing slowly as the GPUs roll downstream.

L40S

A datacenter card oriented toward inference and graphics workloads. 48GB VRAM. A different value proposition from the H100: cheaper, with less raw FP16 throughput, and better suited to inference and visual computing. Decent Vast supply.

Multi-GPU instances

Vast lists multi-GPU instances when providers configure them. The typical multi-GPU listing is 2x, 4x, or 8x of the same card model in a single machine.

What to know:

  • Inter-GPU bandwidth varies. A "4x H100" instance might have NVLink (great for multi-GPU training) or might be 4 cards on PCIe with no NVLink (much weaker for multi-GPU training).
  • Multi-node training is rare. Vast instances are single-host. If you need 16+ GPUs across multiple machines with high-bandwidth interconnect (InfiniBand), Vast can't deliver. Use CoreWeave or Crusoe.
  • Memory-constrained workloads benefit most. 8x H100 with 640GB total VRAM lets you train models that don't fit on a single card. The compute throughput scales sub-linearly for many workloads but the memory unlocks new workload classes.
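
A back-of-the-envelope sketch of the memory arithmetic behind that last point. It assumes mixed-precision training with Adam (a common rule of thumb of about 16 bytes per parameter for weights, gradients, and optimizer state), assumes the state is sharded evenly across the cards (for example with FSDP or ZeRO stage 3), and reserves an arbitrary 20% of VRAM for activations. The function names and the reserve fraction are illustrative, not anything Vast provides.

```python
# Rule-of-thumb bytes per parameter for mixed-precision training with Adam:
# 16-bit weights (2) + 16-bit gradients (2) + fp32 master weights (4)
# + fp32 Adam momentum (4) + fp32 Adam variance (4) = 16 bytes.
# Activations are NOT included; they depend on batch size and sequence length.
BYTES_PER_PARAM_ADAM_MIXED = 2 + 2 + 4 + 4 + 4


def training_state_gb(num_params: float) -> float:
    """Approximate weights + gradients + optimizer state, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM_ADAM_MIXED / 1e9


def max_params_for_vram(total_vram_gb: float, activation_reserve: float = 0.2) -> float:
    """Largest parameter count whose training state fits after setting aside
    a fraction of VRAM for activations (the 20% default is a placeholder)."""
    usable_bytes = total_vram_gb * 1e9 * (1.0 - activation_reserve)
    return usable_bytes / BYTES_PER_PARAM_ADAM_MIXED


if __name__ == "__main__":
    print("7B model, training state (GB):", training_state_gb(7e9))
    print("8x 80GB, max params (billions):", max_params_for_vram(8 * 80) / 1e9)
```

Under these assumptions a 7B-parameter model needs about 112 GB of training state, already more than one 80GB card, while 8x 80GB holds the state for roughly a 32B-parameter model. Quantized or parameter-efficient fine-tuning changes the per-parameter cost substantially, so treat the constant as a starting point.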

Variation within a model

The hardest thing to internalize about Vast is that two listings labeled "H100" can perform very differently. Reasons:

  • Host system. The CPU, RAM, motherboard, and PCIe topology around the GPU matter. A GPU with a PCIe x4 connection to the CPU performs worse on transfer-heavy workloads (host-to-device copies, data loading, multi-GPU traffic over PCIe) than the same GPU at x16.
  • Thermal headroom. A GPU in a well-cooled chassis runs at its boost clocks; one in a hot environment thermally throttles down. Sustained workload performance can differ by 10-20%.
  • Driver and CUDA stack. Providers run different driver versions. Cutting-edge model code might require a specific minimum CUDA version.
  • Networking. Some hosts have 10 Gbps; others have 100 Mbps residential connections. Bandwidth-heavy workloads (large dataset downloads, distributed training) need to filter on this.
  • Storage. NVMe vs SATA SSD vs spinning disk. Matters for data loading.
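
Some of these factors can be checked directly once an instance is running. The sketch below queries `nvidia-smi` for the driver version, current PCIe link width, and GPU temperature, then flags anything suspicious. The query fields are standard `nvidia-smi --query-gpu` fields; the x16 and 80 C thresholds and the helper names are assumptions chosen for illustration. Temperature is only meaningful when read while the workload is running.

```python
import csv
import io
import subprocess
from typing import Dict, List

# nvidia-smi query fields; the thresholds further down are arbitrary.
QUERY_FIELDS = ["name", "driver_version", "pcie.link.width.current", "temperature.gpu"]


def read_nvidia_smi() -> str:
    """Run nvidia-smi on the rented instance and return one CSV row per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_gpus(csv_text: str) -> List[Dict[str, str]]:
    """Turn the CSV text into one dict per GPU, keyed by query field."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return [dict(zip(QUERY_FIELDS, (v.strip() for v in row))) for row in rows if row]


def host_warnings(gpu: Dict[str, str], min_width: int = 16, max_temp_c: int = 80) -> List[str]:
    """Flag a narrow PCIe link or a hot GPU. Both thresholds are assumptions."""
    warnings = []
    width = int(gpu["pcie.link.width.current"])
    temp = int(gpu["temperature.gpu"])
    if width < min_width:
        warnings.append("PCIe link is x%d, below x%d" % (width, min_width))
    if temp >= max_temp_c:
        warnings.append("GPU at %d C, at or above %d C" % (temp, max_temp_c))
    return warnings
```

On an instance you would call `parse_gpus(read_nvidia_smi())` and pass each entry to `host_warnings`. The sketch does not handle fields that `nvidia-smi` reports as unavailable, and it says nothing about storage or network speed, which need their own checks.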

Vast's DLPerf score is supposed to capture some of this variation. It's an imperfect signal but a useful one: a 30% DLPerf gap between two listings of the same nominal card usually reflects a real performance difference.
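
One way to put the score to work is to compare listings on DLPerf per dollar rather than on price alone. The sketch below uses a hypothetical `Listing` record with made-up scores and prices; the field names are illustrative and are not taken from Vast's API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Listing:
    listing_id: str          # hypothetical identifier
    gpu: str
    dlperf: float            # DLPerf score as shown on the listing
    price_per_hour: float    # USD per hour


def dlperf_gap(a: Listing, b: Listing) -> float:
    """Relative gap between two scores, measured against the higher one."""
    hi, lo = max(a.dlperf, b.dlperf), min(a.dlperf, b.dlperf)
    return (hi - lo) / hi


def rank_by_value(listings: List[Listing]) -> List[Listing]:
    """Sort listings by DLPerf per dollar-hour, best value first."""
    return sorted(listings, key=lambda l: l.dlperf / l.price_per_hour, reverse=True)


if __name__ == "__main__":
    # Made-up numbers: two listings with the same nominal card.
    fast = Listing("fast-host", "H100", dlperf=100.0, price_per_hour=2.50)
    slow = Listing("slow-host", "H100", dlperf=70.0, price_per_hour=2.00)
    print("gap:", dlperf_gap(fast, slow))
    print("best value:", rank_by_value([slow, fast])[0].listing_id)
```

With these invented numbers the cheaper listing is the worse deal: it scores 30% lower while costing only 20% less, so the more expensive host ranks first on value.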

Tier distribution

By rough count of available listings at any given moment (illustrative):

Tier                       | Typical share of supply | Use case
Consumer (3090/4090)       | ~50%                    | Indie ML, fine-tuning, inference
Workstation (A6000, etc.)  | ~10%                    | Larger fine-tunes, prosumer workloads
Datacenter (A100, H100)    | ~35%                    | Serious training, batch inference
Bleeding edge (H200, B200) | ~5%                     | Frontier workloads

The shape will continue to shift as new generations arrive. The mid-tier consumer cards will remain the supply backbone because that's where Vast's competitive advantage is strongest.

How to select

A practical filter strategy for Vast's search UI:

  1. GPU model. Start with what you need. Don't over-spec — an A100 is plenty for most non-frontier workloads.
  2. VRAM minimum. Your model's weights plus, for training, gradients, optimizer state, and activations.
  3. DLPerf minimum. Set a floor to filter out underperforming listings.
  4. Reliability score minimum. 95%+ for production-ish work, lower thresholds OK for resumable batch.
  5. Bandwidth. 1 Gbps+ for normal workloads; higher if you're moving large datasets.
  6. Inet ping / region. Lower latency matters if you're running interactive workloads or driving inference from a specific region.
  7. Price ceiling. Your budget per GPU-hour.
  8. On-demand vs interruptible. Pick based on workload checkpoint-ability.

Save the search; the marketplace updates continuously and you may want to re-query when prices shift.
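
The filter strategy above can also be sketched as code, which helps when you want to apply the same criteria repeatedly to exported or scraped listing data. The `Offer` and `Criteria` records below are hypothetical, with invented field names and sample values; they do not mirror Vast's actual search API. The ping/region step is left out for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Offer:
    gpu: str
    vram_gb: int
    dlperf: float
    reliability: float       # 0.0 to 1.0
    inet_down_mbps: float
    price_per_hour: float    # USD per GPU-hour
    interruptible: bool


@dataclass
class Criteria:
    gpu: str
    min_vram_gb: int
    min_dlperf: float
    min_reliability: float = 0.95        # the 95% floor from step 4
    min_inet_down_mbps: float = 1000.0   # the 1 Gbps floor from step 5
    max_price_per_hour: float = float("inf")
    allow_interruptible: bool = False


def matches(offer: Offer, c: Criteria) -> bool:
    """Apply the filters in the same order as the checklist."""
    return (
        offer.gpu == c.gpu
        and offer.vram_gb >= c.min_vram_gb
        and offer.dlperf >= c.min_dlperf
        and offer.reliability >= c.min_reliability
        and offer.inet_down_mbps >= c.min_inet_down_mbps
        and offer.price_per_hour <= c.max_price_per_hour
        and (c.allow_interruptible or not offer.interruptible)
    )


def select(offers: List[Offer], c: Criteria) -> List[Offer]:
    """Return the matching offers, cheapest first."""
    return sorted((o for o in offers if matches(o, c)), key=lambda o: o.price_per_hour)
```

Sorting by price only after every floor has been applied is deliberate: the cheapest listing overall is often one that fails the DLPerf or reliability threshold.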

Takeaway

Vast's hardware range is broader than any other GPU cloud's, from $0.15/hour gaming cards to bleeding-edge B200 instances. The variation within each tier is real and matters. The best Vast users learn to read the signals (DLPerf, reliability, bandwidth) and pick listings that match their workload's specific needs, not just the headline GPU model.

The next chapter goes deep on trust, verification, and networking — the systems that make heterogeneous supply usable.