Section B · Operations

Product Surface

Pods, serverless endpoints, persistent volumes, templates, the CLI, the API. RunPod's developer-experience-led identity shows up most concretely here.

Pods

In RunPod terminology, a "pod" is a running container with one or more attached GPUs. It is the fundamental product unit. A pod is created with:

  • A container image (RunPod-provided templates or your own).
  • A GPU type and count.
  • A region / cloud-type (Community or Secure).
  • Optionally, attached persistent volumes for storage that survives pod termination.

Pods boot in seconds to minutes, depending on the template and the size of the image. The pod model is friendlier than raw VM provisioning: with a template there is no OS installation, no driver configuration, and no PyTorch setup. Users go from "I need a GPU" to "I have a running Jupyter notebook" in under five minutes for most workflows.

Serverless

RunPod Serverless is a managed inference and function service. Customers deploy a model or a function; RunPod runs it on demand, with sub-minute cold starts and usage-based billing instead of a flat charge for an always-on pod.

Use cases:

  • Inference APIs that need to scale to zero between requests.
  • Workloads where you don't want to keep a pod running 24/7.
  • Spiky traffic patterns where dedicated capacity would be wasteful.
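
The scale-to-zero argument is easiest to see as arithmetic. The sketch below compares an always-on pod with serverless billing, assuming the serverless cost is proportional to the seconds spent serving requests. All rates are made-up placeholders chosen for illustration, not RunPod prices, and the model ignores cold starts and idle worker time.

```python
def monthly_pod_cost(hourly_rate: float, hours: float = 730.0) -> float:
    """Cost of keeping one pod running for a whole month (about 730 hours)."""
    return hourly_rate * hours


def monthly_serverless_cost(requests: int, seconds_per_request: float,
                            per_second_rate: float) -> float:
    """Cost when billed only for the seconds spent serving requests."""
    return requests * seconds_per_request * per_second_rate


def break_even_requests(hourly_rate: float, seconds_per_request: float,
                        per_second_rate: float, hours: float = 730.0) -> float:
    """Monthly request count above which the always-on pod becomes cheaper."""
    cost_per_request = seconds_per_request * per_second_rate
    return monthly_pod_cost(hourly_rate, hours) / cost_per_request


# Hypothetical rates, chosen only to illustrate the calculation.
POD_HOURLY = 0.50            # USD per hour for a dedicated pod
SERVERLESS_PER_SEC = 0.0004  # USD per second of serverless execution
SECONDS_PER_REQUEST = 2.0    # average time a request keeps a worker busy

pod_cost = monthly_pod_cost(POD_HOURLY)
light_cost = monthly_serverless_cost(50_000, SECONDS_PER_REQUEST, SERVERLESS_PER_SEC)
threshold = break_even_requests(POD_HOURLY, SECONDS_PER_REQUEST, SERVERLESS_PER_SEC)

print(f"always-on pod: ${pod_cost:.2f} per month")
print(f"serverless at 50,000 requests: ${light_cost:.2f} per month")
print(f"break-even near {threshold:,.0f} requests per month")
```

Under these assumed numbers, light or spiky traffic favors serverless, and sustained high volume favors a dedicated pod. Plugging in real rates and request durations moves the threshold, but the shape of the comparison stays the same.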

This is the product that competes most directly with Together AI, Fireworks AI, Modal, and Replicate. The user experience is similar: deploy a function, get an HTTP endpoint, pay for what you use.
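
The "deploy a function, get an HTTP endpoint" flow can be sketched in two halves: a handler that receives a job and returns a result, and a client request to the deployed endpoint. This is a minimal sketch using only the standard library. The handler shape (a dict with an "input" key) and the runsync URL pattern follow RunPod's public documentation as best understood here and should be checked against the current docs. The endpoint ID, API key, and the uppercase "model" are placeholders.

```python
import json
import urllib.request


def handler(job: dict) -> dict:
    """Serverless-style handler: takes a job dict, returns a JSON-serializable result."""
    prompt = job["input"].get("prompt", "")
    # Placeholder for real model inference.
    return {"output": prompt.upper()}


# With the runpod package installed, a worker would typically register the handler with:
#     runpod.serverless.start({"handler": handler})


def build_runsync_request(endpoint_id: str, api_key: str,
                          payload: dict) -> urllib.request.Request:
    """Build (but do not send) a synchronous request to a deployed endpoint."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Local check of the handler; no network or account needed.
result = handler({"input": {"prompt": "hello"}})

# Hypothetical endpoint ID and key. Sending would be urllib.request.urlopen(request).
request = build_runsync_request("my-endpoint-id", "my-api-key", {"prompt": "hello"})
```

Keeping the handler a plain function makes it testable locally before it is ever deployed.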

Serverless is also where RunPod's two-product synergy shows: the underlying compute can run on Community Cloud (cheap, variable) or Secure Cloud (reliable). Users mostly don't see the distinction; RunPod's orchestrator handles routing.

Persistent volumes

By default, a pod's local storage disappears when the pod is terminated. Persistent volumes (network-attached block storage) outlive the pod and can be attached to subsequent pods.

Useful for:

  • Training datasets you want to reuse across runs.
  • Model weights that exceed reasonable image sizes.
  • Checkpoints that need to outlive a single pod.

Pricing is separate from compute, typically a few cents per GB per month. Performance is that of network-attached storage, not local NVMe, so for performance-critical I/O you want to copy data to local storage at job start.
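
The copy-at-job-start pattern might look like the sketch below. It is an illustration, not RunPod tooling: the function name, the ".staged" marker file, and the directory layout are all assumptions, and temporary directories stand in for the real volume mount and local scratch paths.

```python
import shutil
import tempfile
from pathlib import Path


def stage_to_local(volume_dir: Path, local_dir: Path) -> Path:
    """Copy a dataset from a network volume to local disk, skipping the copy if already staged."""
    marker = local_dir / ".staged"
    if marker.exists():
        return local_dir
    # dirs_exist_ok lets a partially copied directory be completed on retry.
    shutil.copytree(volume_dir, local_dir, dirs_exist_ok=True)
    # Written last, so an interrupted copy is never mistaken for a finished one.
    marker.touch()
    return local_dir


# Demonstration with temporary directories standing in for the real mount points.
# On a pod these would be the volume mount path and a local scratch directory;
# both depend on the template and are assumptions to adapt.
with tempfile.TemporaryDirectory() as tmp:
    volume = Path(tmp) / "volume" / "dataset"
    volume.mkdir(parents=True)
    (volume / "train.txt").write_text("example rows\n")

    local = stage_to_local(volume, Path(tmp) / "scratch" / "dataset")
    staged_files = sorted(p.name for p in local.iterdir())
    staged_text = (local / "train.txt").read_text()
```

The training job then reads from the local copy, and checkpoints are written back to the volume so they survive the pod.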

Templates

Pre-built container images for common workloads:

  • PyTorch + Jupyter for general ML.
  • Stable Diffusion variants (Automatic1111, ComfyUI, etc.) for image generation.
  • Text-generation web UIs for LLM experimentation.
  • vLLM for inference deployment.
  • Specific model-server templates (Llama, Mistral, etc.).

Templates lower the barrier to entry. A user who couldn't set up CUDA + drivers + their framework themselves can still spin up a working pod from a template.

CLI & API

RunPod's CLI and HTTP API let users automate what the web UI does. For example, with the Python SDK:

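import os

import runpod

# Read the API key from the environment rather than hard-coding it.
runpod.api_key = os.environ["RUNPOD_API_KEY"]

pod = runpod.create_pod(
    name="training-run-42",
    image_name="runpod/pytorch:latest",
    gpu_type_id="NVIDIA GeForce RTX 4090",  # GPU type IDs can be listed with runpod.get_gpus()
    cloud_type="COMMUNITY",                 # or "SECURE"
    volume_in_gb=100,
)
# create_pod returns a dict describing the new pod, e.g. pod["id"].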

The CLI mirrors the API. Power users orchestrate dozens of pods programmatically. SDKs exist for Python and JavaScript, and integrations with workflow tools (Prefect, Dagster, etc.) are community-driven.
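
Orchestrating many pods usually reduces to a loop over one shared spec. The sketch below shows that shape with the pod-creation call injected as a parameter, so it runs without an account. The names launch_pods and fake_create_pod are hypothetical, and the assumption that the creation call returns a dict with an "id" key should be verified against the SDK version in use.

```python
from typing import Callable, List


def launch_pods(create_pod: Callable[..., dict], prefix: str, count: int, **spec) -> List[str]:
    """Create `count` pods from one shared spec and return their IDs."""
    pod_ids = []
    for i in range(count):
        pod = create_pod(name=f"{prefix}-{i:02d}", **spec)
        pod_ids.append(pod["id"])
    return pod_ids


def fake_create_pod(name: str, **spec) -> dict:
    """Stand-in for a real SDK call so the sketch runs offline."""
    return {"id": f"id-{name}", "name": name, **spec}


ids = launch_pods(
    fake_create_pod,
    prefix="sweep",
    count=3,
    image_name="runpod/pytorch:latest",
    gpu_type_id="NVIDIA GeForce RTX 4090",  # assumed GPU type ID; check what your account lists
    cloud_type="COMMUNITY",
)
print(ids)
```

In real use, the SDK's pod-creation function would be passed in place of fake_create_pod, and the returned IDs would be kept so the pods can be stopped or terminated when the work finishes.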

Networking

Pod networking varies by cloud type:

  • Community Cloud: Public IP, port forwarding to specific services. Variable bandwidth.
  • Secure Cloud: Public IP optional. Inter-pod networking within a region available. More consistent bandwidth.

Multi-pod distributed training is possible but constrained. Secure Cloud supports it within a single region with reasonable bandwidth. Community Cloud generally does not, because pods run on different providers' hardware with no high-speed interconnect.

Developer experience

The DX is the single biggest reason customers choose RunPod over Vast. Concretely:

  • Faster onboarding (template-driven).
  • More polished web UI.
  • Cleaner CLI / API surface.
  • Better documentation.
  • More active community and company-staffed support channels.
  • Serverless option that Vast doesn't have.

The trade-off: prices are typically a bit higher than Vast for the same nominal hardware, and the supply pool is smaller. Users who prioritize DX over absolute lowest price come to RunPod; users who prioritize price over polish often pick Vast.

Takeaway

RunPod's product surface is meaningfully richer than a pure marketplace — pods, serverless, volumes, templates, plus a polished web/API surface. That richness is what justifies the price premium over Vast and is the primary moat against marketplace competitors. The next chapter goes into pricing in detail.