// FLEET
us-west-1: GB300 · liquid
eu-central-1: B300 · liquid
apac-sg-1: GB200 · NVL72
me-uae-1: VR200 · Rubin
// DOCS

Instances

An instance is one or more GPUs of a single SKU in a region, launched from an image.

Lifecycle

States progress in order: pending → provisioning → running → stopping → stopped → terminated. You can stop, reboot, resize, or terminate a running instance, and start a stopped one. State changes are driven by the control plane and delivered over webhooks, with polling as a fallback.
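As a rough illustration of the polling fallback, the sketch below encodes the documented state order and polls until an instance reaches a target state. The names State, has_reached, and wait_for_state are hypothetical, as is the fetch_state callable, which stands in for whatever API call returns the current state. None of them are part of a documented client.

```python
import time
from enum import Enum
from typing import Callable


class State(Enum):
    # Members are declared in the order given in the lifecycle text.
    PENDING = "pending"
    PROVISIONING = "provisioning"
    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"
    TERMINATED = "terminated"


# Enum iteration follows definition order, so this is the lifecycle order.
ORDER = list(State)


def has_reached(current: State, target: State) -> bool:
    """True once `current` is at or past `target` in the documented order."""
    return ORDER.index(current) >= ORDER.index(target)


def wait_for_state(
    fetch_state: Callable[[], State],  # hypothetical: returns the current state
    target: State,
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> State:
    """Poll `fetch_state` until the instance is at or past `target`."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_state()
        if has_reached(state, target):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"instance still {state.value} after {timeout_s}s")
        sleep(interval_s)
```

Because the check is "at or past", a caller waiting for running should inspect the returned state: an instance that was terminated in the meantime also ends the wait.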

Pricing tiers

  • On-demand: billed per GPU-hour; launch and terminate at will.
  • Reserved: 1-, 6-, or 12-month commitment at a discounted rate.
  • Spot: interruptible and cheapest; expect a short preemption notice.
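Per-GPU-hour billing reduces to simple arithmetic, sketched below. The function name estimate_cost, the rate, and the discount are placeholders chosen for illustration; actual rates and reserved or spot discounts are not stated here.

```python
def estimate_cost(
    gpus: int,
    hours: float,
    rate_per_gpu_hour: float,  # hypothetical rate, supplied by the caller
    discount: float = 0.0,     # hypothetical fractional discount, e.g. 0.25
) -> float:
    """Cost = GPUs x hours x hourly rate, reduced by an optional discount."""
    if gpus < 1 or hours < 0 or rate_per_gpu_hour < 0:
        raise ValueError("gpus must be >= 1; hours and rate must be >= 0")
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return gpus * hours * rate_per_gpu_hour * (1.0 - discount)
```

For example, 8 GPUs for 24 hours at an assumed 2.50 per GPU-hour comes to 480.0, and the same run with an assumed 25% discount comes to 360.0.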

Telemetry

Each running instance streams GPU utilization, VRAM usage, temperature, power draw, NVLink throughput, and tokens/sec. Set alert thresholds on these metrics to get notified of anomalies.
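A threshold check over one telemetry sample might look like the sketch below. The field names, units, and threshold values are assumptions made for illustration. They do not reflect the actual telemetry schema or recommended limits.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Sample:
    # Hypothetical field names and units for the streamed metrics.
    gpu_util_pct: float
    vram_used_gb: float
    temperature_c: float
    power_w: float
    nvlink_gbps: float
    tokens_per_sec: float


# Hypothetical thresholds: metric name -> ("max" or "min", limit).
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "temperature_c": ("max", 85.0),
    "power_w": ("max", 1000.0),
    "tokens_per_sec": ("min", 100.0),
}


def breaches(sample: Sample, thresholds: Dict[str, Tuple[str, float]] = THRESHOLDS) -> List[str]:
    """Return one message per metric that crosses its threshold."""
    found = []
    for metric, (kind, limit) in thresholds.items():
        value = getattr(sample, metric)
        too_high = kind == "max" and value > limit
        too_low = kind == "min" and value < limit
        if too_high or too_low:
            found.append(f"{metric}={value} violates {kind} {limit}")
    return found
```

An empty list means the sample is within every configured limit; each returned message could feed a notification.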

Spend caps

Set a per-org spend cap with a soft alert and a hard stop. When the cap is reached, new launches are paused automatically.
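The soft-alert and hard-stop behavior can be sketched as a small decision function. The names CapStatus, check_spend, and can_launch are hypothetical, and so is the 80% soft-alert fraction; the text does not say where the soft alert fires.

```python
from enum import Enum


class CapStatus(Enum):
    OK = "ok"
    SOFT_ALERT = "soft_alert"
    HARD_STOP = "hard_stop"


def check_spend(spend: float, cap: float, soft_fraction: float = 0.8) -> CapStatus:
    """Classify current org spend against the cap.

    `soft_fraction` is an assumed setting: the share of the cap at which
    the soft alert fires.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    if spend >= cap:
        return CapStatus.HARD_STOP
    if spend >= cap * soft_fraction:
        return CapStatus.SOFT_ALERT
    return CapStatus.OK


def can_launch(spend: float, cap: float) -> bool:
    """New launches are paused once the cap is reached."""
    return check_spend(spend, cap) is not CapStatus.HARD_STOP
```

With a cap of 1000, a spend of 850 yields a soft alert while launches still proceed, and a spend of 1000 pauses new launches.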