Instances

An instance is one or more GPUs of a single SKU in a region, launched from an image.

Lifecycle

An instance moves through the states pending → provisioning → running → stopping → stopped → terminated. You can stop, reboot, resize, or terminate a running instance, and start a stopped one. The Control Plane drives state changes and reports them over webhooks, with polling available as a fallback.
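The progression above can be modeled as a small state machine. This is a minimal sketch, not the Control Plane's implementation: the names State, ALLOWED, can_transition, and apply_event are hypothetical, and it assumes a strictly forward progression (it does not model restarting a stopped instance). Because the same state may arrive through both a webhook and a poll, the sketch ignores any report that is not a valid next step.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    PROVISIONING = "provisioning"
    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"
    TERMINATED = "terminated"


# Order taken from the lifecycle description.
ORDER = [
    State.PENDING,
    State.PROVISIONING,
    State.RUNNING,
    State.STOPPING,
    State.STOPPED,
    State.TERMINATED,
]

# Assumption: each state may advance only to the next one in ORDER.
ALLOWED = {current: {nxt} for current, nxt in zip(ORDER, ORDER[1:])}
ALLOWED[State.TERMINATED] = set()  # terminal state, no way out


def can_transition(current: State, target: State) -> bool:
    """Return True if target is a valid next state after current."""
    return target in ALLOWED[current]


def apply_event(current: State, reported: str) -> State:
    """Apply a state reported by a webhook or a poll.

    Duplicate or out-of-order reports are ignored and the current
    state is kept. An unknown state string raises ValueError.
    """
    target = State(reported)
    return target if can_transition(current, target) else current
```

A consumer would call apply_event for every webhook payload and every poll result, so a late or repeated delivery cannot move the local view backwards.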

Pricing tiers

On-demand — per-GPU-hour, launch and terminate at will.
Reserved — 1/6/12-month commit, discounted.
Spot — interruptible, cheapest; expect a short preemption notice.
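To compare tiers, it can help to work the per-GPU-hour arithmetic explicitly. The sketch below is illustrative only: estimate_cost and its parameters are hypothetical, and the rate and discount are supplied by the caller because this page does not state actual prices or discount levels. Only the reserved terms (1, 6, or 12 months) come from the text above.

```python
# Reserved commitment lengths listed above, in months.
RESERVED_TERMS_MONTHS = (1, 6, 12)


def estimate_cost(gpus: int, hours: float, rate_per_gpu_hour: float,
                  discount: float = 0.0) -> float:
    """Estimate cost as gpus * hours * rate, less a fractional discount.

    discount is a fraction in [0, 1); 0.0 means the undiscounted rate.
    All inputs are caller-supplied placeholders, not published prices.
    """
    if gpus < 1:
        raise ValueError("an instance has at least one GPU")
    if hours < 0 or rate_per_gpu_hour < 0:
        raise ValueError("hours and rate must be non-negative")
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return gpus * hours * rate_per_gpu_hour * (1.0 - discount)


def is_valid_reserved_term(months: int) -> bool:
    """Check a commitment length against the listed reserved terms."""
    return months in RESERVED_TERMS_MONTHS
```

For example, with a placeholder rate of 2.0 per GPU-hour, estimate_cost(8, 24, 2.0) gives the one-day cost of an 8-GPU instance, and passing a discount shows what a reserved commitment would save at that rate.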

Telemetry

Each running instance streams GPU utilization, VRAM, temperature, power draw, NVLink throughput, and tokens/sec. Set alert thresholds to be notified when a metric moves outside its expected range.
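Threshold alerting on these metrics can be sketched as a simple comparison per sample. This is an assumption-laden illustration: the Sample fields, their units, and the breaches function are hypothetical names chosen for this example, not the product's telemetry schema. Upper limits suit metrics like temperature; lower limits suit metrics like tokens/sec, where a drop is the anomaly.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class Sample:
    """One telemetry reading. Field names and units are assumptions."""
    gpu_util_pct: float
    vram_used_gb: float
    temperature_c: float
    power_w: float
    nvlink_gbps: float
    tokens_per_sec: float


def breaches(sample: Sample,
             upper: Dict[str, float],
             lower: Optional[Dict[str, float]] = None) -> List[str]:
    """Return the names of metrics outside their configured thresholds.

    upper maps a metric name to its maximum allowed value;
    lower maps a metric name to its minimum allowed value.
    """
    values = asdict(sample)
    out = [name for name, limit in upper.items() if values[name] > limit]
    out += [name for name, floor in (lower or {}).items()
            if values[name] < floor]
    return out
```

A caller would run breaches on each incoming sample and send a notification whenever the returned list is non-empty.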

Spend caps

Set a per-org spend cap with a soft alert and a hard stop. When the cap is reached, new launches are paused automatically.
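The soft alert and hard stop behave like two thresholds on the same running total. The following is a minimal sketch under that assumption: SpendCap, status, and can_launch are hypothetical names, and the amounts are placeholders. It only models pausing new launches, as described above, and makes no claim about what happens to instances that are already running.

```python
from dataclasses import dataclass


@dataclass
class SpendCap:
    """Per-org spend cap with two thresholds (hypothetical model)."""
    soft_alert: float  # spend at which a notification is sent
    hard_stop: float   # spend at which new launches are paused

    def __post_init__(self) -> None:
        if not 0 <= self.soft_alert <= self.hard_stop:
            raise ValueError("need 0 <= soft_alert <= hard_stop")

    def status(self, spend: float) -> str:
        """Classify current spend as 'ok', 'soft_alert', or 'hard_stop'."""
        if spend >= self.hard_stop:
            return "hard_stop"
        if spend >= self.soft_alert:
            return "soft_alert"
        return "ok"

    def can_launch(self, spend: float) -> bool:
        """New launches are allowed until the hard stop is reached."""
        return self.status(spend) != "hard_stop"
```

Keeping the two thresholds as separate absolute amounts, rather than deriving the alert from a percentage of the cap, avoids rounding surprises at the boundary.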
