GPU compute, on tap.

On-demand, reserved, and spot access to NVIDIA GPU compute. On-demand, reserved, or spot — billed per GPU-second with the region multiplier applied at metering time.

Launch compute See pricing

// CAPABILITIES

Instances

Launch a GPU in a region from a template. Start, stop, resize, terminate. Cost-so-far meter.

Clusters

Multi-node NVLink / InfiniBand domains with shared volumes and chosen topology.

Managed endpoints

Deploy a vLLM or TensorRT-LLM endpoint to an HTTPS URL. Autoscale, scale-to-zero.

Storage

Persistent block volumes — attach, detach, snapshot — and S3-compatible buckets.

Networking

Private VPC, public IPs, firewall rules, and region peering.

Monitoring

GPU util, VRAM, temp, power, NVLink throughput, tokens/sec. Alert thresholds.

// AVAILABLE ON-DEMAND

// Hopper

H100 SXM

Available

The proven Hopper workhorse for inference and fine-tuning.

MEM: 80 GB HBM3
BW: 3.35 TB/s
NVLINK: Gen 4

$2.49/GPU-hrSpec →

// Hopper

H200 SXM

Available

141 GB HBM3e for long-context inference without the spill.

MEM: 141 GB HBM3e
BW: 4.8 TB/s
NVLINK: Gen 4

$3.59/GPU-hrSpec →

// Blackwell

B200

Limited

Blackwell training and inference. Supply-constrained — join the waitlist.

MEM: 192 GB HBM3e
BW: 8 TB/s
NVLINK: Gen 5

$5.89/GPU-hrSpec →

// FLAGSHIP

B300

Available

288 GB HBM3e. 15 PFLOPS dense FP4. Built for agentic AI.

MEM: 288 GB HBM3e
BW: 8 TB/s
NVLINK: Gen 5

$7.95/GPU-hrSpec →

// GET STARTED

Launch a GPU in minutes.

On-demand, reserved, or spot — billed per GPU-second with the region multiplier applied at metering time.

Launch compute Browse GPUs