Pricing models - Last verified July 2026

How GPU cloud vendors price compute in 2026

Four pricing models dominate GPU cloud in 2026. Most vendors offer two or three of them. The decision is rarely about which model is cheapest in the abstract; it is about which one matches your workload shape.

Pricing model

On-demand per-hour

The default for hyperscalers and most specialist clouds. The customer pays the published rate from the moment the instance is in a Running state until it is Stopped or Terminated. Rates vary by GPU class, region, and instance topology.

Upside

No commitment. Capacity provisioned on demand. Easy to model.

Trade-off

The list rate is the highest price the vendor charges for that GPU. No discount for sustained utilisation. Some H100 SKUs are intermittently unavailable on pure on-demand.

Used by: CoreWeave, Lambda, Crusoe, AWS, Azure, GCP, Hyperstack, Oracle, DigitalOcean, Paperspace
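The on-demand billing rule described above (the meter runs from Running until Stopped or Terminated) can be sketched as a small calculation. This is a minimal illustration, not any vendor's billing logic: the function name, the $3.00 rate, and the 8-GPU instance are hypothetical, and the sketch assumes no rounding to a billing increment, which real invoices may apply.

```python
from datetime import datetime, timedelta


def on_demand_cost(running_at: datetime, stopped_at: datetime,
                   rate_per_gpu_hour: float, gpu_count: int = 1) -> float:
    """Cost of one instance billed from Running until Stopped or Terminated.

    Assumes linear billing with no rounding (an assumption; vendors differ).
    """
    if stopped_at < running_at:
        raise ValueError("stopped_at must not precede running_at")
    hours = (stopped_at - running_at).total_seconds() / 3600
    return hours * rate_per_gpu_hour * gpu_count


# Hypothetical example: an 8-GPU instance at an assumed $3.00 per GPU-hour,
# left running for 90 minutes.
start = datetime(2026, 7, 1, 9, 0)
cost = on_demand_cost(start, start + timedelta(minutes=90), 3.00, gpu_count=8)
print(f"${cost:.2f}")  # prints $36.00
```

The point of the sketch is that idle time inside the Running window costs the same as busy time, which is what "no discount for utilisation" means in practice.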

Pricing model

Per-second (serverless)

Billing accrues per second of active GPU time. Optimised for inference and short-burst training where instance lifetime is measured in seconds or minutes. Depending on the vendor, container cold-start and warm-pool time are either billed separately or rolled into the per-second rate.

Upside

Pay nothing when no requests are in flight. Strong fit for spiky inference. No reservation needed.

Trade-off

Converted to an hourly equivalent, per-second rates typically carry a premium over the on-demand rate for the same GPU. Cold-start billing can surprise low-traffic deployments.

Used by: Modal, Replicate, RunPod Serverless
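To make the per-second premium concrete, the sketch below converts a per-second rate to its hourly equivalent and finds the utilisation at which serverless and on-demand cost the same. The rates are hypothetical placeholders, not quotes from any vendor, and the model ignores cold-start and warm-pool charges.

```python
def hourly_equivalent(rate_per_second: float) -> float:
    """Per-second rate expressed as a cost per fully busy hour."""
    return rate_per_second * 3600


def break_even_utilisation(rate_per_second: float,
                           on_demand_rate_per_hour: float) -> float:
    """Fraction of each hour the GPU must be busy before on-demand is cheaper.

    Below this fraction, paying per second costs less than keeping an
    on-demand instance running for the whole hour. Capped at 1.0.
    """
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return min(1.0, on_demand_rate_per_hour / hourly_equivalent(rate_per_second))


# Hypothetical rates: $0.0015 per second serverless, $3.00 per hour on-demand.
serverless_hourly = hourly_equivalent(0.0015)
threshold = break_even_utilisation(0.0015, 3.00)
print(f"hourly equivalent: ${serverless_hourly:.2f}")  # prints $5.40
print(f"break-even utilisation: {threshold:.0%}")      # prints 56%
```

Under these assumed numbers, a deployment that keeps the GPU busy less than roughly 56 percent of the time pays less on per-second billing, which is why the model suits spiky inference.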

Pricing model

Marketplace

Independent hosts list capacity at a price they choose; buyers bid or accept the floor. Vast.ai operates the marketplace and a settlement layer.

Upside

Floor rates are consistently the lowest in the category (Vast.ai H100 floor $1.49 per GPU-hour, June 2026).

Trade-off

Reliability and network speed vary by host. Interruptible instances can be reclaimed. No SLA tier.

Used by: Vast.ai
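Because interruptible instances can be reclaimed, the floor rate understates the cost of useful work. A rough sketch, assuming a hypothetical fraction of paid GPU-hours is lost to reclaims and reruns (the 20 percent figure is illustrative, not measured):

```python
def effective_rate(list_rate: float, lost_fraction: float) -> float:
    """Cost per useful GPU-hour when some paid hours are lost to interruptions.

    lost_fraction is the share of billed time that produced no usable work
    (a hypothetical input; measure it for your own workload).
    """
    if not 0 <= lost_fraction < 1:
        raise ValueError("lost_fraction must be in [0, 1)")
    return list_rate / (1 - lost_fraction)


# The $1.49 floor is the June 2026 figure cited above; 0.20 is an assumption.
rate = effective_rate(1.49, 0.20)
print(f"${rate:.2f} per useful GPU-hour")  # prints $1.86 per useful GPU-hour
```

Workloads that checkpoint frequently keep the lost fraction small, so the effective rate stays close to the floor.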

Pricing model

Reserved capacity

The customer commits to a defined cluster (often 8 to 1,024 H100 or H200 GPUs) for 1 to 36 months in exchange for a discount on the on-demand rate. The most common procurement shape for production training clusters.

Upside

Headline rate is 30 to 60 percent below the same vendor's on-demand. Capacity is guaranteed for the term.

Trade-off

You pay the reservation whether you use it or not. Cluster size is fixed; scaling up means a new contract. Exit fees can apply mid-term.

Used by: CoreWeave, Lambda Reserved Cloud, Together AI, Crusoe, DigitalOcean (12-mo), AWS Capacity Blocks for ML, Azure Reserved Instances, Hyperstack
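Since a reservation is paid whether or not it is used, the discount translates directly into a break-even utilisation: with a discount d off the on-demand rate, reserving is cheaper only when the cluster is busy more than 1 - d of the term. The sketch below applies that to the 30 to 60 percent range quoted above. It is a simplification that ignores exit fees and assumes the on-demand alternative is always available; the function names are illustrative.

```python
def reserved_break_even(discount: float) -> float:
    """Utilisation above which a reservation beats paying on-demand."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount


def cheaper_option(utilisation: float, discount: float) -> str:
    """Pick the cheaper model for a given expected utilisation of the term."""
    return "reserved" if utilisation > reserved_break_even(discount) else "on-demand"


# The two ends of the discount range cited in this section.
for d in (0.30, 0.60):
    print(f"{d:.0%} discount -> break-even at {reserved_break_even(d):.0%} utilisation")

choice_deep = cheaper_option(0.5, 0.60)     # busy half the term, 60% discount
choice_shallow = cheaper_option(0.5, 0.30)  # busy half the term, 30% discount
print(choice_deep, choice_shallow)
```

At a 30 percent discount the cluster must be busy more than 70 percent of the term to pay off; at 60 percent, more than 40 percent.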
