Vendor pricing - Last verified July 2026

Modal GPU pricing 2026

Serverless GPU platform built on per-second billing and Python-native deployment primitives. Designed for inference and batch jobs rather than long-lived training clusters.

Pricing model: per-second

Cheapest published rate: $0.59 per GPU-hour (T4, hourly equivalent)

Public source: https://modal.com/pricing

Published rate card

Per-GPU, per-hour rates are taken from the vendor pricing page linked above, last verified July 2026. Rates exclude storage, egress, and any managed-service uplift. Reserved-capacity contracts typically improve on these rates.

GPU	Configuration	Per GPU per hour
H100	Per-second, hourly equiv	$3.950
A100 80GB	Per-second, hourly equiv	$2.500
A100 40GB	Per-second, hourly equiv	$2.100
L40S	Per-second, hourly equiv	$1.950
A10G	Per-second, hourly equiv	$1.100
T4	Per-second, hourly equiv	$0.590
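
Because billing is per-second, the hourly figures above are equivalents rather than the billed unit. The following is a minimal sketch of the conversion, assuming the per-second price is simply the hourly equivalent divided by 3,600. The function names and the 90-second job are hypothetical, and the result covers GPU time only.

```python
# Hourly-equivalent rates copied from the rate card above (USD per GPU-hour).
HOURLY_RATES = {
    "H100": 3.95,
    "A100 80GB": 2.50,
    "A100 40GB": 2.10,
    "L40S": 1.95,
    "A10G": 1.10,
    "T4": 0.59,
}

SECONDS_PER_HOUR = 3600


def per_second_rate(gpu: str) -> float:
    """Per-second price implied by the hourly-equivalent rate (assumed linear)."""
    return HOURLY_RATES[gpu] / SECONDS_PER_HOUR


def job_cost(gpu: str, seconds: float, gpu_count: int = 1) -> float:
    """GPU-only cost of a job; excludes storage, egress, and CPU time."""
    return per_second_rate(gpu) * seconds * gpu_count


if __name__ == "__main__":
    for gpu in HOURLY_RATES:
        print(f"{gpu}: ${per_second_rate(gpu):.6f} per GPU-second")
    # Hypothetical 90-second inference burst on a single H100.
    print(f"90 s on one H100: ${job_cost('H100', 90):.4f}")
```

Treat the output as a lower bound on the bill, since the items listed under hidden costs are metered separately.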

Hidden costs to watch

Per-second billing is GPU-time only; cold-start container build time is billed on CPU at a separate rate.
Outbound egress, persistent volumes, and dictionaries are metered on top of GPU time.
A Team plan is required to remove the free-tier rate limits.

What Modal is best for

Serverless inference, batch jobs, and ephemeral training, where paying per second from a cold start matters.

Worked example

Acme Vision Co. (illustrative example, not a real company) needs to train a 10-billion-parameter vision-language model on a fixed 8-GPU cluster for 30 days at a duty cycle of 18 hours per day.

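That workload is 8 GPUs × 30 days × 18 hours = 4,320 GPU-hours. At Modal's cheapest published rate of $0.59 per GPU-hour (the T4 rate), the run costs $2,549 for raw GPU compute, before storage, egress, and MLOps overhead. Add the typical 25 percent year-one uplift and the modelled spend is $3,186. Substitute the rate for the GPU class you actually need: at the H100 rate of $3.95, the same 4,320 GPU-hours come to $17,064 before uplift. Use the calculator on the homepage to model your own GPU class, cluster size, and duty cycle.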
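
The arithmetic in this example can be reproduced with a short script. This is a sketch under the example's own assumptions (8 GPUs, 30 days, 18 hours per day, a flat 25 percent uplift). The function name `modelled_spend` is hypothetical, and the rates are the hourly equivalents from the rate card.

```python
def modelled_spend(gpus, days, hours_per_day, rate_per_gpu_hour, uplift=0.25):
    """Return (gpu_hours, raw_compute_cost, cost_with_uplift) in USD.

    Covers GPU time only; storage, egress, and MLOps overhead are
    represented solely by the flat uplift factor, which is an assumption.
    """
    gpu_hours = gpus * days * hours_per_day
    raw = gpu_hours * rate_per_gpu_hour
    return gpu_hours, raw, raw * (1 + uplift)


if __name__ == "__main__":
    # Same workload priced at the T4 and H100 hourly-equivalent rates.
    for gpu, rate in {"T4": 0.59, "H100": 3.95}.items():
        hours, raw, total = modelled_spend(8, 30, 18, rate)
        print(f"{gpu}: {hours} GPU-hours, raw ${raw:,.0f}, with uplift ${total:,.0f}")
```

Changing `rate_per_gpu_hour` to another row of the rate card shows how strongly the headline figure depends on the GPU class chosen.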

Modal rates change frequently. Always obtain a vendor quote before purchase.