Pricing

Pay for compute.
Not commitments.

Three plans, seven NVIDIA SKUs, one transparent rate card. Dynamic delivers the same peak performance and VRAM as Dedicated, often at half the price.

Dynamic · Shared, same peak performance

Multi-tenant GPU with scheduler-enforced isolation. Same peak performance & VRAM as Dedicated. Hourly billing.

Dedicated · Whole card, exclusively yours

Single-tenant GPU committed to your account. Zero scheduler interference. Predictable performance. 99.99% SLA.

Clusters · Multi-node, flexible terms

Multiple nodes at wholesale pricing, around 30% below retail. Custom storage, named TAM, dedicated SLAs.

NVIDIA L40S

48GB GDDR6 · Ada Lovelace

US East · Virginia

Dynamic: launching soon · Dedicated: $0.92/hr · Clusters: custom (get a wholesale quote)

NVIDIA RTX 5090

32GB GDDR7 · Blackwell

US West · Oregon

Dynamic: launching soon · Dedicated: launching soon · Clusters: custom (get a wholesale quote)

NVIDIA RTX 4090

24GB GDDR6X · Ada Lovelace

US Central · Texas

Dynamic: launching soon · Dedicated: $0.39/hr · Clusters: custom (get a wholesale quote)

NVIDIA RTX 6000 Pro

96GB GDDR7 · Blackwell · Popular

US West · California

Dynamic: $0.66/hr · Dedicated: launching soon · Clusters: custom (get a wholesale quote)

NVIDIA A100 80GB

80GB HBM2e · Ampere

EU · Frankfurt

Dynamic: launching soon · Dedicated: $1.43/hr · Clusters: custom (get a wholesale quote)

NVIDIA H100 SXM

80GB HBM3 · Hopper · Coming soon

US East · Virginia

Dynamic: launching soon · Dedicated: launching soon · Clusters: custom (get a wholesale quote)

NVIDIA H200

141GB HBM3e · Hopper · Coming soon

US East · Virginia

Dynamic: launching soon · Dedicated: launching soon · Clusters: custom (get a wholesale quote)

NVIDIA B200

192GB HBM3e · Blackwell · New

US East · Virginia

Dynamic: $3.75/hr (waitlist) · Dedicated: $5.25/hr (waitlist) · Clusters: custom (get a wholesale quote)

Neocloud comparison

Same silicon.
Lower bill.

Side-by-side starting rates on the same NVIDIA silicon. packet.ai delivers the same peak performance and VRAM through smart scheduling, often at a fraction of what other neoclouds charge.

GPU

packet.ai
starts from

RunPod
starts from

Vast.ai
starts from

Lambda Labs
starts from

ShadeFarm
starts from

RTX 4090

24 GB GDDR6X

$0.39 · ~30% avg

$0.69

$0.35

$0.60

RTX 5090

32 GB GDDR7

$0.99

$0.41

$0.65

RTX 6000 Pro

96 GB GDDR7

$2.09

$1.00

$2.19

L40S

48 GB GDDR6

$0.92 · ~9% avg

$0.86

$0.47

$1.70

A100 80GB

80 GB HBM2e

$1.43 · ~23% avg

$1.49

$0.77

$1.99

$3.28

B200

192 GB HBM3e

$5.25 · ~11% avg

$5.89

$4.34

$6.99

$6.52

All prices in USD per GPU-hour, published starting rates. / = SKU not offered by that provider.
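
To make the "% avg" figures concrete, here is a minimal Python sketch that computes how far a packet.ai starting rate sits below the mean of the competitor starting rates. It uses only the A100 80GB and B200 rows, where all four competitor prices are listed. The function name `pct_below_average` is illustrative, and the reading of "% avg" as "percent below the competitor average" is an assumption; the page's own averaging or rounding may differ, so results can be off by a point or so.

```python
from statistics import mean


def pct_below_average(own_rate: float, competitor_rates: list[float]) -> float:
    """Return how far own_rate sits below the mean competitor rate, in percent."""
    avg = mean(competitor_rates)
    return (avg - own_rate) / avg * 100


# Starting rates in USD per GPU-hour, copied from the comparison table.
# Each entry: (packet.ai rate, [RunPod, Vast.ai, Lambda Labs, ShadeFarm]).
rates = {
    "A100 80GB": (1.43, [1.49, 0.77, 1.99, 3.28]),
    "B200": (5.25, [5.89, 4.34, 6.99, 6.52]),
}

for gpu, (own, others) in rates.items():
    print(f"{gpu}: {pct_below_average(own, others):.1f}% below competitor average")
```

A negative result would mean the rate is above the competitor average.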

Which plan should you choose?

Match the plan to the workload.

Mix and match per project. Most customers run Dynamic for dev and Dedicated for production on the same account, then move to Clusters when they outgrow single-node training.

Workload | Recommended | Why

Production inference (LLM API) | Dedicated | Predictable p99 latency · zero scheduler interference

Fine-tuning under 100 GPU-hours | Dynamic | Hourly billing · spin up in under 5 min · no commit

Evaluation / batch inference | Dynamic | Bursty workload, Dynamic is 30-40% cheaper

Distributed training · 64+ GPUs | Clusters | Requires InfiniBand fabric · multi-node NVLink topology

Frontier model pre-training | Clusters | 1,000+ GPU pool · custom storage tiering · named TAM

Always-on RAG / agent compute | Dedicated | Single-tenant card · committed monthly rate

Research notebooks · interactive dev | Dynamic | Hourly billing · spin down between sessions

Pricing FAQ

Questions teams ask before signing.

Real answers from our solutions team. For anything not here, reach help@packet.ai.

What is the difference between Dynamic and Dedicated?

Dedicated gives you the whole GPU card, exclusively yours. Single-tenant, predictable p99, ideal for production inference and regulated workloads. Dynamic is shared infrastructure with the same peak performance and VRAM; the packet.ai scheduler co-locates workloads that stress different GPU dimensions so latency stays within ±2-5% of dedicated, at roughly half the price.

How does packet.ai compare to RunPod, Vast.ai, Lambda Labs, and ShadeFarm?

On the same NVIDIA SKU, packet.ai starting rates are competitive with or below every major neocloud. An NVIDIA A100 80 GB is $1.43/GPU-hr on packet.ai vs $0.77-$3.28/GPU-hr across the others. The B200 starts at $3.75/GPU-hr on Dynamic ($5.25/GPU-hr on Dedicated) vs $4.34-$6.99 elsewhere. See the comparison table above for the full breakdown.

Do you offer monthly billing?

Yes. Monthly commits are up to 20% off the on-demand hourly rate. 6-month and 12-month terms include additional discounts (talk to sales). All commits include rollover credits for unused hours.
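
As a rough illustration of what a monthly commit could save, the sketch below compares on-demand and committed cost for one always-on GPU. It assumes the full 20% discount (the page says "up to" 20%, so actual discounts may be lower) and a 730-hour month; `monthly_cost` and `HOURS_PER_MONTH` are hypothetical names, not part of any packet.ai API. The $1.43/hr rate is the Dedicated A100 80GB rate from the rate card.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a calendar month


def monthly_cost(hourly_rate: float, discount: float = 0.0,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Cost in USD of running one GPU for `hours` at a discounted hourly rate."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be a fraction between 0 and 1")
    return round(hourly_rate * (1 - discount) * hours, 2)


on_demand = monthly_cost(1.43)                  # no commit
committed = monthly_cost(1.43, discount=0.20)   # assumed maximum monthly discount

print(f"On-demand: ${on_demand:,.2f}/month")
print(f"Committed: ${committed:,.2f}/month")
print(f"Savings:   ${on_demand - committed:,.2f}/month")
```

Rollover credits are not modeled here; they would only matter when the GPU runs fewer hours than committed.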

How fast can I deploy a GPU?

Dynamic-tier launches in under 5 minutes from API call to SSH-ready. Dedicated single-GPU launches in 5-10 minutes. Multi-node Cluster deployments take 2-6 weeks for InfiniBand cabling and validation.

What about cluster pricing?

Clusters use InfiniBand-connected multi-node fabric (up to 1,024+ GPUs). Pricing depends on fabric topology, node count, storage tier, and term. Typical 64-node B200 clusters land at $2.80-$3.20/GPU-hr on 12-month terms. Talk to the clusters team.

Are there hidden fees? Ingress, egress, storage?

No platform fees, no ingress. Egress is $0.04/GB (vs $0.09/GB on AWS). NVMe local storage included up to 2 TB per node. Object storage is $0.018/GB-month hot and $0.004/GB-month cold (S3-compatible).
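
The non-compute charges above can be estimated with a short calculation. This is a minimal sketch using the published per-GB rates; the function name `monthly_data_fees` and the example volumes are made up for illustration, and it assumes storage is billed on the amount held for a full month.

```python
EGRESS_PER_GB = 0.04        # USD per GB transferred out
HOT_PER_GB_MONTH = 0.018    # USD per GB-month, hot object storage
COLD_PER_GB_MONTH = 0.004   # USD per GB-month, cold object storage


def monthly_data_fees(egress_gb: float, hot_gb: float, cold_gb: float) -> dict[str, float]:
    """Estimate monthly egress and object-storage charges in USD."""
    fees = {
        "egress": egress_gb * EGRESS_PER_GB,
        "hot_storage": hot_gb * HOT_PER_GB_MONTH,
        "cold_storage": cold_gb * COLD_PER_GB_MONTH,
    }
    fees["total"] = sum(fees.values())
    # Round only at the end to avoid compounding rounding errors.
    return {name: round(value, 2) for name, value in fees.items()}


# Hypothetical month: 500 GB out, 2 TB hot, 10 TB cold (decimal units).
example = monthly_data_fees(egress_gb=500, hot_gb=2_000, cold_gb=10_000)
for name, value in example.items():
    print(f"{name}: ${value:.2f}")
```

Ingress and the included NVMe local storage (up to 2 TB per node) are free per the answer above, so they do not appear in the estimate.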

Which NVIDIA GPUs are available?

Published today: RTX 4090, RTX 5090, RTX 6000 Pro, L40S, A100 80GB, and B200. H100 SXM and H200 are launching soon — join the notify list. New SKUs roll in monthly.

Which regions do you serve?

Capacity is live in the United States and Europe: California, Virginia, Texas, Oregon, Frankfurt, Amsterdam, Paris, London, and Dublin. Use the region search above the rate card to filter. APAC capacity is rolling out in Q3.

Launch in under 5 minutes.
Or talk to a human.

Most teams ship their first inference workload before their AWS quote comes back.

No credit card · usage-based billing · cancel anytime
