Dynamic GPU Cloud

Full GPU performance.
Half the price.

Dynamic gives you the same peak performance and VRAM as a dedicated card through smart, scheduler-enforced multi-tenancy. Spin up in under 5 minutes, pay by the hour, scale down anytime.

Start building · See pricing

No credit card to start · launch in under 5 min · hourly billing

SCHEDULER · us-west-2 · 91% utilized

inference 88% · fine-tune 72% · training 94% · eval 61%

4 tenants · 1 GPU · zero contention

±2–5% latency vs. dedicated
<5 min signup to SSH
50–65% cheaper than hyperscalers
99.9% monthly uptime

Available silicon

Shared GPUs, full-card performance.

Every Dynamic GPU delivers the same peak performance and VRAM as its dedicated counterpart. You only pay for the cycles your workload uses.

L40S · 48GB GDDR6 · Ada Lovelace · Launching soon · Notify me
RTX 6000 Pro · 96GB GDDR7 · Blackwell · from $0.66/GPU-hr · Deploy
A100 80GB · 80GB HBM2e · Ampere · Launching soon · Notify me
B200 · New · 192GB HBM3e · Blackwell · from $3.75/GPU-hr · Join waitlist
RTX 5090 · 32GB GDDR7 · Blackwell · Launching soon · Notify me
H100 SXM · 80GB HBM3 · Hopper · Launching soon · Notify me
H200 · 141GB HBM3e · Hopper · Launching soon · Notify me
RTX 4090 · 24GB GDDR6X · Ada Lovelace · Launching soon · Notify me

Hourly rates shown · monthly commits save up to 20% · See full pricing →
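As a rough worked example of these rates, the sketch below estimates a monthly bill for one GPU running around the clock. It rests on assumptions not stated on this page: a 730-hour month, and the full 20% commit discount. Since the page says "up to 20%", the actual saving may be lower. The function name `monthly_cost` is hypothetical.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)

# "from" hourly rates quoted on this page, in USD per GPU-hour
RATES = {"RTX 6000 Pro": 0.66, "B200": 3.75}


def monthly_cost(rate_per_hour, hours=HOURS_PER_MONTH, commit_discount=0.0):
    """Estimated monthly cost in USD for one GPU at an hourly rate."""
    if not 0.0 <= commit_discount <= 0.20:
        raise ValueError("the page quotes commit savings of up to 20%")
    return round(rate_per_hour * hours * (1.0 - commit_discount), 2)


for sku, rate in RATES.items():
    on_demand = monthly_cost(rate)
    committed = monthly_cost(rate, commit_discount=0.20)  # best case only
    print(f"{sku}: ${on_demand:,.2f} hourly, ${committed:,.2f} with the maximum commit discount")
```

A bursty workload that runs only a few hours a day would pass a smaller `hours` value, which is where hourly billing tends to matter most.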

How it works

From terminal to training in three steps.

Pick a GPU

Choose any SKU from the live rate card — L40S to B200. No quota requests, no sales calls.

Launch in under 5 min

API, CLI, or one-click. CUDA preinstalled, SSH-ready, persistent storage attached.

Scale up & scale down

Burst to more GPUs when you need them, spin down when you don't. Billing stops the moment you do.

Why Dynamic

The performance of dedicated, the economics of shared.

Same peak performance

The scheduler co-locates workloads that stress different GPU dimensions, so yours never contends. p99 latency stays within ±2–5% of a dedicated card.
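To make the co-location idea concrete, here is a toy sketch, not Dynamic's actual scheduler. It assumes two resource dimensions (compute and memory bandwidth), expresses each workload's demand as a fraction of one GPU, and uses simple first-fit placement. The dimensions, the numbers, and every name in the code are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    compute: float    # hypothetical share of compute demanded, 0..1
    bandwidth: float  # hypothetical share of memory bandwidth demanded, 0..1


def fits(placed, candidate, limit=1.0):
    """True if adding candidate keeps every dimension within the limit."""
    compute = sum(w.compute for w in placed) + candidate.compute
    bandwidth = sum(w.bandwidth for w in placed) + candidate.bandwidth
    return compute <= limit and bandwidth <= limit


def place(workloads):
    """First-fit packing: each workload joins the first GPU where it fits."""
    gpus = []
    for w in workloads:
        for gpu in gpus:
            if fits(gpu, w):
                gpu.append(w)
                break
        else:  # no existing GPU had room in every dimension
            gpus.append([w])
    return gpus


# Illustrative demands only, not measurements.
jobs = [
    Workload("train-a", compute=0.80, bandwidth=0.20),  # compute-bound
    Workload("infer-b", compute=0.15, bandwidth=0.70),  # bandwidth-bound
    Workload("train-c", compute=0.70, bandwidth=0.30),  # compute-bound
]
for i, gpu in enumerate(place(jobs)):
    print(f"GPU {i}: {[w.name for w in gpu]}")
```

In this sketch the compute-bound and bandwidth-bound jobs share a card because their demands are complementary, while the second compute-bound job lands on another GPU instead of contending with the first.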

Scheduler-enforced isolation

Hardware-level memory and compute partitioning. Your data and your model never touch another tenant.

Hot migration

Workloads move between hosts with no reboot — typically under 100 ms. You never notice a re-placement.

Hourly billing

Pay by the hour with per-second metering under the hood. No minimums, no platform fee, no egress surprises.
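One plausible reading of "hourly rates, per-second metering" is that a job is charged its hourly rate prorated by the seconds it ran. The sketch below works that arithmetic under stated assumptions: no minimum charge (as the page says), and rounding to the nearest cent, which is a guess at the billing rule. The function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def metered_charge(rate_per_hour: str, seconds_used: int) -> Decimal:
    """Prorate an hourly USD rate over the seconds actually used."""
    if seconds_used < 0:
        raise ValueError("seconds_used must be non-negative")
    charge = Decimal(rate_per_hour) * seconds_used / Decimal(3600)
    # Assumption: charges are rounded to the nearest cent.
    return charge.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A 17-minute run on a B200 at the page's $3.75/GPU-hr rate.
print(metered_charge("3.75", 17 * 60))
```

Passing the rate as a string keeps the arithmetic in exact decimal form, which avoids binary floating-point drift in currency values.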

Developer-grade tooling

API, CLI, web terminal, and SSH. CUDA, drivers, and common frameworks preinstalled on every image.

US & EU regions

Capacity across California, Virginia, Texas, Oregon, Frankfurt, Amsterdam, Paris, London, and Dublin.

Built for

Made for iterative AI work.

Fine-tuning

Short, bursty runs that don't justify a reserved card.

Spin up per job
Stop billing instantly
30–40% cheaper than dedicated

Evaluation & batch

Schedulable, parallelizable inference at scale.

Fan out across GPUs
No idle reservation cost
OpenAI-compatible endpoints

Agents & dev

Interactive notebooks and always-on agent loops.

Instant launch
Persistent volumes
Hot-migrate to Dedicated later

FAQ

Dynamic GPU, answered.

For anything not here, reach help@packet.ai.

Is Dynamic actually as fast as Dedicated?

For the vast majority of workloads, yes. p99 latency stays within ±2–5% of a dedicated card on the same SKU. For strict benchmarks or regulated production, we recommend Dedicated.
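To check this figure against your own workload, one approach is to replay the same request trace on a Dynamic and a Dedicated card and compare the p99 of the recorded latencies. The sketch below assumes you already have both lists of latencies in milliseconds. It uses the nearest-rank percentile definition, which is one of several common conventions, and the function names are hypothetical.

```python
def p99(latencies_ms):
    """99th percentile by the nearest-rank method."""
    ordered = sorted(latencies_ms)
    if not ordered:
        raise ValueError("need at least one latency sample")
    # ceil(0.99 * n) in integer arithmetic, as a 1-based rank
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def p99_delta_pct(dynamic_ms, dedicated_ms):
    """Relative p99 difference of Dynamic vs. Dedicated, in percent."""
    baseline = p99(dedicated_ms)
    return 100.0 * (p99(dynamic_ms) - baseline) / baseline
```

A positive result means Dynamic's tail latency was higher than Dedicated's on that trace. With few samples the p99 is dominated by a handful of requests, so larger traces give a more trustworthy comparison.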

How is my workload isolated from others?

Hardware-level memory and compute partitioning, enforced by the scheduler. Tenants never share address space, and the scheduler co-locates only workloads that stress different GPU dimensions.

What does it cost?

Dynamic starts at $0.54/GPU-hr (L40S) and runs to $3.75/GPU-hr (B200). Monthly commits save up to 20%. Full rate card →

How fast can I get a GPU?

Under 5 minutes from API call to SSH-ready, with CUDA and drivers preinstalled.
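If you script launches, a simple client-side readiness check is to poll the instance's SSH port until it accepts a TCP connection. This is a generic sketch using Python's standard library, not an official client. The 300-second default mirrors the 5-minute figure above. An open port shows only that something is listening, not that login will succeed.

```python
import socket
import time


def wait_for_port(host, port=22, timeout_s=300.0, interval_s=2.0):
    """Poll until host:port accepts a TCP connection; False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Cap each attempt so one slow connect cannot overrun the deadline.
            with socket.create_connection((host, port), timeout=min(interval_s, remaining)):
                return True
        except OSError:
            # Refused or timed out: wait briefly, then retry.
            time.sleep(min(interval_s, max(0.0, deadline - time.monotonic())))
```

Usage would look like `wait_for_port("203.0.113.10")`, where the address is a documentation placeholder standing in for whatever host your launch call returns.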

Can I move to a Dedicated card later?

Yes — workloads hot-migrate from Dynamic to Dedicated without a reboot, typically under 100 ms.

Launch a GPU
in under 5 minutes.

Most teams ship their first inference workload before their AWS quote comes back.

Start building → · See pricing

No credit card · hourly billing · cancel anytime
