GPU use cases

Run any AI workload.
On the right GPU.

Q: What can I run on packet.ai GPUs?

Effectively any GPU-accelerated workload: LLM training and inference, image and video generation, speech and audio, fine-tuning, reinforcement learning, embeddings and RAG, AI agents, 3D rendering, scientific simulation, and HPC. CUDA, drivers, and common frameworks (PyTorch, JAX, TensorFlow) are preinstalled on every image.

Q: Which product should I use for inference vs. training?

Use Dynamic for bursty, schedulable work — fine-tuning, batch inference, notebooks, and agents. Use Dedicated for production APIs that need a 99.99% SLA and predictable p99 latency. Use Clusters for multi-node distributed training across hundreds of interconnected GPUs.

Q: What GPUs are available?

NVIDIA RTX 5090, L40S, RTX 6000 Pro, A100 80GB, and B200, with H200, H100 SXM, and B300 available for clusters. Dynamic starts at $0.54/GPU-hr and Dedicated at $0.59/GPU-hr.

Q: How fast can I get a GPU running?

Under five minutes from signup to an SSH-ready GPU on Dynamic, with CUDA and drivers preinstalled. Dedicated single GPUs provision in 5–10 minutes; multi-node clusters take 2–6 weeks to cable and validate.

Q: Can I serve an OpenAI-compatible API?

Yes. packet.ai exposes OpenAI-compatible inference endpoints, so you can point existing SDKs and tooling at your own deployment with minimal changes.

Q: Is packet.ai suitable for regulated or production workloads?

Yes. Dedicated GPUs are single-tenant with a 99.99% uptime SLA backed by service credits, plus a DPA, audit support, and EU data residency for compliance-sensitive workloads.

From a single fine-tuning run to a 1,024-GPU training fabric, packet.ai matches every workload to the right NVIDIA silicon — shared, dedicated, or clustered. Launch in under five minutes, pay by the hour, scale anytime.

Explore use cases See pricing

17+

supported workloads

<5 min

signup to SSH

NVIDIA GPU types

99.99%

uptime SLA, dedicated

Inference & serving

Serve models to production traffic.

Low-latency, OpenAI-compatible inference on the latest NVIDIA silicon — scale from a single GPU to thousands without re-architecting.

LLM text generation

Serve Llama, Mistral, Qwen, and DeepSeek with vLLM or TGI behind an OpenAI-compatible endpoint. Token streaming, continuous batching, and autoscaling built in.

DynamicA100 · B200

Explore Dynamic

Image & video generation

Run Stable Diffusion, SDXL, Flux, and video-diffusion pipelines with fast cold starts and persistent model caches.

DynamicL40S · RTX 6000 Pro

Explore Dynamic

Speech & audio

Whisper transcription, speaker diarization, and low-latency text-to-speech for real-time voice products.

DynamicL40S

Explore Dynamic

Embeddings & RAG

High-throughput embedding generation and vector search to power retrieval-augmented generation at scale.

DynamicL40S · A100

Explore Dynamic

Production APIs with SLA

Customer-facing inference where p99 latency matters — single-tenant cards with a 99.99% uptime SLA you can resell.

DedicatedA100 · B200

Explore Dedicated

Training & fine-tuning

From a LoRA run to frontier pre-training.

Bursty fine-tunes on shared GPUs, or thousands of interconnected cards for multi-week training runs — same platform, same tooling.

Fine-tuning & LoRA

Short, bursty runs that don’t justify a reserved card. Spin up per job, stop billing the moment it finishes.

DynamicA100 · B200

Explore Dynamic

Distributed pre-training

Multi-week runs across hundreds of B200s on non-blocking InfiniBand with 5th-gen NVLink and NVSwitch.

ClustersB200 · B300

Explore Clusters

ML frameworks

PyTorch, JAX, and TensorFlow preinstalled with CUDA and drivers. Bring your own container or start from ours.

DynamicAll GPUs

Explore Dynamic

Batch & data processing

Schedulable, parallelizable jobs — synthetic data, evaluation harnesses, and large-scale ETL across many GPUs.

DynamicL40S · A100

Explore Dynamic

RL & RLHF

Reinforcement learning and preference-tuning pipelines with reservable capacity for long campaigns.

DedicatedA100 · B200

Explore Dedicated

Agents & development

Build, test, and run agentic systems.

Interactive notebooks, always-on agent loops, and GPU-native dev environments — launched in under five minutes.

AI agents

Always-on agent loops with persistent volumes and instant launch. Hot-migrate to a dedicated card as you scale.

DynamicL40S · A100

Explore Dynamic

Notebooks & interactive dev

Jupyter, remote VS Code, SSH, and a web terminal on a real GPU in minutes — no quota requests, no sales calls.

DynamicL40S · RTX 6000 Pro

Explore Dynamic

GPU programming

CUDA, Triton, and kernel development on bare-metal-class hardware with local NVMe scratch and full-card access.

DedicatedRTX 5090 · A100

Explore Dedicated

Virtual computing

GPU-backed virtual workstations for teams that need on-demand compute without managing physical hardware.

DynamicL40S

Explore Dynamic

Graphics & rendering

Pixels, simulation, and visual compute.

Render farms, real-time streaming, and scientific simulation on workstation-class and data-center GPUs.

Graphics rendering

Offline 3D rendering and VFX with Blender, Octane, and Unreal Engine on RTX-class GPUs at wholesale pricing.

DedicatedRTX 6000 Pro

Explore Dedicated

Scientific simulation

CFD, molecular dynamics, and HPC workloads that need sustained, predictable full-card throughput.

DedicatedA100 · B200

Explore Dedicated

Game & stream

Low-latency game streaming and real-time rendering pipelines backed by 100 Gbps dedicated networking.

DedicatedRTX 5090 · L40S

Explore Dedicated

Match your workload

Three ways to get a GPU.

Every workload maps to one of three products. Here's how teams choose.

Dynamicfrom $0.54/GPU-hr

Shared GPUs, full-card performance.

Best for

Fine-tuning & LoRA
Batch inference & eval
Notebooks & agents
Image / video generation

Best when work is bursty and you want to pay only for the cycles you use.

Explore Dynamic

Dedicatedfrom $0.59/GPU-hr

A whole card, exclusively yours.

Best for

Production APIs with SLA
Regulated workloads
Sustained training
Rendering & simulation

Best when you need predictable p99 latency, a 99.99% SLA, or compliance isolation.

Explore Dedicated

Clusters~30% below retail

Multi-node GPU, wholesale pricing.

Best for

Frontier pre-training
Distributed fine-tuning
Reserved capacity
1,024+ GPU fabrics

Best for multi-week runs across hundreds of interconnected GPUs on InfiniBand.

Explore Clusters

By industry

Teams shipping on packet.ai.

AI startups

Ship your first inference workload before an AWS quote comes back. Hourly billing, no minimums, and a clear path from dev to production.

Healthcare & life sciences

Single-tenant GPUs, a signed DPA, and EU data residency for medical imaging, genomics, and protein-folding workloads.

Financial services

Isolated, audit-ready infrastructure for risk modeling, fraud detection, and document intelligence under compliance constraints.

Media & entertainment

Render farms and generative-media pipelines on RTX-class GPUs — scale up for a deadline, scale down the next day.

Robotics & autonomy

Train perception and planning models, then run batch simulation across many GPUs with topology-aware scheduling.

Research & academia

Reserve frontier silicon for a known program or season at wholesale rates, with a named technical account manager.

FAQ

GPU use cases, answered.

For anything not here, reach help@packet.ai.

Explore more: Dynamic GPU, Dedicated GPU, GPU Clusters, Token Factory, and Pixel Factory.

What can I run on packet.ai GPUs?

Effectively any GPU-accelerated workload: LLM training and inference, image and video generation, speech and audio, fine-tuning, reinforcement learning, embeddings and RAG, AI agents, 3D rendering, scientific simulation, and HPC. CUDA, drivers, and common frameworks (PyTorch, JAX, TensorFlow) are preinstalled on every image.

Which product should I use for inference vs. training?

Use Dynamic for bursty, schedulable work — fine-tuning, batch inference, notebooks, and agents. Use Dedicated for production APIs that need a 99.99% SLA and predictable p99 latency. Use Clusters for multi-node distributed training across hundreds of interconnected GPUs.

What GPUs are available?

NVIDIA RTX 5090, L40S, RTX 6000 Pro, A100 80GB, and B200, with H200, H100 SXM, and B300 available for clusters. Dynamic starts at $0.54/GPU-hr and Dedicated at $0.59/GPU-hr.

How fast can I get a GPU running?

Under five minutes from signup to an SSH-ready GPU on Dynamic, with CUDA and drivers preinstalled. Dedicated single GPUs provision in 5–10 minutes; multi-node clusters take 2–6 weeks to cable and validate.

Can I serve an OpenAI-compatible API?

Yes. packet.ai exposes OpenAI-compatible inference endpoints, so you can point existing SDKs and tooling at your own deployment with minimal changes.

Is packet.ai suitable for regulated or production workloads?

Yes. Dedicated GPUs are single-tenant with a 99.99% uptime SLA backed by service credits, plus a DPA, audit support, and EU data residency for compliance-sensitive workloads.

Find your GPU.
Ship today.

Pick the workload, pick the product, and launch in under five minutes.

Start building → See pricing

No credit card to start · hourly billing · US & EU regions

Run any AI workload.On the right GPU.

Serve models to production traffic.

LLM text generation

Image & video generation

Speech & audio

Embeddings & RAG

Production APIs with SLA

From a LoRA run to frontier pre-training.

Fine-tuning & LoRA

Distributed pre-training

ML frameworks

Batch & data processing

RL & RLHF

Build, test, and run agentic systems.

AI agents

Notebooks & interactive dev

GPU programming

Virtual computing

Pixels, simulation, and visual compute.

Graphics rendering

Scientific simulation

Game & stream

Three ways to get a GPU.

Shared GPUs, full-card performance.

A whole card, exclusively yours.

Multi-node GPU, wholesale pricing.

Teams shipping on packet.ai.

AI startups

Healthcare & life sciences

Financial services

Media & entertainment

Robotics & autonomy

Research & academia

GPU use cases, answered.

Find your GPU.Ship today.

Run any AI workload.
On the right GPU.

Find your GPU.
Ship today.