Question 1

Which NVIDIA GPUs are available, and which ship the same day?

Accepted Answer

NVIDIA B200, H200, A100, RTX 6000 Pro, RTX 5090, L40S, and H100 SXM are available on-demand. Single-GPU and small fleets (under 8 GPUs) are typically provisioned in under 5 minutes. Larger clusters and InfiniBand-connected nodes take 1–2 weeks depending on region.

Question 2

How fast can I actually get a GPU running?

Accepted Answer

From signup to SSH on a live B200 or H200 in under 5 minutes via the dashboard, CLI, or REST API. No manual approvals, no waitlists. You'll land in an Ubuntu 22.04 environment with CUDA, drivers, PyTorch, vLLM, and Jupyter pre-installed.

Question 3

Do you support multi-node training with InfiniBand?

Accepted Answer

Yes. Multi-node clusters with NVIDIA Quantum-2 InfiniBand (400 Gb/s) or RoCEv2 fabric are available for distributed training of large models. NCCL, MPI, and Slurm support out of the box. Available in US East/West and EU regions; sizing from 4 to 128+ nodes.

Question 4

What does “Token Factory” give me that vLLM-on-a-VM doesn't?

Accepted Answer

Token Factory is the managed inference layer — OpenAI-compatible endpoint, auto-scaling, model registry, per-token billing, KV-cache reuse, and request batching tuned for the GPU under the hood. Just change your base_url to api.packet.ai/v1. For maximum control, run vLLM yourself on a dedicated B200 instead — both work.

Question 5

Does my data persist between sessions?

Accepted Answer

Yes. Persistent storage survives reboots and instance stops. You only pay storage cost while a pod is stopped — not GPU cost. Local NVMe for hot data; shared volumes for datasets across instances; S3-compatible object storage for archives.

Question 6

Is there a free tier or trial?

Accepted Answer

Yes — $10 in free credits on signup, no credit card required. Enough for about 5 hours on an H200 or 15 hours on RTX 6000 Pro. Add wallet funds when you want to keep going; refund the rest when you're done.

Question 7

What support do you offer in production?

Accepted Answer

24/7 human-staffed support. P1 incidents get a response in under 15 minutes. Enterprise tier includes Slack Connect with our SRE team, dedicated account engineer, and on-call escalation. No tiered ticket queues; you talk to people who know CUDA.

Question 8

Are you SOC 2 / GDPR / HIPAA aligned?

Accepted Answer

SOC 2 Type II report available under NDA from security@packet.ai. GDPR-compliant with a published DPA and Standard Contractual Clauses. HIPAA-eligible workloads via dedicated single-tenant clusters — contact us to scope a BAA.

Everything to ship AI products.

Latest-generation GPUs, on-demand.

SSH, web terminal, or one-click deploy.

Real-time GPU metrics.

Persistent storage. Pre-installed toolchains. Everything you need to ship fast.

Transparent, fair pricing.

Enterprise-grade infrastructure.

Real humans. Fast response.

Questions about the platform.

Start building. No card required.