Coming soon — join waitlist

NVIDIA Hopper, 80 GB HBM3, SXM5

NVIDIA H100 SXM

The frontier training GPU.

The NVIDIA H100 SXM is the Hopper-generation flagship, with 80 GB of HBM3 memory and 3.35 TB/s of bandwidth. It delivers up to 3× the throughput of the A100 on transformer workloads, and its FP8 Tensor Cores and NVLink 4.0 make it the standard for frontier model training and fast inference. Coming soon to packet.ai.

From $2.50/GPU-hr

Coming soon

Pricing to be confirmed at launch

Join waitlist →

See pricing

80 GB HBM3 memory
3.35 TB/s memory bandwidth
1979 TFLOPS FP8 compute
900 GB/s NVLink bandwidth

Architecture

Hopper HBM3 — the frontier standard.

The H100 SXM combines 80 GB of HBM3 with NVLink 4.0 for a 900 GB/s multi-GPU training fabric.

4th-gen Tensor Cores + FP8

FP8 Tensor Cores deliver 3× the throughput of A100 on transformer workloads. The current frontier training standard.

80 GB HBM3 at 3.35 TB/s

HBM3 delivers roughly 65% more bandwidth than the A100 80 GB. Critical for large batch sizes and long context lengths.

NVLink 4.0 (900 GB/s)

900 GB/s NVLink for tight multi-GPU coupling. Essential for tensor-parallel training across 8+ GPUs.
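
As a rough illustration of why link bandwidth matters for tensor parallelism, the sketch below estimates a lower bound on ring all-reduce time. It assumes the quoted NVLink figures are bidirectional totals (so half is usable per direction), ignores latency and compute overlap, and uses a hypothetical 1 GiB payload. The function name is illustrative and not part of any library.

```python
def ring_allreduce_seconds(payload_bytes: float, num_gpus: int, gb_per_s_per_direction: float) -> float:
    """Bandwidth-only lower bound for a ring all-reduce (no latency, no overlap)."""
    if num_gpus < 2:
        return 0.0
    # In a ring all-reduce each GPU sends 2 * (N - 1) / N of the payload in total.
    bytes_sent = 2 * (num_gpus - 1) / num_gpus * payload_bytes
    return bytes_sent / (gb_per_s_per_direction * 1e9)


# Assumption: 900 GB/s and 600 GB/s are bidirectional totals, so halve them.
H100_PER_DIRECTION = 900 / 2
A100_PER_DIRECTION = 600 / 2

payload = 1 * 1024**3  # hypothetical 1 GiB of gradients or activations

for name, bandwidth in (("H100 NVLink 4.0", H100_PER_DIRECTION), ("A100 NVLink 3.0", A100_PER_DIRECTION)):
    seconds = ring_allreduce_seconds(payload, 8, bandwidth)
    print(f"{name}: {seconds * 1e3:.2f} ms per all-reduce across 8 GPUs")
```

Under these assumptions the estimate scales inversely with link bandwidth, so the A100 figure comes out 1.5× the H100 figure. Real collectives will differ depending on topology and library.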

Transformer Engine

Hardware-accelerated FP8 mixed precision with dynamic scaling. Native support in PyTorch and JAX.
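
To make "dynamic scaling" concrete, here is a minimal software sketch of per-tensor scaling driven by a history of absolute maxima. It is not the Transformer Engine API. The function names and the `margin` parameter are illustrative assumptions, and the actual rounding to FP8 bits is left out because the hardware handles it.

```python
from typing import List, Sequence

E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def scale_from_history(amax_history: Sequence[float], fp8_max: float = E4M3_MAX, margin: int = 0) -> float:
    """Pick a scale so the largest recently seen magnitude maps to fp8_max."""
    amax = max(amax_history)
    if amax <= 0.0:
        return 1.0  # nothing observed yet; leave values unscaled
    return fp8_max / amax / (2 ** margin)


def scale_and_clamp(values: Sequence[float], scale: float, fp8_max: float = E4M3_MAX) -> List[float]:
    """Scale values into the FP8 range and saturate anything that overflows."""
    return [max(-fp8_max, min(fp8_max, v * scale)) for v in values]


history = [0.5, 2.0, 1.0]            # hypothetical amax values from recent steps
scale = scale_from_history(history)  # 448 / 2.0
scaled = scale_and_clamp([2.0, -3.0, 0.25], scale)
print(scale, scaled)
```

The value -3.0 exceeds the recorded history, so it saturates at the format limit. Using a history rather than the current tensor's maximum trades a small overflow risk for not needing an extra pass over the data.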

Technical specs

NVIDIA H100 SXM specifications.

Specification | Value | Great for
GPU architecture | NVIDIA Hopper | FP8 Tensor Cores: frontier training standard.
GPU memory | 80 GB HBM3 | 30B at FP16 native, 70B at 4-bit.
Memory bandwidth | 3.35 TB/s | Roughly 65% more than A100 80 GB. Critical for large batches.
FP8 compute | 1979 TFLOPS | 3× A100 on transformer workloads.
NVLink | 4.0 · 900 GB/s | Tight multi-GPU coupling for tensor parallelism.
MIG | Up to 7 instances | Multi-tenant inference with full isolation.

Pricing

Ways to run H100.

Dedicated hourly or monthly, plus multi-node clusters.

Coming soon

Dedicated

Hourly · Single-tenant

$2.50/GPU-hr

Full H100 SXM reserved exclusively for you. Zero noisy-neighbour risk, 99.99% SLA.

Join waitlist →

Dedicated

Monthly · Single-tenant

TBC/month

Reserved H100 at a flat monthly rate. Full single-tenant isolation and predictable cost, with a 99.99% SLA and zero noisy-neighbour risk.

Join waitlist →

Multi-node Cluster

From 8 GPUs

Scale frontier training across multiple H100 nodes with NVLink 4.0 and InfiniBand interconnect.

8–512 GPUs per cluster
NVLink + InfiniBand
Provisioned in <1 hr

Get a wholesale quote →

Use cases

What the H100 is built for.

Frontier model training

FP8 Tensor Cores deliver 3× A100 throughput on transformer workloads.

FP8 mixed precision
NVLink 4.0 fabric
Transformer Engine

Fast inference

3.35 TB/s HBM3 bandwidth enables high-throughput token generation.

80 GB HBM3
MIG for multi-tenant
Low p99 latency
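
A back-of-the-envelope way to see how bandwidth bounds token generation: at batch size 1, each decoded token has to read roughly all model weights from HBM once. The sketch below turns that into an upper bound. It ignores KV-cache reads, compute time, and kernel overhead, and the function name is hypothetical.

```python
def decode_tokens_per_second_upper_bound(params_billion: float, bytes_per_param: float, bandwidth_tb_per_s: float) -> float:
    """Bandwidth-only ceiling on single-stream decode speed (weights read once per token)."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_per_s * 1e12 / weight_bytes


H100_BANDWIDTH_TB_PER_S = 3.35

# 30B parameters at FP16 (2 bytes each) and 70B at 4-bit (0.5 bytes each).
for label, params, bytes_per_param in (("30B FP16", 30, 2.0), ("70B 4-bit", 70, 0.5)):
    ceiling = decode_tokens_per_second_upper_bound(params, bytes_per_param, H100_BANDWIDTH_TB_PER_S)
    print(f"{label}: at most ~{ceiling:.0f} tokens/s per stream")
```

Batching raises aggregate throughput well beyond this per-stream figure, since one pass over the weights then serves many sequences.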

Multi-GPU fine-tuning

NVLink 4.0 tightly couples H100s for efficient tensor-parallel training.

900 GB/s NVLink
LoRA / QLoRA
DDP / FSDP

FAQ

NVIDIA H100, answered.

For anything else, reach help@packet.ai.

What is the NVIDIA H100 SXM?

Hopper flagship: 80 GB HBM3, 3.35 TB/s, 3× A100 on transformer workloads. The current frontier training standard.

When will H100 be available?

Coming soon to packet.ai. Join the waitlist to be notified at launch.

How does H100 compare to A100?

H100 SXM is ~3× faster on transformer workloads, uses HBM3 instead of HBM2e, adds FP8 support, and raises NVLink bandwidth from 600 GB/s to 900 GB/s with NVLink 4.0.

What models fit in H100 SXM?

A 30B-parameter model fits natively at FP16 (about 60 GB of weights), and a 70B model fits at 4-bit (about 35 GB). For full FP16 70B, use H200 or B200.
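
The sizing above follows from simple arithmetic on weight storage, sketched here. The 10% headroom reserved for KV cache and activations is an illustrative assumption, not a packet.ai or NVIDIA figure, and the function names are hypothetical.

```python
H100_MEMORY_GB = 80.0


def weight_gb(params_billion: float, bits_per_param: int) -> float:
    """Weight storage in decimal GB: billions of parameters times bytes per parameter."""
    return params_billion * bits_per_param / 8


def fits(params_billion: float, bits_per_param: int, memory_gb: float = H100_MEMORY_GB, headroom: float = 0.10) -> bool:
    """True if the weights fit after reserving a fraction of memory for runtime state."""
    usable_gb = memory_gb * (1 - headroom)
    return weight_gb(params_billion, bits_per_param) <= usable_gb


for label, params, bits in (("30B FP16", 30, 16), ("70B 4-bit", 70, 4), ("70B FP16", 70, 16)):
    verdict = "fits" if fits(params, bits) else "does not fit"
    print(f"{label}: {weight_gb(params, bits):.0f} GB of weights, {verdict} in {H100_MEMORY_GB:.0f} GB")
```

Long contexts and large batches increase KV-cache size, so the practical limit can be lower than the weights-only figure suggests.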

Does H100 support MIG?

Yes. Up to 7 isolated MIG instances for multi-tenant inference serving.

H100 SXM — coming soon.

Join the waitlist for early access to NVIDIA H100 SXM on packet.ai.

Join waitlist →

Talk to a human

On-demand · hourly billing · US & EU regions
