Coming soon — join waitlist

NVIDIA Hopper, 80 GB HBM3, SXM5

NVIDIA H100 SXM

The frontier training GPU.

The NVIDIA H100 SXM is the Hopper-generation flagship, with 80 GB of HBM3 memory and 3.35 TB/s of memory bandwidth. It is about 3× faster than the A100 on transformer workloads, and its FP8 Tensor Cores and NVLink 4.0 make it the standard for frontier model training and fast inference. Coming soon to packet.ai.

From $2.50/GPU-hr · Coming soon

Pricing to be confirmed at launch

80 GB HBM3 memory
3.35 TB/s memory bandwidth
1979 TFLOPS FP8 compute
900 GB/s NVLink bandwidth

Architecture

Hopper HBM3 — the frontier standard.

The H100 SXM pairs 80 GB of HBM3 with NVLink 4.0, giving each GPU 900 GB/s of interconnect bandwidth for multi-GPU training.

4th-gen Tensor Cores + FP8

FP8 Tensor Cores deliver 3× the throughput of A100 on transformer workloads. The current frontier training standard.

80 GB HBM3 at 3.35 TB/s

HBM3 delivers 68% more bandwidth than A100. Critical for large batch sizes and long context lengths.

NVLink 4.0 (900 GB/s)

900 GB/s NVLink for tight multi-GPU coupling. Essential for tensor-parallel training across 8+ GPUs.

Transformer Engine

Hardware-accelerated FP8 mixed precision with dynamic scaling. Native support in PyTorch and JAX.
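
To make "dynamic scaling" concrete, here is a minimal pure-Python sketch of per-tensor scaling into the FP8 E4M3 range (largest finite value 448). It is a simplified illustration, not Transformer Engine's actual implementation: the function names are hypothetical, rounding keeps 4 significant bits, and subnormals and exponent limits are ignored.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value representable in E4M3


def fake_quantize_e4m3(x: float) -> float:
    """Clamp to the E4M3 range and round to 4 significant bits (simplified)."""
    if x == 0.0:
        return 0.0
    clamped = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    # clamped == mantissa * 2**exponent with 0.5 <= |mantissa| < 1
    mantissa, exponent = math.frexp(clamped)
    # 1 implicit + 3 explicit mantissa bits -> multiples of 1/16
    mantissa = round(mantissa * 16) / 16
    return math.ldexp(mantissa, exponent)


def quantize_with_dynamic_scale(values):
    """Scale so the largest magnitude maps to FP8_E4M3_MAX, quantize, then undo the scale."""
    amax = max(abs(v) for v in values)
    scale = FP8_E4M3_MAX / amax if amax > 0 else 1.0
    quantized = [fake_quantize_e4m3(v * scale) for v in values]
    dequantized = [q / scale for q in quantized]
    return scale, dequantized


values = [1000.0, -500.0, 0.25]
print("unscaled:", [fake_quantize_e4m3(v) for v in values])  # large values saturate
print("scaled:  ", quantize_with_dynamic_scale(values)[1])   # range is preserved
```

Without the scale factor, any value above 448 saturates. With it, the tensor's largest magnitude lands exactly on the top of the FP8 range, and the remaining error is only the rounding error of the 3-bit mantissa.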

Technical specs

NVIDIA H100 SXM specifications.

Specification | Value | Great for
GPU architecture | NVIDIA Hopper | FP8 Tensor Cores, the frontier training standard.
GPU memory | 80 GB HBM3 | 30B at FP16 natively, 70B at 4-bit.
Memory bandwidth | 3.35 TB/s | 68% more bandwidth than A100. Critical for large batches.
FP8 compute | 1979 TFLOPS | 3× A100 on transformer workloads.
NVLink | 4.0 · 900 GB/s | Tight multi-GPU coupling for tensor parallelism.
MIG | Up to 7 instances | Multi-tenant inference with full isolation.

Pricing

Ways to run H100.

Dedicated or monthly — plus multi-node clusters.

Dedicated
Monthly · Single-tenant
TBC/month

Reserved H100 at a flat monthly rate. Full single-tenant isolation and predictable cost, with capacity reserved exclusively for you. 99.99% SLA, zero noisy-neighbour risk.

Join waitlist →
Multi-node Cluster
From 8 GPUs

Scale frontier training across multiple H100 nodes with NVLink 4.0 and InfiniBand interconnect.

  • 8–512 GPUs per cluster
  • NVLink + InfiniBand
  • Provisioned in <1 hr
Get a wholesale quote →
Use cases

What the H100 is built for.

Frontier model training

FP8 Tensor Cores deliver 3× A100 throughput on transformer workloads.

  • FP8 mixed precision
  • NVLink 4.0 fabric
  • Transformer Engine

Fast inference

3.35 TB/s HBM3 bandwidth enables high-throughput token generation.

  • 80 GB HBM3
  • MIG for multi-tenant
  • Low p99 latency
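
As a rough illustration of why memory bandwidth matters for token generation: in single-stream decoding, every generated token has to read all model weights from HBM once, so bandwidth divided by weight size gives a ceiling on tokens per second. The sketch below applies that rule of thumb to the 3.35 TB/s figure. The function name and example model sizes are illustrative, and the estimate ignores KV-cache traffic, compute time, and batching, so real throughput will differ.

```python
H100_SXM_BANDWIDTH_BYTES_PER_S = 3.35e12  # 3.35 TB/s HBM3 bandwidth


def decode_tokens_per_second_ceiling(
    params_billion: float,
    bits_per_param: int,
    bandwidth_bytes_per_s: float = H100_SXM_BANDWIDTH_BYTES_PER_S,
) -> float:
    """Upper bound on batch-1 decode speed: one full read of the weights per token."""
    weight_bytes = params_billion * 1e9 * bits_per_param / 8
    return bandwidth_bytes_per_s / weight_bytes


for name, params_b, bits in [("30B at FP16", 30, 16), ("70B at 4-bit", 70, 4)]:
    ceiling = decode_tokens_per_second_ceiling(params_b, bits)
    print(f"{name}: ~{ceiling:.0f} tokens/s ceiling per stream")
```

Batching raises aggregate throughput well beyond this per-stream ceiling, because one read of the weights then serves many sequences at once.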

Multi-GPU fine-tuning

NVLink 4.0 tightly couples H100s for efficient tensor-parallel training.

  • 900 GB/s NVLink
  • LoRA / QLoRA
  • DDP / FSDP
FAQ

NVIDIA H100, answered.

For anything else, reach help@packet.ai.

What is the NVIDIA H100 SXM?

The H100 SXM is NVIDIA's Hopper-generation flagship: 80 GB of HBM3, 3.35 TB/s of memory bandwidth, and about 3× the A100's throughput on transformer workloads. It is the current frontier training standard.

When will H100 be available?

Coming soon to packet.ai. Join the waitlist to be notified at launch.

How does H100 compare to A100?

The H100 SXM is about 3× faster on transformer workloads, uses HBM3 instead of HBM2e, adds FP8 support, and raises NVLink bandwidth from 600 GB/s on the A100 to 900 GB/s with NVLink 4.0.

What models fit in H100 SXM?

A 30B-parameter model fits at FP16 natively (about 60 GB of weights), and a 70B model fits at 4-bit (about 35 GB of weights). For 70B at full FP16, use H200 or B200.
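
The sizing above follows from simple arithmetic: parameters × bits per parameter ÷ 8 gives the weight footprint in bytes. A minimal sketch, with hypothetical function names, that counts weights only (the KV cache, activations, and framework overhead need additional headroom):

```python
H100_MEMORY_GB = 80  # HBM3 capacity of a single H100 SXM


def weight_footprint_gb(params_billion: float, bits_per_param: int) -> float:
    """Size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def weights_fit_on_one_h100(params_billion: float, bits_per_param: int) -> bool:
    """True if the weights alone fit in 80 GB; real deployments need extra headroom."""
    return weight_footprint_gb(params_billion, bits_per_param) <= H100_MEMORY_GB


for name, params_b, bits in [("30B FP16", 30, 16), ("70B 4-bit", 70, 4), ("70B FP16", 70, 16)]:
    size = weight_footprint_gb(params_b, bits)
    print(f"{name}: {size:.0f} GB of weights, fits on one H100: {weights_fit_on_one_h100(params_b, bits)}")
```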

Does H100 support MIG?

Yes. Up to 7 isolated MIG instances for multi-tenant inference serving.

H100 SXM — coming soon.

Join the waitlist for early access to NVIDIA H100 SXM on packet.ai.

On-demand · hourly billing · US & EU regions
