NVIDIA Ada Lovelace 2024 flagship
NVIDIA GeForce RTX 4080 Super
// 16 GB GDDR6X · 320W TDP · 52.2 TFLOPS FP32
FLAGSHIP · RANK #4.6
▸ VRAM 16 GB
▸ FP32 52.2 TFLOPS
▸ MEM BW 736 GB/s
▸ TDP 320 W
LLM Inference Performance
| Model | Tokens / sec | Local Fit |
|---|---|---|
| Mistral 7b Q4 | 132 tok/s | fits · single GPU |
| Llama 3 8b Q4 | 125 tok/s | fits · single GPU |
| Llama 3 13b Q4 | 70 tok/s | fits · single GPU |
| Llama 3 70b Q4 | — OOM — | OOM / offload |
Local Model Compatibility
7B params (int) fits
13B params fits
70B (4-bit quant) OOM
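The fit results above can be approximated from parameter count alone. The sketch below is a rough weight-only estimate, not the method this page used: the ~4.5 bits per weight figure is an assumption for a typical 4-bit format once quantization scales are included, and the helper names are hypothetical.

```python
# Rough weight-only memory estimate for a 4-bit quantized model.
# BITS_PER_WEIGHT is an assumption (about 4.5 bits once scales are included);
# it is not a figure taken from this page.
VRAM_GB = 16.0          # RTX 4080 Super VRAM, from the spec sheet
BITS_PER_WEIGHT = 4.5   # assumed effective bits per weight at q4


def weight_gb(params_billion: float, bits: float = BITS_PER_WEIGHT) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits / 8


def fits(params_billion: float, vram_gb: float = VRAM_GB) -> bool:
    """True if the weights alone fit; KV cache and runtime overhead need extra room."""
    return weight_gb(params_billion) < vram_gb


for size in (7, 13, 70):
    print(f"{size}B: ~{weight_gb(size):.1f} GB of weights, fits={fits(size)}")
```

Under these assumptions the 7B and 13B models leave several GB free for context, while 70B comes out near the ~40GB figure quoted in the FAQ, well beyond 16 GB.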
Spec Sheet
▸ COMPUTE
▸ ARCHITECTURE Ada Lovelace
▸ GPU CHIP AD103
▸ CUDA CORES 10,240
▸ TENSOR CORES 320
▸ BOOST CLOCK 2550 MHz
▸ FP32 52.2 TFLOPS
▸ LAUNCH YEAR 2024
▸ MEMORY & RATINGS
▸ VRAM 16 GB GDDR6X
▸ BANDWIDTH 736 GB/s
▸ TIER flagship
▸ OVERALL 4.6/5
▸ AI VALUE 4.0/5
▸ GAMING VALUE 4.5/5
▸ POWER
▸ TDP 320 W
▸ PERF/W (FP32) 0.163 TFLOPS/W
▸ MODEL FIT
▸ RUNS 7B (INT) yes
▸ RUNS 13B yes
▸ RUNS 70B (4-bit) no
▸ PLATFORM CUDA · HIP (CUDA backend)
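The FP32 and PERF/W entries in the spec sheet follow from the core count, boost clock and TDP. A minimal sketch of that arithmetic, assuming each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per clock; the function name is illustrative:

```python
# Figures taken from the spec sheet above.
CUDA_CORES = 10_240
BOOST_CLOCK_MHZ = 2550
TDP_W = 320

# Assumption: one fused multiply-add (2 FLOPs) per CUDA core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def fp32_tflops(cores: int, clock_mhz: float) -> float:
    """Peak FP32 throughput in TFLOPS at the given clock."""
    return cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6 / 1e12


tflops = fp32_tflops(CUDA_CORES, BOOST_CLOCK_MHZ)
perf_per_watt = tflops / TDP_W
print(f"{tflops:.1f} TFLOPS, {perf_per_watt:.3f} TFLOPS/W")
```

This is a peak figure at boost clock; sustained throughput in real workloads is typically lower.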
Analysis notes
Quick Summary
For AI work, the RTX 4080 Super offers near-flagship speed without the flagship bill. Its 10,240 CUDA cores, 16 GB of GDDR6X and 736 GB/s of bandwidth put it close to a 4090 for 7B–13B inference, at lower power (320 W) and a lower price. The gap shows only when a model needs more than 16 GB.
Specs That Matter for AI
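16 GB holds a 13B model at q4 with room left for a long context, and 736 GB/s of memory bandwidth keeps token generation fast, since single-stream decoding is largely bandwidth-bound. For anything that fits in 16 GB, it is one of the quickest consumer cards available.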
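How much context fits depends on the KV cache, which grows linearly with the number of tokens. The sketch below estimates it for a Llama 3 8B style model. The architecture values (32 layers, 8 KV heads, head dimension 128) and the FP16 cache are assumptions that do not come from this page, and `kv_cache_gb` is a hypothetical helper.

```python
def kv_cache_gb(
    context_tokens: int,
    layers: int = 32,          # assumed: Llama 3 8B style depth
    kv_heads: int = 8,         # assumed: grouped-query attention with 8 KV heads
    head_dim: int = 128,       # assumed head dimension
    bytes_per_value: int = 2,  # assumed FP16 cache
) -> float:
    """Approximate KV cache size in GB (1 GB = 1e9 bytes) for one sequence."""
    # Factor of 2: one key vector and one value vector per layer, head and token.
    per_token_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value
    return context_tokens * per_token_bytes / 1e9


for ctx in (4096, 8192, 32768):
    print(f"{ctx} tokens: ~{kv_cache_gb(ctx):.2f} GB of KV cache")
```

With these assumptions an 8K context costs roughly 1 GB, so a q4 8B model plus its cache sits comfortably inside 16 GB. Models without grouped-query attention need proportionally more.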
Performance
It reaches ~125 tok/s on Llama 3 8B q4 and ~70 tok/s on 13B q4, second only to the 4090 here. It is also strong for fine-tuning smaller models.
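These numbers can be sanity-checked against a simple bandwidth ceiling: during single-stream decoding every weight is read about once per token, so tokens per second cannot exceed bandwidth divided by model size. This is a back-of-the-envelope sketch, assuming ~4.5 bits per weight at q4 (not a figure from this page) and ignoring KV cache reads; the function name is illustrative.

```python
BANDWIDTH_GB_S = 736.0   # from the spec sheet
BITS_PER_WEIGHT = 4.5    # assumed effective bits per weight at q4


def decode_ceiling_tok_s(params_billion: float,
                         bandwidth_gb_s: float = BANDWIDTH_GB_S,
                         bits: float = BITS_PER_WEIGHT) -> float:
    """Upper bound on tokens/s if each token requires one full pass over the weights."""
    weights_gb = params_billion * bits / 8
    return bandwidth_gb_s / weights_gb


# Measured figures quoted on this page: (parameters in billions, tok/s).
measured = {"8B q4": (8, 125), "13B q4": (13, 70)}

for name, (size, tok_s) in measured.items():
    ceiling = decode_ceiling_tok_s(size)
    print(f"{name}: ceiling ~{ceiling:.0f} tok/s, measured {tok_s} tok/s "
          f"({tok_s / ceiling:.0%} of ceiling)")
```

Both measured figures land below the ceiling but within the same order of magnitude, which is consistent with decoding being limited mainly by memory bandwidth rather than compute.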
Verdict
The pick when you want near-4090 inference speed and 16GB is enough. Choose the 4090 only if you need 24GB for larger models or faster training.
Frequently Asked Questions
- RTX 4080 Super or RTX 4090 for AI?
- The 4090 has 24GB and ~55% more compute, fitting larger models and training faster. The 4080 Super's 16GB and 736 GB/s make it nearly as fast for 7B–13B inference at lower power and a lower price.
- Can the RTX 4080 Super run 70B models?
- Not at q4: 16GB is well short of the ~40GB needed. It excels at 7B–13B models and fine-tuning, but 70B remains multi-GPU or offload territory.