How much VRAM does the AMD Radeon RX 7900 XT have?

The AMD Radeon RX 7900 XT has 20 GB of GDDR6 with 800 GB/s memory bandwidth.

What is the AMD Radeon RX 7900 XT best for?

With 20 GB of VRAM, the AMD Radeon RX 7900 XT handles smaller models (7B–14B) at Q4–Q5 quantization — ideal for entry-level local LLM experimentation and lightweight inference.

What LLMs can the AMD Radeon RX 7900 XT run locally?

The AMD Radeon RX 7900 XT can run 43 of the 80 open-weight models tracked by CanItRun natively in VRAM at 8k context. Top options include: Llama 3.1 8B Instruct at Q8_0, Llama 3.2 3B Instruct at FP32, Llama 3.2 1B Instruct at FP32.

Can the AMD Radeon RX 7900 XT run Llama 3.3 70B Instruct?

The AMD Radeon RX 7900 XT can run Llama 3.3 70B Instruct with CPU offload at Q3_K_M quantization, but inference will be slower than native VRAM execution.

Can the AMD Radeon RX 7900 XT run Qwen 3.6 27B?

Yes. The AMD Radeon RX 7900 XT runs Qwen 3.6 27B natively in VRAM at Q3_K_M quantization, achieving approximately 35.6 tokens per second.

Can the AMD Radeon RX 7900 XT run Llama 3.1 8B Instruct?

Yes. The AMD Radeon RX 7900 XT runs Llama 3.1 8B Instruct natively in VRAM at Q8_0 quantization, achieving approximately 54.3 tokens per second.

AMD Radeon RX 7900 XT

The AMD Radeon RX 7900 XT has 20 GB VRAM and 800 GB/s memory bandwidth. It can run 43 of our 80 tracked models natively in VRAM at 8k context.

With 20 GB GDDR6, the AMD Radeon RX 7900 XT is a consumer-tier GPU that can run 43 models natively. It handles 30B-class models at Q4 quantization.

The AMD Radeon RX 7900 XT features 20GB GDDR6 at 800 GB/s on the RDNA 3 architecture. The unique 20GB capacity puts it between 16GB and 24GB tiers — comfortably holding 14B models at Q5 and some 27B models at aggressive Q3–Q4 quantization. A compelling mid-range option for AMD users on Linux with ROCm support.

AMD Radeon RX 7900 XT: 20GB GDDR6 at 800 GB/s — RDNA 3 mid-range.

7B-14B at Q4 native. 27B tight at Q4. ~7-12 t/s for 7B.

ROCm Linux support. Vulkan on Windows. 20GB is awkward capacity.

Vendor	AMD
Architecture	RDNA 3
VRAM	20 GB
Memory type	GDDR6
Memory bandwidth	800 GB/s
Compute backend	ROCM
Tier	Consumer
Released	2022
Models (native)	43 / 80
Models (offload)	6 / 80

Software: ROCm is Linux-only; on Windows use the Vulkan backend instead. Requires llama.cpp compiled with ROCm support.

Popular models for this GPU

Qwen 3.5 35B-A3B (MoE)Qwen 3.6 35B Yi 1.5 34B Chat Qwen3 32B Qwen 2.5 32B Instruct

Models this GPU runs natively in VRAM (43)

Models that fit with CPU offload (6)

These use system RAM for layers that don't fit in VRAM — expect much slower inference.

Too large for this GPU (31)

Continue reading

vram-guides9 min

Best LLMs for 16 GB VRAM GPUs (2026)

hardware9 min

Best GPUs $500-$1000 for LLMs (2026)

hardware10 min

AMD Radeon for LLMs: ROCm & Vulkan Complete Guide

Frequently asked questions

How much VRAM does the AMD Radeon RX 7900 XT have?: The AMD Radeon RX 7900 XT has 20 GB of GDDR6 with 800 GB/s memory bandwidth.
What is the AMD Radeon RX 7900 XT best for?: With 20 GB of VRAM, the AMD Radeon RX 7900 XT handles smaller models (7B–14B) at Q4–Q5 quantization — ideal for entry-level local LLM experimentation and lightweight inference.
What LLMs can the AMD Radeon RX 7900 XT run locally?: The AMD Radeon RX 7900 XT can run 43 of the 80 open-weight models tracked by CanItRun natively in VRAM at 8k context. Top options include: Llama 3.1 8B Instruct at Q8_0, Llama 3.2 3B Instruct at FP32, Llama 3.2 1B Instruct at FP32.
Can the AMD Radeon RX 7900 XT run Llama 3.3 70B Instruct?: The AMD Radeon RX 7900 XT can run Llama 3.3 70B Instruct with CPU offload at Q3_K_M quantization, but inference will be slower than native VRAM execution.
Can the AMD Radeon RX 7900 XT run Qwen 3.6 27B?: Yes. The AMD Radeon RX 7900 XT runs Qwen 3.6 27B natively in VRAM at Q3_K_M quantization, achieving approximately 35.6 tokens per second.
Can the AMD Radeon RX 7900 XT run Llama 3.1 8B Instruct?: Yes. The AMD Radeon RX 7900 XT runs Llama 3.1 8B Instruct natively in VRAM at Q8_0 quantization, achieving approximately 54.3 tokens per second.