How much VRAM does the Intel Arc B580 12GB have?

The Intel Arc B580 12GB has 12 GB of GDDR6 with 456 GB/s memory bandwidth.

What is the Intel Arc B580 12GB best for?

With 12 GB of VRAM, the Intel Arc B580 12GB is best for running compact models (1B–8B) at low quantization, suitable for edge inference, prototyping, and lightweight tasks.

What LLMs can the Intel Arc B580 12GB run locally?

The Intel Arc B580 12GB can run 28 of the 70 open-weight models tracked by CanItRun natively in VRAM at 8k context. Top options include: Llama 3.1 8B Instruct at Q8_0, Llama 3.2 3B Instruct at BF16, Llama 3.2 1B Instruct at FP32.

Can the Intel Arc B580 12GB run Llama 3.3 70B Instruct?

The Intel Arc B580 12GB can run Llama 3.3 70B Instruct with CPU offload at Q3_K_M quantization, but inference will be slower than native VRAM execution.

Can the Intel Arc B580 12GB run Qwen 3.6 27B?

The Intel Arc B580 12GB can run Qwen 3.6 27B with CPU offload at Q8_0 quantization, but inference will be slower than native VRAM execution.

Can the Intel Arc B580 12GB run Llama 3.1 8B Instruct?

Yes. The Intel Arc B580 12GB runs Llama 3.1 8B Instruct natively in VRAM at Q8_0 quantization, achieving approximately 57 tokens per second.

Intel Arc B580 12GB

The Intel Arc B580 12GB has 12 GB VRAM and 456 GB/s memory bandwidth. It can run 28 of our 70 tracked models natively in VRAM at 8k context.

The Intel Arc B580 is Intel's first Battlemage discrete GPU, released in late 2024 with 12GB of GDDR6 on a 192-bit bus at 456 GB/s. It offers improved ray tracing and AI acceleration over Alchemist and fits 7B models comfortably at common quantizations. The 12GB VRAM is a meaningful upgrade over most similarly-priced NVIDIA/AMD alternatives at launch.

Intel Arc B580 12GB: 2024 Xe2-HPG Battlemage with 12GB GDDR6 at 456 GB/s — Intel's best consumer LLM card.

7B at Q8 or Q4 natively; 13B at Q4 with headroom. ~6-10 t/s for 7B via Vulkan.

Vulkan via llama.cpp works cross-platform. SYCL backend available with oneAPI. Ollama support limited.

Vendor	Intel
Architecture	Xe2-HPG (Battlemage)
VRAM	12 GB
Memory type	GDDR6
Memory bandwidth	456 GB/s
Compute backend	VULKAN
Tier	Consumer
Released	2024
Models (native)	28 / 70
Models (offload)	19 / 70

Software: Vulkan backend works in llama.cpp; SYCL backend available but requires oneAPI toolkit. Ollama support is limited.

Models this GPU runs natively in VRAM (28)

Models that fit with CPU offload (19)

These use system RAM for layers that don't fit in VRAM — expect much slower inference.

Too large for this GPU (23)

Frequently asked questions

How much VRAM does the Intel Arc B580 12GB have?: The Intel Arc B580 12GB has 12 GB of GDDR6 with 456 GB/s memory bandwidth.
What is the Intel Arc B580 12GB best for?: With 12 GB of VRAM, the Intel Arc B580 12GB is best for running compact models (1B–8B) at low quantization, suitable for edge inference, prototyping, and lightweight tasks.
What LLMs can the Intel Arc B580 12GB run locally?: The Intel Arc B580 12GB can run 28 of the 70 open-weight models tracked by CanItRun natively in VRAM at 8k context. Top options include: Llama 3.1 8B Instruct at Q8_0, Llama 3.2 3B Instruct at BF16, Llama 3.2 1B Instruct at FP32.
Can the Intel Arc B580 12GB run Llama 3.3 70B Instruct?: The Intel Arc B580 12GB can run Llama 3.3 70B Instruct with CPU offload at Q3_K_M quantization, but inference will be slower than native VRAM execution.
Can the Intel Arc B580 12GB run Qwen 3.6 27B?: The Intel Arc B580 12GB can run Qwen 3.6 27B with CPU offload at Q8_0 quantization, but inference will be slower than native VRAM execution.
Can the Intel Arc B580 12GB run Llama 3.1 8B Instruct?: Yes. The Intel Arc B580 12GB runs Llama 3.1 8B Instruct natively in VRAM at Q8_0 quantization, achieving approximately 57 tokens per second.