Intel Arc Pro B70 24GB

The Intel Arc Pro B70 24GB has 24 GB VRAM and 456 GB/s memory bandwidth. It can run 42 of our 71 tracked models natively in VRAM at 8k context.

With 24 GB GDDR6, the Intel Arc Pro B70 24GB is a workstation-tier GPU that can run 42 models natively. It handles 70B-class models at Q4 quantization.

The Intel Arc Pro B70 is Intel's flagship Battlemage workstation GPU with 24GB of GDDR6 and ECC support, announced at CES 2025. It targets CAD, media, and AI inference workloads with ISV certifications. The 24GB framebuffer fits 13B models at Q8 or 30B models at aggressive quantization, with the Vulkan backend providing usable LLM inference speeds on Linux and Windows.

Intel Arc Pro B70 24GB: 2025 Xe2-HPG Battlemage workstation GPU with 24GB ECC GDDR6 — Intel's pro flagship.

13B at Q8 or 30B at Q4 natively. ~8-12 t/s for 7B via Vulkan.

Vulkan via llama.cpp works cross-platform. SYCL backend available with oneAPI. ISV-certified workstation card.

Vendor	Intel
Architecture	Xe2-HPG (Battlemage)
VRAM	24 GB
Memory type	GDDR6
Memory bandwidth	456 GB/s
Compute backend	VULKAN
Tier	Workstation
Released	2025
Models (native)	42 / 71
Models (offload)	11 / 71

Software: Vulkan backend works in llama.cpp; SYCL backend available with oneAPI toolkit. Primarily a workstation/professional card.

Popular models for this GPU

Mixtral 8x7B Instruct v0.1 Qwen 3.5 35B-A3B (MoE)Qwen 3.6 35B Yi 1.5 34B Chat Qwen3 32B

Models this GPU runs natively in VRAM (42)

Models that fit with CPU offload (11)

These use system RAM for layers that don't fit in VRAM — expect much slower inference.

Too large for this GPU (18)

Models mentioned

Mixtral 8x7B Instruct v0.146.7B (12.9B active)

Mistral AIQ4_K_M rec.

Qwen 3.5 35B-A3B (MoE)35B (3B active)

AlibabaQ4_K_M rec.

Qwen 3.6 35B35B

AlibabaQ4_K_M rec.

Yi 1.5 34B Chat34.4B

01.AIQ4_K_M rec.

AlibabaQ4_K_M rec.

Qwen 2.5 32B Instruct32.5B

AlibabaQ4_K_M rec.

Qwen 2.5 Coder 32B Instruct32.5B

AlibabaQ4_K_M rec.

DeepSeek R1 Distill Qwen 32B32.5B

DeepSeekQ4_K_M rec.

Nemotron 3 Nano 30B32B (3B active)

NVIDIAQ5_K_M rec.

GoogleQ4_K_M rec.

Qwen3 30B-A3B (MoE)30B (3B active)

AlibabaQ4_K_M rec.

Gemma 2 27B Instruct27.2B

GoogleQ4_K_M rec.

Gemma 3 27B Instruct27B

GoogleQ4_K_M rec.

Qwen 3.6 27B27B

AlibabaQ4_K_M rec.

Gemma 4 26B (MoE)26B (3.8B active)

GoogleQ4_K_M rec.

Mistral Small 3.1 24B Instruct24B

Mistral AIQ4_K_M rec.

Mistral Small 22B22.2B

Mistral AIQ4_K_M rec.

GPT-OSS 20B21B (4B active)

OpenAIQ5_K_M rec.

AlibabaQ5_K_M rec.

Qwen 2.5 14B Instruct14.7B

AlibabaQ5_K_M rec.

Phi-4 14B Instruct14B

MicrosoftQ5_K_M rec.

Mistral Nemo 12B Instruct12.2B

Mistral AIQ5_K_M rec.

Gemma 3 12B Instruct12.2B

GoogleQ5_K_M rec.

Gemma 2 9B Instruct9.2B

GoogleQ5_K_M rec.

Llama 3.1 8B Instruct8B

MetaQ5_K_M rec.

DeepSeek R1 Distill Llama 8B8B

DeepSeekQ5_K_M rec.

AlibabaQ5_K_M rec.

Qwen 2.5 7B Instruct7.6B

AlibabaQ6_K rec.

Mistral 7B Instruct v0.37.25B

Mistral AIQ6_K rec.

Gemma 3 4B Instruct4B

GoogleQ6_K rec.

GoogleQ5_K_M rec.

Phi-3.5 Mini Instruct3.8B

MicrosoftQ6_K rec.

Llama 3.2 3B Instruct3.2B

Qwen 2.5 3B Instruct3.1B

AlibabaQ6_K rec.

Gemma 2 2B Instruct2.6B

GoogleQ8_0 rec.

GoogleQ8_0 rec.

SmolLM2 1.7B Instruct1.7B

Hugging FaceQ8_0 rec.

Qwen 2.5 1.5B Instruct1.5B

AlibabaQ8_0 rec.

Llama 3.2 1B Instruct1.24B

Gemma 3 1B Instruct1B

GoogleQ8_0 rec.

Qwen 2.5 0.5B Instruct0.5B

AlibabaQ8_0 rec.

SmolLM2 360M Instruct0.36B

Hugging FaceQ8_0 rec.

Frequently asked questions

How much VRAM does the Intel Arc Pro B70 24GB have?: The Intel Arc Pro B70 24GB has 24 GB of GDDR6 with 456 GB/s memory bandwidth.
What is the Intel Arc Pro B70 24GB best for?: With 24 GB of VRAM, the Intel Arc Pro B70 24GB is well-suited for running 7B–32B models at Q4 with room for context, making it a great all-rounder for local LLM inference.
What LLMs can the Intel Arc Pro B70 24GB run locally?: The Intel Arc Pro B70 24GB can run 42 of the 71 open-weight models tracked by CanItRun natively in VRAM at 8k context. Top options include: Llama 3.1 8B Instruct at BF16, Llama 3.2 3B Instruct at FP32, Llama 3.2 1B Instruct at FP32.
Can the Intel Arc Pro B70 24GB run Llama 3.3 70B Instruct?: The Intel Arc Pro B70 24GB can run Llama 3.3 70B Instruct with CPU offload at Q4_K_M quantization, but inference will be slower than native VRAM execution.
Can the Intel Arc Pro B70 24GB run Qwen 3.6 27B?: Yes. The Intel Arc Pro B70 24GB runs Qwen 3.6 27B natively in VRAM at Q5_K_M quantization, achieving approximately 26.2 tokens per second.
Can the Intel Arc Pro B70 24GB run Llama 3.1 8B Instruct?: Yes. The Intel Arc Pro B70 24GB runs Llama 3.1 8B Instruct natively in VRAM at BF16 quantization, achieving approximately 28.5 tokens per second.