CPU only (system RAM)

The CPU only (system RAM) configuration has 0 GB of VRAM and roughly 80 GB/s of memory bandwidth, so it runs none of our 70 tracked models natively in VRAM at 8k context.

With 0 GB of VRAM and DDR4 / DDR5 system memory, the CPU only (system RAM) configuration is classed in the integrated tier and runs 0 models natively. It's best for smaller models under 8B parameters.

CPU only (system RAM): x86-64/ARM with DDR4/DDR5 at ~80 GB/s — CPU inference fallback.

A 7B model at Q4 runs at roughly 1–5 tokens per second (t/s), depending on AVX2/AVX-512 support. A 14B model runs at roughly 0.5–2 t/s.
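To put the t/s figures above in context, here is a rough sketch of the memory-bandwidth ceiling on decode speed. It relies on a common rule of thumb rather than on anything this page states: for a dense model, generating one token reads every weight once, so tokens per second cannot exceed bandwidth divided by weight size. The function name and the 4.5 bits per weight used for Q4 (the Q4_0 layout) are assumptions made for illustration.

```python
def decode_tps_upper_bound(params_billion: float,
                           bits_per_weight: float,
                           bandwidth_gb_s: float) -> float:
    """Bandwidth-bound ceiling on decode tokens/s for a dense model.

    Assumes every weight is read once per generated token and ignores
    compute limits, KV-cache traffic, and cache effects.
    """
    # billions of params * bits per weight / 8 bits per byte = size in GB
    weight_gb = params_billion * bits_per_weight / 8
    return bandwidth_gb_s / weight_gb


if __name__ == "__main__":
    # 80 GB/s is the bandwidth listed for this configuration.
    for label, size_b in [("7B", 7.0), ("14B", 14.0)]:
        ceiling = decode_tps_upper_bound(size_b, 4.5, 80.0)
        print(f"{label} at ~4.5 bits/weight: at most {ceiling:.1f} t/s")
```

The ceiling comes out well above the 1–5 t/s quoted here, which fits the fact that real CPU inference rarely sustains peak memory bandwidth and is often compute-limited as well.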

Vendor: Generic
Architecture: x86-64 / ARM
VRAM: 0 GB
Memory type: DDR4 / DDR5
Memory bandwidth: 80 GB/s
Compute backend: CPU
Tier: Integrated
Released: 2024
Models (native): 0 / 70
Models (offload): 43 / 70
Software: llama.cpp CPU backend. AVX2 or AVX-512 recommended. Expect 1–5 t/s for 7B models on a modern desktop CPU.
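Since AVX2 or AVX-512 is recommended, it can help to check what a given CPU reports. The following is a minimal sketch that assumes a Linux system exposing /proc/cpuinfo; the helper names are hypothetical. It only reads the flags the kernel advertises (avx2, and avx512f for the AVX-512 foundation set) and says nothing about which instruction sets a particular llama.cpp build was compiled to use.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' (x86) or
    'Features' (ARM) line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def simd_level(flags: set[str]) -> str:
    """Classify the widest AVX level advertised by the CPU."""
    if "avx512f" in flags:
        return "AVX-512"
    if "avx2" in flags:
        return "AVX2"
    return "neither"


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("Widest AVX level:", simd_level(parse_cpu_flags(cpuinfo.read_text())))
    else:
        print("/proc/cpuinfo not found; this sketch only covers Linux.")
```

On ARM machines the result is always "neither", because AVX is an x86 extension.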

Models this GPU runs natively in VRAM (0)

None.

Models that fit with CPU offload (43)

With 0 GB of VRAM, every layer of these models runs from system RAM, so expect much slower inference than on a GPU.

Too large for this GPU (27)

Frequently asked questions

How much VRAM does the CPU only (system RAM) have?
The CPU only (system RAM) configuration has 0 GB of VRAM. It relies on DDR4 / DDR5 system memory with about 80 GB/s of memory bandwidth.
What is the CPU only (system RAM) best for?
With 0 GB of VRAM, the CPU only (system RAM) configuration is best for running compact models (1B–8B) at low-bit quantization, which suits edge inference, prototyping, and lightweight tasks.
What LLMs can the CPU only (system RAM) run locally?
The CPU only (system RAM) configuration cannot run any of the 70 tracked models fully in VRAM at 8k context. 43 of them fit with CPU offload, running from system RAM.
Can the CPU only (system RAM) run Llama 3.3 70B Instruct?
No. The CPU only (system RAM) configuration has no VRAM, and Llama 3.3 70B Instruct is too large to run on it. You would need more memory or a lower-bit quantization.
Can the CPU only (system RAM) run Qwen 3.6 27B?
Yes. The CPU only (system RAM) configuration can run Qwen 3.6 27B with CPU offload at Q6_K quantization, but inference will be slower than native VRAM execution.
Can the CPU only (system RAM) run Llama 3.1 8B Instruct?
Yes. The CPU only (system RAM) configuration can run Llama 3.1 8B Instruct with CPU offload at BF16 precision, but inference will be slower than native VRAM execution.
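To make the fit answers above more concrete, here is a rough sketch of how much system RAM the weights alone would need at the formats mentioned. This is not the site's own calculation, which the page does not describe. The bits-per-weight values follow the llama.cpp GGUF block layouts (for example, Q6_K stores 256 weights in 210 bytes), the parameter counts are the nominal sizes taken from the model names, and the estimate leaves out the KV cache and runtime buffers.

```python
# Bits per weight for a few formats; the quantized values follow the
# llama.cpp GGUF block layouts (block bytes * 8 / weights per block).
BITS_PER_WEIGHT = {
    "BF16": 16.0,
    "Q8_0": 8.5,     # 34 bytes per 32 weights
    "Q6_K": 6.5625,  # 210 bytes per 256 weights
    "Q4_0": 4.5,     # 18 bytes per 32 weights
}


def weights_gb(params_billion: float, fmt: str) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes).

    Excludes the KV cache, activations, and runtime overhead, so the
    real memory requirement is higher.
    """
    return params_billion * BITS_PER_WEIGHT[fmt] / 8


if __name__ == "__main__":
    # Nominal parameter counts (assumption: taken from the model names).
    examples = [
        ("Llama 3.1 8B Instruct", 8.0, "BF16"),
        ("Qwen 3.6 27B", 27.0, "Q6_K"),
        ("Llama 3.3 70B Instruct", 70.0, "Q4_0"),
    ]
    for name, size_b, fmt in examples:
        print(f"{name} at {fmt}: ~{weights_gb(size_b, fmt):.1f} GB of weights")
```

Comparing these figures against the installed system RAM, with headroom for the operating system and the KV cache, gives a first approximation of whether a model can load at all.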