
Gemma 3 27B Instruct

Gemma 3 27B Instruct needs roughly 18.8 GB VRAM at Q4_K_M quantization (62.2 GB at FP16). 85 GPUs we track can run it fully in VRAM at 8k context.

85 GPUs run this natively · 19 with CPU offload

Google · 27B params · 128k context · Gemma · Commercial use ok

Gemma 3 27B Instruct is a 27B-parameter dense model developed by Google. Released in March 2025, it is a multimodal model with native vision via a SigLIP 400M encoder. It supports a 128K context with 5:1 local/global attention interleaving.

To run Gemma 3 27B Instruct locally: Q4_K_M needs ~32-33 GB with a long context (18.8 GB at 8k context). A 24 GB GPU can run it, but 32 GB or more is recommended for vision tasks. KV-cache optimization gives a 60% memory reduction.

MMLU-Pro 67.5%, LiveCodeBench 29.7%, MMMU (vision) 64.9%. A Chatbot Arena Elo of 1338 ranks it as the best open non-thinking model.

VRAM at each quantization

Assumes 8k context. KV cache grows linearly with context length.

Quant                   Weights     KV cache    Total
FP32                    108.0 GB    1.54 GB     122.7 GB
BF16                    54.0 GB     1.54 GB     62.2 GB
FP16                    54.0 GB     1.54 GB     62.2 GB
Q8_0                    27.0 GB     1.54 GB     32.0 GB
Q6_K                    22.1 GB     1.54 GB     26.5 GB
Q5_K_M                  17.4 GB     1.54 GB     21.2 GB
Q4_K_M (recommended)    15.2 GB     1.54 GB     18.8 GB
Q3_K_M                  11.6 GB     1.54 GB     14.7 GB
Q2_K                    8.9 GB      1.54 GB     11.7 GB
NVFP4 (CUDA)            13.5 GB     1.54 GB     16.9 GB

KV cache shown at 8k context (FP16). NVFP4 requires a CUDA GPU. Enable TurboQuant in the calculator to see reduced KV cache estimates.
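The totals in the table are larger than weights plus KV cache. Across the rows the ratio is close to 1.12, which suggests an overhead allowance of roughly 12%. That factor is inferred from the numbers here, not a formula the page states. The sketch below reproduces the table under that assumption and applies the linear KV-cache growth described above to other context lengths. The names `kv_cache_gb`, `estimate_total_gb`, and `OVERHEAD` are hypothetical, and "8k" is assumed to mean 8192 tokens.

```python
# Rough VRAM estimate rebuilt from the table above.
# Assumptions (not stated by the source):
#   - total ~= (weights + KV cache) * 1.12, a factor inferred from the table
#   - "8k context" means 8192 tokens
#   - KV cache grows linearly with context length, as the page says

BASE_CONTEXT = 8192    # tokens at which the table quotes the KV cache (assumed)
KV_AT_BASE_GB = 1.54   # FP16 KV cache at 8k context, from the table
OVERHEAD = 1.12        # inferred overhead factor; an assumption

# Weight sizes in GB, copied from the table.
WEIGHTS_GB = {
    "FP32": 108.0, "BF16": 54.0, "FP16": 54.0, "Q8_0": 27.0,
    "Q6_K": 22.1, "Q5_K_M": 17.4, "Q4_K_M": 15.2, "Q3_K_M": 11.6,
    "Q2_K": 8.9, "NVFP4": 13.5,
}


def kv_cache_gb(context_tokens: int) -> float:
    """Scale the 8k KV-cache figure linearly to another context length."""
    return KV_AT_BASE_GB * context_tokens / BASE_CONTEXT


def estimate_total_gb(quant: str, context_tokens: int = BASE_CONTEXT) -> float:
    """Estimate total VRAM in GB for a quantization and context length."""
    return (WEIGHTS_GB[quant] + kv_cache_gb(context_tokens)) * OVERHEAD


if __name__ == "__main__":
    for context in (8192, 32768, 131072):
        total = estimate_total_gb("Q4_K_M", context)
        print(f"Q4_K_M at {context} tokens: about {total:.1f} GB")
```

At 8k context the estimates land within about 0.1 GB of every total in the table. Figures for longer contexts are only as good as the linear-growth and overhead assumptions, so treat them as a starting point and confirm with the calculator.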

Benchmarks

GPUs that run Gemma 3 27B Instruct natively (85)

Plus 19 GPUs that run it with CPU offload (slower)

Notes

5:1 local/global attention interleaving.

Hugging Face ↗ · Ollama ↗ · Released 2025-03-12


Frequently asked questions

What are the VRAM requirements for Gemma 3 27B Instruct?
Gemma 3 27B Instruct requires approximately 18.8 GB of VRAM at Q4_K_M quantization, 32.0 GB at Q8_0, and 62.2 GB at FP16. These numbers assume an 8k context window; the KV cache portion grows linearly with context length.
How many parameters does Gemma 3 27B Instruct have?
Gemma 3 27B Instruct has 27 billion parameters.
How capable is Gemma 3 27B Instruct?
With an MMLU-Pro score of 67.5, Gemma 3 27B Instruct delivers solid general-purpose performance suitable for most everyday tasks and professional use.
Can Gemma 3 27B Instruct run on a 16 GB GPU?
Not at Q4_K_M, where Gemma 3 27B Instruct needs 18.8 GB of VRAM, more than 16 GB. Q3_K_M (14.7 GB) and Q2_K (11.7 GB) fit in 16 GB at 8k context, at lower quality. For Q4_K_M you will need a 24 GB GPU like the RTX 4090 or RTX 3090.
Can Gemma 3 27B Instruct run on a 24 GB GPU?
Yes. Gemma 3 27B Instruct fits in a 24 GB GPU at Q4_K_M, requiring 18.8 GB VRAM. GPUs with 24 GB include the RTX 4090, RTX 3090, and RTX 3090 Ti.
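What is the highest-quality quantization for Gemma 3 27B Instruct that fits in 24 GB of VRAM?
At 8k context, Q5_K_M needs 21.2 GB, making it the highest-bit quantization listed that fits in 24 GB of VRAM. Q4_K_M (18.8 GB) and NVFP4 (16.9 GB) also fit and leave more headroom for longer contexts.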
What GPU do I need to run Gemma 3 27B Instruct locally?
A 24 GB GPU is the minimum for Q4_K_M, which needs 18.8 GB of VRAM. Good options: RTX 4090 (24 GB), RTX 3090 (24 GB).
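
The fit questions above all reduce to comparing a GPU's VRAM against the 8k-context totals from the quantization table. As a minimal sketch, the lookup below lists which quantizations fit a given budget, largest footprint first. The function name `quants_that_fit` is hypothetical. The check ignores memory already used by the display or other processes, so real headroom will be smaller.

```python
# Which quantizations fit in a given amount of VRAM?
# Totals are the 8k-context figures from the table on this page.
# Assumption: the full VRAM amount is available to the model.

TOTAL_GB_AT_8K = {
    "FP32": 122.7, "BF16": 62.2, "FP16": 62.2, "Q8_0": 32.0,
    "Q6_K": 26.5, "Q5_K_M": 21.2, "Q4_K_M": 18.8, "Q3_K_M": 14.7,
    "Q2_K": 11.7, "NVFP4": 16.9,
}


def quants_that_fit(vram_gb: float) -> list[tuple[str, float]]:
    """Return (quant, total GB) pairs that fit, sorted largest first."""
    fitting = [(q, gb) for q, gb in TOTAL_GB_AT_8K.items() if gb <= vram_gb]
    return sorted(fitting, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for vram in (16, 24, 32):
        names = ", ".join(q for q, _ in quants_that_fit(vram)) or "none"
        print(f"{vram} GB: {names}")
```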