
Qwen 3.6 35B

Qwen 3.6 35B needs roughly 22.0 GB of VRAM at Q4_K_M quantization (80.8 GB at FP16). 49 of the GPUs we track can run it fully in VRAM at 8k context.

Alibaba · 35B params · 256k context · Apache 2.0 · Commercial use ok

VRAM at each quantization

Assumes 8k context. KV cache grows linearly with context length. Totals include roughly 12% runtime overhead on top of weights plus KV cache.

Quant   | Weights | KV cache | Total
FP16    | 70.0 GB | 2.15 GB  | 80.8 GB
Q8      | 35.0 GB | 2.15 GB  | 41.6 GB
Q6_K    | 26.3 GB | 2.15 GB  | 31.8 GB
Q5_K_M  | 21.9 GB | 2.15 GB  | 26.9 GB
Q4_K_M  | 17.5 GB | 2.15 GB  | 22.0 GB
Q3_K_M  | 14.0 GB | 2.15 GB  | 18.1 GB
Q2_K    | 10.5 GB | 2.15 GB  | 14.2 GB
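The table above can be reproduced with a simple estimate: weights take `params × bits / 8` bytes, the KV cache scales linearly from its 8k-context size, and the totals carry roughly a 12% runtime overhead (inferred from the table's arithmetic). The effective bits-per-weight values below are assumptions chosen to match this table, not official quantization specs:

```python
# Sketch: estimate VRAM needs for Qwen 3.6 35B at various quantizations.
# Bits-per-weight values and the 12% overhead factor are assumptions
# back-derived from the table above.
PARAMS_B = 35.0          # model size in billions of parameters
KV_CACHE_8K_GB = 2.15    # KV cache at 8k context (from the table)
OVERHEAD = 1.12          # assumed runtime overhead multiplier

BITS_PER_WEIGHT = {
    "FP16": 16, "Q8": 8, "Q6_K": 6, "Q5_K_M": 5,
    "Q4_K_M": 4, "Q3_K_M": 3.2, "Q2_K": 2.4,
}

def weights_gb(params_b: float, bits: float) -> float:
    """Memory for the model weights alone."""
    return params_b * bits / 8

def total_gb(params_b: float, bits: float, context_k: int = 8) -> float:
    """Weights + KV cache (linear in context) + runtime overhead."""
    kv = KV_CACHE_8K_GB * context_k / 8
    return (weights_gb(params_b, bits) + kv) * OVERHEAD

for quant, bits in BITS_PER_WEIGHT.items():
    print(f"{quant:7s} {total_gb(PARAMS_B, bits):5.1f} GB")
```

For example, `total_gb(35.0, 4)` lands near the table's 22.0 GB for Q4_K_M, and raising `context_k` to 256 shows why long-context use quickly dominates: the KV cache alone grows to about 68.8 GB.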

Benchmarks

Benchmarks for this model are not yet available on the Open LLM Leaderboard v2. This is common for recently released models. Check back soon.

GPUs that run Qwen 3.6 35B natively (49)

Plus 6 GPUs that run it with CPU offload (slower).

Notes

Larger dense sibling of Qwen3.6-27B; same reasoning and agentic capabilities.

Hugging Face ↗ · Ollama ↗ · Released 2026-04-01