DeepSeek V4 Pro 1.6T
DeepSeek V4 Pro 1.6T needs roughly 897.1 GB of VRAM at Q4_K_M quantization (3585.2 GB at FP16). None of the GPUs we track can run it fully in VRAM at 8k context.
DeepSeek · 1600B params · 49B active (MoE) · 1024k context · MIT · Commercial use ok
VRAM at each quantization
Assumes 8k context. KV cache grows linearly with context length. Totals include roughly 12% overhead on top of weights (activations, buffers, and fragmentation), which is why Total exceeds Weights + KV cache.
| Quant | Weights | KV cache | Total |
|---|---|---|---|
| FP16 | 3200.0 GB | 1.02 GB | 3585.2 GB |
| Q8 | 1600.0 GB | 1.02 GB | 1793.2 GB |
| Q6_K | 1200.0 GB | 1.02 GB | 1345.2 GB |
| Q5_K_M | 1000.0 GB | 1.02 GB | 1121.2 GB |
| Q4_K_M | 800.0 GB | 1.02 GB | 897.1 GB |
| Q3_K_M | 640.0 GB | 1.02 GB | 718.0 GB |
| Q2_K | 480.0 GB | 1.02 GB | 538.8 GB |
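The table arithmetic can be reproduced with a short sketch. The nominal bits-per-weight values, the decimal-GB convention (10^9 bytes), the flat 1.02 GB KV figure, and the ~12% overhead factor are assumptions inferred from the table rows, not published model details:

```python
PARAMS = 1600e9      # 1.6T total parameters
KV_CACHE_GB = 1.02   # KV cache at 8k context (from the table above)
OVERHEAD = 0.12      # assumed runtime overhead as a fraction of weight size

# Nominal bits per weight for each quant, back-solved from the table.
BITS_PER_WEIGHT = {
    "FP16": 16, "Q8": 8, "Q6_K": 6, "Q5_K_M": 5,
    "Q4_K_M": 4, "Q3_K_M": 3.2, "Q2_K": 2.4,
}

def weights_gb(quant: str) -> float:
    """Weight storage in decimal GB at the given quantization."""
    return PARAMS * BITS_PER_WEIGHT[quant] / 8 / 1e9

def total_gb(quant: str) -> float:
    """Total VRAM estimate: weights plus overhead plus KV cache."""
    return weights_gb(quant) * (1 + OVERHEAD) + KV_CACHE_GB

for q in BITS_PER_WEIGHT:
    print(f"{q:7s} weights={weights_gb(q):7.1f} GB  total={total_gb(q):7.1f} GB")
```

The computed totals match the table to within rounding (e.g. Q4_K_M comes out at 897.0 GB versus the table's 897.1 GB).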
GPUs that run DeepSeek V4 Pro 1.6T natively (0)
No single GPU in our list fits this model at Q4 with 8k context. Try multi-GPU or CPU offload.
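For multi-GPU sizing, a lower bound on the card count is the model footprint divided by usable per-GPU memory. The 80 GB capacity and 2 GB per-card reserve below are hypothetical illustration values; real tensor-parallel deployments need extra headroom for activations and communication buffers:

```python
import math

def min_gpus(model_gb: float, per_gpu_gb: float, reserve_gb: float = 2.0) -> int:
    """Lower bound on GPUs needed to hold the model in aggregate VRAM,
    leaving a small per-card reserve (the 2 GB default is an assumption)."""
    usable = per_gpu_gb - reserve_gb
    return math.ceil(model_gb / usable)

# 897.1 GB at Q4_K_M across hypothetical 80 GB cards:
print(min_gpus(897.1, 80.0))  # → 12 (a lower bound, not a tested config)
```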
Notes
1.6T MoE with hybrid CSA/HCA attention and 1M token context. Requires 27% of V3.2's inference FLOPs at 1M context. The kvHeads × headDim figure used here approximates MLA latent KV storage rather than a full per-head KV cache.
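Since KV cache grows linearly with context, the 1.02 GB figure at 8k balloons at the model's full 1M window. The sketch below shows the shape of that formula; every architecture number in it (layer count, KV heads, head dim) is a hypothetical placeholder, not the model's real configuration, and an MLA-style design would shrink the per-token term by caching a compressed latent instead of full K/V tensors:

```python
def kv_cache_gb(context_len: int, n_layers: int = 60, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """KV cache size in decimal GB for a plain per-head cache.
    All default arguments are illustrative placeholders."""
    # 2x for separate K and V tensors per layer, times bytes per element
    per_token = n_layers * kv_heads * head_dim * 2 * bytes_per_elem
    return context_len * per_token / 1e9

print(kv_cache_gb(8192))         # at 8k context
print(kv_cache_gb(1024 * 1024))  # at 1M context: 128x the 8k figure
```

Whatever the real per-token constant is, the linear scaling means the 1M-context KV cache is 128× the 8k number, which is why long-context serving leans on compressed-KV schemes like MLA.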