DeepSeek V3 671B vs Qwen3 235B-A22B (MoE)
Side-by-side VRAM requirements, benchmark scores, and GPU compatibility for local AI inference.
Quick verdict
Qwen3 235B-A22B (MoE) is more hardware-efficient: it needs 149.9 GB at Q4_K_M vs 423.7 GB for DeepSeek V3 671B, and it fits natively on 20 of the 107 listed GPUs vs 4 for DeepSeek V3 671B.
VRAM at each quantization (8k context)
| Quant | DeepSeek V3 671B | Qwen3 235B-A22B (MoE) | Diff |
|---|---|---|---|
| FP32 | 3006.7 GB | 1054.6 GB | +185% |
| BF16 | 1503.6 GB | 528.2 GB | +185% |
| FP16 | 1503.6 GB | 528.2 GB | +185% |
| Q8_0 | 752.1 GB | 265.0 GB | +184% |
| Q6_K | 616.8 GB | 217.6 GB | +183% |
| Q5_K_M | 484.6 GB | 171.3 GB | +183% |
| Q4_K_M | 423.7 GB | 149.9 GB | +183% |
| Q3_K_M | 323.7 GB | 114.9 GB | +182% |
| Q2_K | 247.8 GB | 88.4 GB | +180% |
| NVFP4 | 376.3 GB | 133.4 GB | +182% |
Diff is DeepSeek V3 671B relative to Qwen3 235B-A22B (MoE). Green = lower VRAM (fits more GPUs).
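The Diff column can be reproduced from the two VRAM figures in each row. The sketch below assumes Diff is the percentage by which the DeepSeek V3 671B figure exceeds the Qwen3 235B-A22B (MoE) figure, rounded to the nearest whole percent; the function name `diff_percent` is illustrative and not part of any tool.

```python
def diff_percent(a_gb: float, b_gb: float) -> int:
    """Percent by which a_gb exceeds b_gb, rounded to the nearest integer."""
    return round((a_gb / b_gb - 1) * 100)


# (DeepSeek V3 671B GB, Qwen3 235B-A22B GB), copied from the table above.
rows = {
    "FP16": (1503.6, 528.2),
    "Q4_K_M": (423.7, 149.9),
    "Q2_K": (247.8, 88.4),
}

for quant, (deepseek_gb, qwen_gb) in rows.items():
    print(f"{quant}: {diff_percent(deepseek_gb, qwen_gb):+d}%")
# FP16: +185%
# Q4_K_M: +183%
# Q2_K: +180%
```

Read this way, +183% at Q4_K_M means DeepSeek V3 671B needs roughly 2.8 times the VRAM of Qwen3 235B-A22B (MoE), not 1.8 times.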
Model specifications
| Spec | DeepSeek V3 671B | Qwen3 235B-A22B (MoE) |
|---|---|---|
| Org | DeepSeek | Alibaba |
| Parameters | 671B | 235B |
| Architecture | MoE (37B active) | MoE (22B active) |
| Context | 125k tokens | 128k tokens |
| Modalities | text | text |
| License | MIT | Apache 2.0 |
| Commercial | Yes | Yes |
| Released | 2024-12-27 | 2025-04-29 |
| GPUs (native) | 4 / 107 | 20 / 107 |
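As a rough cross-check on the VRAM table, the parameter counts above give a weights-only lower bound for the unquantized formats. This sketch assumes 4 bytes per parameter for FP32, 2 bytes for FP16 and BF16, and decimal gigabytes; it ignores KV cache and runtime overhead. The page does not say how its own figures are computed, so the gap between this bound and the table is not explained here. `weights_only_gb` is a hypothetical helper name.

```python
# Assumed storage cost per parameter for unquantized formats.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "BF16": 2}


def weights_only_gb(params_billion: float, precision: str) -> float:
    """Weights-only size in decimal GB (1e9 params * bytes / 1e9 bytes per GB)."""
    return params_billion * BYTES_PER_PARAM[precision]


# Page figures at FP16 with 8k context, for comparison.
page_fp16_gb = {"DeepSeek V3 671B": (671, 1503.6), "Qwen3 235B-A22B (MoE)": (235, 528.2)}

for name, (params_b, page_gb) in page_fp16_gb.items():
    bound = weights_only_gb(params_b, "FP16")
    print(f"{name}: weights only >= {bound} GB, page lists {page_gb} GB")
```

Note that all parameters of an MoE model must be resident in memory, so the total count (671B or 235B) drives the footprint, not the active count (37B or 22B).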
Benchmark scores
| Benchmark | DeepSeek V3 671B | Qwen3 235B-A22B (MoE) |
|---|---|---|
| MMLU-Pro | 75.9 | 84.4 |
| GPQA Diamond | 59.1 | — |
| IFEval | 86.1 | — |
| MATH | 90.2 | — |
| LiveCodeBench | 40.5 | — |
Green = higher score (better). — = not yet available.
GPUs that run only DeepSeek V3 671B (0)
Every GPU that runs DeepSeek V3 671B also runs Qwen3 235B-A22B (MoE).
GPUs that run only Qwen3 235B-A22B (MoE) (16)
- NVIDIA RTX Pro 6000: 96 GB
- NVIDIA DGX Spark (128GB): 128 GB
- AMD Instinct MI300X: 192 GB
- AMD Strix Halo (128GB): 128 GB
- AMD Strix Halo (96GB): 96 GB
- Apple M5 Max (128GB): 128 GB
- Apple M4 Ultra (192GB): 192 GB
- Apple M4 Max (128GB): 128 GB
- Apple M4 Max (96GB): 96 GB
- Apple M3 Ultra (96GB): 96 GB
- +6 more
GPUs that run both natively (4)
- Apple M4 Ultra (384GB): 384 GB
- Apple M3 Ultra (512GB): 512 GB
- Apple M3 Ultra (256GB): 256 GB
- Apple M2 Ultra (384GB): 384 GB
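The three GPU groupings are consistent with a simple rule: a device "runs" a model if its memory is at least the model's smallest listed footprint (Q2_K at 8k context). That rule is an assumption inferred from the numbers, not something the page states, and real usable memory is typically lower than the nominal capacity. A minimal sketch, with `models_that_fit` as an illustrative name:

```python
# Smallest listed footprint per model (Q2_K, 8k context), from the VRAM table.
MIN_VRAM_GB = {
    "DeepSeek V3 671B": 247.8,
    "Qwen3 235B-A22B (MoE)": 88.4,
}


def models_that_fit(gpu_vram_gb: float) -> list[str]:
    """Models whose smallest listed quantization fits in the given memory."""
    return [name for name, need in MIN_VRAM_GB.items() if gpu_vram_gb >= need]


# A few devices and capacities taken from the lists above.
gpus = {
    "NVIDIA RTX Pro 6000": 96,
    "AMD Instinct MI300X": 192,
    "Apple M3 Ultra (256GB)": 256,
    "Apple M3 Ultra (512GB)": 512,
}

for gpu, vram in gpus.items():
    print(f"{gpu}: {models_that_fit(vram) or 'neither'}")
```

Under this rule the 96 GB to 192 GB devices land in the Qwen3-only group and the 256 GB and larger devices land in the "both" group, matching the lists.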
Which should you use?
Choose DeepSeek V3 671B if:
- You want maximum capability and have a 424 GB+ GPU
Choose Qwen3 235B-A22B (MoE) if:
- You have limited VRAM: it's a smaller model needing 149.9 GB vs 423.7 GB
- Long context matters: it supports 128k tokens vs 125k
- Benchmark quality matters: it scores 84.4 vs 75.9 on MMLU-Pro
- You need chain-of-thought reasoning
Frequently asked questions
- Which is better, DeepSeek V3 671B or Qwen3 235B-A22B (MoE)?
- DeepSeek V3 671B has 671B parameters vs 235B for Qwen3 235B-A22B (MoE), so DeepSeek V3 671B is the larger model. Qwen3 235B-A22B (MoE) is more hardware-efficient, needing 149.9 GB at Q4_K_M vs 423.7 GB. Qwen3 235B-A22B (MoE) runs on more GPUs natively (20 vs 4). On MMLU-Pro, Qwen3 235B-A22B (MoE) scores higher (84.4 vs 75.9).
- How much VRAM does DeepSeek V3 671B need vs Qwen3 235B-A22B (MoE)?
- At Q4_K_M quantization with 8k context, DeepSeek V3 671B needs approximately 423.7 GB of VRAM, while Qwen3 235B-A22B (MoE) needs 149.9 GB. At FP16, DeepSeek V3 671B requires 1503.6 GB vs 528.2 GB for Qwen3 235B-A22B (MoE).
- Can you run DeepSeek V3 671B on the same GPUs as Qwen3 235B-A22B (MoE)?
- Yes, 4 GPUs can run both natively in VRAM, including Apple M4 Ultra (384GB), Apple M3 Ultra (512GB), and Apple M3 Ultra (256GB). However, no GPU can run DeepSeek V3 671B without also fitting Qwen3 235B-A22B (MoE), and 16 GPUs can run Qwen3 235B-A22B (MoE) but not DeepSeek V3 671B.
- What is the difference between DeepSeek V3 671B and Qwen3 235B-A22B (MoE)?
- DeepSeek V3 671B has 671B parameters (37B active, MoE) with a 125k context window. Qwen3 235B-A22B (MoE) has 235B parameters (22B active, MoE) with a 128k context window. Licensing differs: DeepSeek V3 671B is MIT while Qwen3 235B-A22B (MoE) is Apache 2.0.
- Which model fits in 24 GB of VRAM, DeepSeek V3 671B or Qwen3 235B-A22B (MoE)?
- Neither fits in 24 GB at Q4_K_M: DeepSeek V3 671B needs 423.7 GB and Qwen3 235B-A22B (MoE) needs 149.9 GB. Even at Q2_K, the smallest quantization listed, they need 247.8 GB and 88.4 GB respectively, so neither fits on a 24 GB or 48 GB GPU.
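To answer the "what fits in N GB" question for any memory budget, the VRAM table can be searched directly. The sketch below picks the listed quantization with the largest footprint that still fits, on the assumption that a larger footprint generally means higher precision; the figures are copied from the table (8k context), and `largest_quant_that_fits` is a hypothetical helper.

```python
from typing import Optional

# VRAM in GB at 8k context, copied from the quantization table.
VRAM_GB = {
    "DeepSeek V3 671B": {
        "FP32": 3006.7, "BF16": 1503.6, "FP16": 1503.6, "Q8_0": 752.1,
        "Q6_K": 616.8, "Q5_K_M": 484.6, "Q4_K_M": 423.7, "Q3_K_M": 323.7,
        "Q2_K": 247.8, "NVFP4": 376.3,
    },
    "Qwen3 235B-A22B (MoE)": {
        "FP32": 1054.6, "BF16": 528.2, "FP16": 528.2, "Q8_0": 265.0,
        "Q6_K": 217.6, "Q5_K_M": 171.3, "Q4_K_M": 149.9, "Q3_K_M": 114.9,
        "Q2_K": 88.4, "NVFP4": 133.4,
    },
}


def largest_quant_that_fits(model: str, budget_gb: float) -> Optional[str]:
    """Listed quantization with the biggest footprint that is <= budget_gb."""
    candidates = [(gb, quant) for quant, gb in VRAM_GB[model].items() if gb <= budget_gb]
    return max(candidates)[1] if candidates else None


for budget in (24, 96, 160, 512):
    picks = {model: largest_quant_that_fits(model, budget) for model in VRAM_GB}
    print(f"{budget} GB: {picks}")
```

For a 24 GB budget the function returns `None` for both models, which matches the answer above.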