DeepSeek V3 671B vs Qwen3 235B-A22B (MoE)
Side-by-side VRAM requirements, benchmark scores, and GPU compatibility for local AI inference.
Quick verdict
Qwen3 235B-A22B (MoE) is far more hardware-efficient: it needs 133.4 GB at Q4_K_M vs 376.3 GB for DeepSeek V3 671B, and fits natively on 14 of the 67 tracked GPUs versus 2.
VRAM at each quantization (8k context)
| Quant | DeepSeek V3 671B | Qwen3 235B-A22B (MoE) | Diff |
|---|---|---|---|
| FP16 | 1503.6 GB | 528.2 GB | +185% |
| Q8 | 752.1 GB | 265.0 GB | +184% |
| Q6_K | 564.2 GB | 199.2 GB | +183% |
| Q5_K_M | 470.3 GB | 166.3 GB | +183% |
| Q4_K_M | 376.3 GB | 133.4 GB | +182% |
| Q3_K_M | 301.2 GB | 107.0 GB | +181% |
| Q2_K | 226.0 GB | 80.7 GB | +180% |
Diff is DeepSeek V3 671B's requirement relative to Qwen3 235B-A22B (MoE); lower VRAM means the model fits on more GPUs.
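To sanity-check these numbers or extrapolate to other models, here is a minimal sketch. The bits-per-weight values are standard GGUF approximations, and the ~1.12 overhead multiplier (covering KV cache at 8k context plus runtime buffers) is an assumption inferred from the table above, not an official formula:

```python
# Rough VRAM estimate: weight bytes plus a flat overhead factor.
# The ~1.12 multiplier is inferred from the table above (assumption).
BITS_PER_WEIGHT = {
    "FP16": 16.0, "Q8": 8.0, "Q6_K": 6.0, "Q5_K_M": 5.0,
    "Q4_K_M": 4.0, "Q3_K_M": 3.2, "Q2_K": 2.4,
}

def estimate_vram_gb(params_b: float, quant: str, overhead: float = 1.12) -> float:
    """Approximate VRAM in GB for params_b billion parameters."""
    weight_gb = params_b * BITS_PER_WEIGHT[quant] / 8  # 1B params at 8 bits ~ 1 GB
    return weight_gb * overhead

for name, params in [("DeepSeek V3 671B", 671), ("Qwen3 235B-A22B", 235)]:
    print(f"{name}: ~{estimate_vram_gb(params, 'Q4_K_M'):.0f} GB at Q4_K_M")
# DeepSeek V3 671B: ~376 GB, Qwen3 235B-A22B: ~132 GB
# -- close to the table's 376.3 GB and 133.4 GB
```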
Model specifications
| Spec | DeepSeek V3 671B | Qwen3 235B-A22B (MoE) |
|---|---|---|
| Org | DeepSeek | Alibaba |
| Parameters | 671B | 235B |
| Architecture | MoE (37B active) | MoE (22B active) |
| Context | 125k tokens | 128k tokens |
| Modalities | text | text |
| License | MIT | Apache 2.0 |
| Commercial | Yes | Yes |
| Released | 2024-12-27 | 2025-04-29 |
| GPUs (native fit) | 2 of 67 | 14 of 67 |
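Both are MoE models, so every expert must be resident in memory even though only a fraction of the weights fires per token; that is why the 671B model needs 376 GB at Q4_K_M despite activating only 37B parameters. A quick back-of-the-envelope comparison from the specs above:

```python
# MoE: memory scales with total parameters, per-token compute with active ones.
models = {
    "DeepSeek V3 671B": {"total_b": 671, "active_b": 37},
    "Qwen3 235B-A22B":  {"total_b": 235, "active_b": 22},
}
for name, m in models.items():
    frac = m["active_b"] / m["total_b"]
    print(f"{name}: {frac:.1%} of weights active per token")
# DeepSeek V3 671B: 5.5%   Qwen3 235B-A22B: 9.4%
```

So the memory cost tracks the total-parameter ratio (~2.9x), while per-token compute tracks the active-parameter ratio (~1.7x, 37B vs 22B).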
Benchmark scores
(Score table not reproduced here; some scores were marked not yet available.)
GPUs that run only DeepSeek V3 671B (0)
Every GPU that runs DeepSeek V3 671B also runs Qwen3 235B-A22B (MoE).
GPUs that run only Qwen3 235B-A22B (MoE) (12)
- NVIDIA DGX Spark (128 GB)
- AMD Instinct MI300X (192 GB)
- AMD Strix Halo (128 GB)
- AMD Strix Halo (96 GB)
- Apple M4 Ultra (192 GB)
- Apple M4 Max (128 GB)
- Apple M4 Max (96 GB)
- Apple M3 Max (128 GB)
- Apple M3 Max (96 GB)
- Apple M2 Ultra (192 GB)
- +2 more
GPUs that run both natively (2)
- Apple M4 Ultra (384 GB)
- Apple M2 Ultra (384 GB)
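The native-fit counts above follow from a simple containment check: a GPU (or unified-memory machine) runs a model natively when the model's Q4_K_M footprint fits entirely in its memory. A sketch with a small hypothetical catalog (the four entries and the fit rule are illustrative assumptions, not the site's full 67-GPU list):

```python
# Hypothetical catalog; memory in GB. The fit rule (entire Q4_K_M footprint
# must fit in GPU/unified memory) is an assumption about how the counts
# above are derived.
GPUS = {
    "Apple M4 Ultra (384GB)": 384,
    "Apple M2 Ultra (384GB)": 384,
    "AMD Instinct MI300X": 192,
    "Apple M4 Max (128GB)": 128,
}
REQUIREMENTS_GB = {"DeepSeek V3 671B": 376.3, "Qwen3 235B-A22B (MoE)": 133.4}

for model, need in REQUIREMENTS_GB.items():
    fits = [gpu for gpu, mem in GPUS.items() if mem >= need]
    print(f"{model} ({need} GB) fits on: {fits}")
# Only the 384 GB machines clear DeepSeek V3's 376.3 GB bar; all but the
# 128 GB M4 Max also clear Qwen3's 133.4 GB.
```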
Which should you use?
Choose DeepSeek V3 671B if:
- You want maximum capability and have 377 GB or more of GPU or unified memory
Choose Qwen3 235B-A22B (MoE) if:
- You have limited VRAM: at Q4_K_M it needs 133.4 GB vs 376.3 GB
- Long context matters: it supports 128k tokens vs 125k
- You need chain-of-thought reasoning
Frequently asked questions
- Which is better, DeepSeek V3 671B or Qwen3 235B-A22B (MoE)?
- DeepSeek V3 671B has 671B parameters vs 235B for Qwen3 235B-A22B (MoE), so DeepSeek V3 671B is the larger model. Qwen3 235B-A22B (MoE) is more hardware-efficient, needing 133.4 GB at Q4_K_M vs 376.3 GB. Qwen3 235B-A22B (MoE) runs on more GPUs natively (14 vs 2).
- How much VRAM does DeepSeek V3 671B need vs Qwen3 235B-A22B (MoE)?
- At Q4_K_M quantization with 8k context, DeepSeek V3 671B needs approximately 376.3 GB of VRAM, while Qwen3 235B-A22B (MoE) needs 133.4 GB. At FP16, DeepSeek V3 671B requires 1503.6 GB vs 528.2 GB for Qwen3 235B-A22B (MoE).
- Can you run DeepSeek V3 671B on the same GPUs as Qwen3 235B-A22B (MoE)?
- Yes, 2 GPUs can run both natively in VRAM: the Apple M4 Ultra (384 GB) and Apple M2 Ultra (384 GB). Every GPU that fits DeepSeek V3 671B also fits Qwen3 235B-A22B (MoE), while 12 GPUs can run Qwen3 235B-A22B (MoE) but not DeepSeek V3 671B.
- What is the difference between DeepSeek V3 671B and Qwen3 235B-A22B (MoE)?
- DeepSeek V3 671B has 671B parameters (37B active, MoE) with a 125k context window. Qwen3 235B-A22B (MoE) has 235B parameters (22B active, MoE) with a 128k context window. Licensing differs: DeepSeek V3 671B is MIT while Qwen3 235B-A22B (MoE) is Apache 2.0.
- Which model fits in 24 GB of VRAM, DeepSeek V3 671B or Qwen3 235B-A22B (MoE)?
- Neither fits in 24 GB at Q4_K_M: DeepSeek V3 671B needs 376.3 GB and Qwen3 235B-A22B (MoE) needs 133.4 GB. Both are far beyond any single consumer GPU and call for multi-GPU rigs or large unified-memory machines.