Frontier LLMs
24 models · local AI VRAM requirements & GPU compatibility
Frontier models push the boundary of open-weight AI capability. They typically require a multi-GPU server or an extreme-tier workstation, but when they fit, they rival the best proprietary APIs. Check the compatible GPU list carefully: most of these models need more than 80 GB of VRAM, spread across multiple cards.
- DeepSeek V4 Pro 1.6T · DeepSeek · 1600B params (49B active) · 897.1 GB · Q4_K_M
- Kimi K2.6 · Moonshot AI · 1000B params (32B active) · 563.0 GB · Q4_K_M
- GLM-5.1 754B · Z.ai · 754B params (44B active) · 434.0 GB · Q4_K_M
- GLM-5 744B · Z.ai · 744B params (40B active) · 428.4 GB · Q4_K_M
- DeepSeek V3 671B · DeepSeek · 671B params (37B active) · 376.3 GB · Q4_K_M
- DeepSeek R1 671B · DeepSeek · 671B params (37B active) · 376.3 GB · Q4_K_M
- MiniMax M1 456B · MiniMax · 456B params (46B active) · 258.4 GB · Q4_K_M
- Llama 3.1 405B Instruct · Meta · 405B params · 231.5 GB · Q4_K_M
- Llama 4 Maverick 400B · Meta · 400B params (17B active) · 228.5 GB · Q4_K_M
- GLM-4.7 358B · Z.ai · 358B params (32B active) · 203.9 GB · Q4_K_M
- GLM-4.5 355B · Z.ai · 355B params (32B active) · 202.3 GB · Q4_K_M
- GLM-4.6 355B · Z.ai · 355B params (32B active) · 202.3 GB · Q4_K_M
- DeepSeek V4 Flash 284B · DeepSeek · 284B params (13B active) · 159.8 GB · Q4_K_M
- Qwen3 235B-A22B (MoE) · Alibaba · 235B params (22B active) · 133.4 GB · Q4_K_M
- MiniMax M2.5 229B · MiniMax · 229B params (10B active) · 130.6 GB · Q4_K_M
- MiniMax M2.7 229B · MiniMax · 229B params (10B active) · 130.6 GB · Q4_K_M
- Qwen 3.5 122B-A10B (MoE) · Alibaba · 122B params (10B active) · 70.7 GB · Q4_K_M · fits 80 GB
- Nemotron 3 Super 120B · NVIDIA · 120B params (12B active) · 68.0 GB · Q4_K_M · fits 80 GB
- GPT-OSS 120B · OpenAI · 117B params (5B active) · 66.2 GB · Q4_K_M · fits 80 GB
- Llama 4 Scout 109B · Meta · 109B params (17B active) · 64.0 GB · Q4_K_M · fits 80 GB
- GLM-4.5 Air 106B · Z.ai · 106B params (12B active) · 61.1 GB · Q4_K_M · fits 80 GB
- GLM-4.6V 106B · Z.ai · 106B params (12B active) · 61.1 GB · Q4_K_M · fits 80 GB
- Nemotron 3 Nano 30B · NVIDIA · 32B params (3B active) · 18.4 GB · Q4_K_M · fits 24 GB
- GPT-OSS 20B · OpenAI · 21B params (4B active) · 12.2 GB · Q4_K_M · fits 16 GB
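To turn the VRAM figures in the list above into a rough card count, divide the listed requirement by the per-card VRAM and round up. The sketch below does that for a handful of entries. The names `LISTED_VRAM_GB` and `min_cards` are hypothetical and used only for illustration. The result is a lower bound at best: it assumes memory pools perfectly across identical cards, and the page does not say whether the listed figures already include context (KV cache) or runtime overhead, so real setups may need more headroom.

```python
import math

# VRAM figures in GB at Q4_K_M, copied from the list above (a subset).
LISTED_VRAM_GB = {
    "DeepSeek V4 Pro 1.6T": 897.1,
    "DeepSeek R1 671B": 376.3,
    "Llama 3.1 405B Instruct": 231.5,
    "Qwen3 235B-A22B (MoE)": 133.4,
    "GPT-OSS 120B": 66.2,
    "GPT-OSS 20B": 12.2,
}


def min_cards(required_gb: float, card_gb: float = 80.0) -> int:
    """Lower bound on the number of identical cards whose combined VRAM
    covers required_gb. Assumes ideal splitting with no per-card overhead."""
    if required_gb <= 0 or card_gb <= 0:
        raise ValueError("required_gb and card_gb must be positive")
    return math.ceil(required_gb / card_gb)


for name, gb in LISTED_VRAM_GB.items():
    print(f"{name}: {gb} GB -> at least {min_cards(gb)} x 80 GB card(s)")
```

Passing a different `card_gb` (for example `min_cards(66.2, card_gb=24.0)`) gives the same estimate for smaller consumer cards.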
Want to check your specific GPU? Use the homepage calculator to see which of these models fit your hardware with estimated tokens per second.