General-Purpose LLMs
26 models · local AI VRAM requirements & GPU compatibility
General-purpose models strike a balance across chat, instruction-following, coding, and reasoning tasks. They're the most versatile choice for local AI and are tested across a broad set of benchmarks. If you're not sure which model to run, start here.
- Llama 4 Scout 109B · Meta · 109B params (17B active) · 71.7 GB · Q4_K_M · fits 80 GB
- Qwen 2.5 72B Instruct · Alibaba · 72B params · 48.4 GB · Q4_K_M · fits 80 GB
- Llama 3.3 70B Instruct · Meta · 70B params · 47.1 GB · Q4_K_M · fits 48 GB
- Llama 3.1 70B Instruct · Meta · 70B params · 47.1 GB · Q4_K_M · fits 48 GB
- Qwen 3.6 35B · Alibaba · 35B params · 24.5 GB · Q4_K_M · fits 48 GB
- Yi 1.5 34B Chat · 01.AI · 34.4B params · 23.9 GB · Q4_K_M · fits 24 GB
- Qwen3 32B · Alibaba · 32.8B params · 22.2 GB · Q4_K_M · fits 24 GB
- Qwen 2.5 32B Instruct · Alibaba · 32.5B params · 22.9 GB · Q4_K_M · fits 24 GB
- Gemma 4 31B · Google · 31B params · 23.2 GB · Q4_K_M · fits 24 GB
- Gemma 2 27B Instruct · Google · 27.2B params · 20.6 GB · Q4_K_M · fits 24 GB
- Gemma 3 27B Instruct · Google · 27B params · 18.8 GB · Q4_K_M · fits 24 GB
- Qwen 3.6 27B · Alibaba · 27B params · 18.8 GB · Q4_K_M · fits 24 GB
- Gemma 4 26B (MoE) · Google · 26B params (3.8B active) · 18.0 GB · Q4_K_M · fits 24 GB
- Mistral Small 3.1 24B Instruct · Mistral AI · 24B params · 16.6 GB · Q4_K_M · fits 24 GB
- Mistral Small 22B · Mistral AI · 22.2B params · 16.1 GB · Q4_K_M · fits 24 GB
- Qwen3 14B · Alibaba · 14.8B params · 10.8 GB · Q4_K_M · fits 12 GB
- Qwen 2.5 14B Instruct · Alibaba · 14.7B params · 11.1 GB · Q4_K_M · fits 12 GB
- Phi-4 14B Instruct · Microsoft · 14B params · 10.3 GB · Q4_K_M · fits 12 GB
- Mistral Nemo 12B Instruct · Mistral AI · 12.2B params · 9.2 GB · Q4_K_M · fits 12 GB
- Gemma 3 12B Instruct · Google · 12.2B params · 8.9 GB · Q4_K_M · fits 12 GB
- Gemma 2 9B Instruct · Google · 9.2B params · 9.0 GB · Q4_K_M · fits 12 GB
- Llama 3.1 8B Instruct · Meta · 8B params · 6.2 GB · Q4_K_M · fits 8 GB
- Qwen3 8B · Alibaba · 8B params · 6.4 GB · Q4_K_M · fits 8 GB
- Qwen 2.5 7B Instruct · Alibaba · 7.6B params · 5.3 GB · Q4_K_M · fits 8 GB
- Mistral 7B Instruct v0.3 · Mistral AI · 7.25B params · 5.8 GB · Q4_K_M · fits 8 GB
- Gemma 3 4B Instruct · Google · 4B params · 3.1 GB · Q4_K_M · fits 8 GB
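The "fits" labels in the list are consistent with a simple rule: take the smallest common GPU memory size that is at least as large as the model's Q4_K_M footprint. The Python sketch below illustrates that rule using a few entries copied from the list. The tier values (8, 12, 24, 48, 80 GB) are inferred from the labels above, and the function names `smallest_tier` and `models_that_fit` are hypothetical. This is an assumption about how the labels could be derived, not the site's actual calculation, which may also account for context length or runtime overhead.

```python
from typing import Optional

# GPU memory tiers in GB, inferred from the "fits" labels in the list
# (an assumption, not an official set of tiers).
VRAM_TIERS_GB = [8, 12, 24, 48, 80]

# A few (model name -> Q4_K_M VRAM in GB) pairs copied from the list.
MODELS = {
    "Llama 4 Scout 109B": 71.7,
    "Llama 3.3 70B Instruct": 47.1,
    "Qwen 3.6 35B": 24.5,
    "Qwen3 32B": 22.2,
    "Phi-4 14B Instruct": 10.3,
    "Llama 3.1 8B Instruct": 6.2,
    "Gemma 3 4B Instruct": 3.1,
}


def smallest_tier(required_gb: float) -> Optional[int]:
    """Return the smallest tier that holds required_gb, or None if none does."""
    for tier in VRAM_TIERS_GB:  # tiers are sorted ascending
        if required_gb <= tier:
            return tier
    return None


def models_that_fit(gpu_vram_gb: float) -> list[str]:
    """Return the listed models whose footprint is within gpu_vram_gb."""
    return [name for name, need in MODELS.items() if need <= gpu_vram_gb]


if __name__ == "__main__":
    for name, need in MODELS.items():
        print(f"{name}: {need} GB -> fits {smallest_tier(need)} GB")
    print("On a 24 GB card:", models_that_fit(24))
```

Note that Qwen 3.6 35B at 24.5 GB lands in the 48 GB tier because it narrowly exceeds 24 GB, which matches its label in the list.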
Want to check your specific GPU? Use the homepage calculator to see which of these models fit your hardware with estimated tokens per second.