General-Purpose LLMs
26 models · local AI VRAM requirements & GPU compatibility
General-purpose models strike a balance across chat, instruction-following, coding, and reasoning tasks. They're the most versatile choice for local AI and are tested across a broad set of benchmarks. If you're not sure which model to run, start here.
- Llama 4 Scout 109B · Meta · 109B params (17B active) · 64.0 GB · Q4_K_M · fits 80 GB
- Qwen 2.5 72B Instruct · Alibaba · 72B params · 43.3 GB · Q4_K_M · fits 48 GB
- Llama 3.3 70B Instruct · Meta · 70B params · 42.2 GB · Q4_K_M · fits 48 GB
- Llama 3.1 70B Instruct · Meta · 70B params · 42.2 GB · Q4_K_M · fits 48 GB
- Qwen 3.6 35B · Alibaba · 35B params · 22.0 GB · Q4_K_M · fits 24 GB
- Yi 1.5 34B Chat · 01.AI · 34.4B params · 21.5 GB · Q4_K_M · fits 24 GB
- Qwen3 32B · Alibaba · 32.8B params · 19.9 GB · Q4_K_M · fits 24 GB
- Qwen 2.5 32B Instruct · Alibaba · 32.5B params · 20.6 GB · Q4_K_M · fits 24 GB
- Gemma 4 31B · Google · 31B params · 21.0 GB · Q4_K_M · fits 24 GB
- Gemma 2 27B Instruct · Google · 27.2B params · 18.7 GB · Q4_K_M · fits 24 GB
- Gemma 3 27B Instruct · Google · 27B params · 16.8 GB · Q4_K_M · fits 24 GB
- Qwen 3.6 27B · Alibaba · 27B params · 16.9 GB · Q4_K_M · fits 24 GB
- Gemma 4 26B (MoE) · Google · 26B params (3.8B active) · 16.1 GB · Q4_K_M · fits 24 GB
- Mistral Small 3.1 24B Instruct · Mistral AI · 24B params · 14.9 GB · Q4_K_M · fits 16 GB
- Mistral Small 22B · Mistral AI · 22.2B params · 14.5 GB · Q4_K_M · fits 16 GB
- Qwen3 14B · Alibaba · 14.8B params · 9.8 GB · Q4_K_M · fits 12 GB
- Qwen 2.5 14B Instruct · Alibaba · 14.7B params · 10.0 GB · Q4_K_M · fits 12 GB
- Phi-4 14B Instruct · Microsoft · 14B params · 9.3 GB · Q4_K_M · fits 12 GB
- Mistral Nemo 12B Instruct · Mistral AI · 12.2B params · 8.3 GB · Q4_K_M · fits 12 GB
- Gemma 3 12B Instruct · Google · 12.2B params · 8.0 GB · Q4_K_M · fits 12 GB
- Gemma 2 9B Instruct · Google · 9.2B params · 8.3 GB · Q4_K_M · fits 12 GB
- Llama 3.1 8B Instruct · Meta · 8B params · 5.7 GB · Q4_K_M · fits 8 GB
- Qwen3 8B · Alibaba · 8B params · 5.8 GB · Q4_K_M · fits 8 GB
- Qwen 2.5 7B Instruct · Alibaba · 7.6B params · 4.8 GB · Q4_K_M · fits 8 GB
- Mistral 7B Instruct v0.3 · Mistral AI · 7.25B params · 5.3 GB · Q4_K_M · fits 8 GB
- Gemma 3 4B Instruct · Google · 4B params · 2.8 GB · Q4_K_M · fits 8 GB
Want to check your specific GPU? Use the homepage calculator to see which of these models fit your hardware with estimated tokens per second.
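For a quick offline check, the fit test can be approximated by comparing a model's listed Q4_K_M size against your GPU's VRAM. The sketch below is illustrative only: the `MODELS` table copies a handful of sizes from the list above, while the `models_that_fit` function and its 1 GB `headroom_gb` default are assumptions made for this example, not the calculator's actual logic. It does not estimate tokens per second.

```python
# Listed Q4_K_M sizes in GB, copied from a few entries in the list above.
MODELS = {
    "Llama 4 Scout 109B": 64.0,
    "Llama 3.3 70B Instruct": 42.2,
    "Qwen3 32B": 19.9,
    "Mistral Small 3.1 24B Instruct": 14.9,
    "Gemma 3 12B Instruct": 8.0,
    "Llama 3.1 8B Instruct": 5.7,
    "Gemma 3 4B Instruct": 2.8,
}


def models_that_fit(vram_gb, headroom_gb=1.0):
    """Return names of models whose listed size plus headroom fits in vram_gb.

    headroom_gb is a hypothetical allowance for context and runtime overhead;
    the real overhead depends on context length and the inference runtime.
    Results are ordered from largest to smallest model.
    """
    fitting = [
        (size, name)
        for name, size in MODELS.items()
        if size + headroom_gb <= vram_gb
    ]
    return [name for size, name in sorted(fitting, reverse=True)]


if __name__ == "__main__":
    for name in models_that_fit(24):
        print(name)
```

Calling `models_that_fit(24)` keeps everything up to Qwen3 32B and drops the 70B and 109B entries. Raise `headroom_gb` if you plan to run long contexts, since the listed sizes alone do not say how much extra memory a given workload needs.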