
Reasoning LLMs

21 models · local AI VRAM requirements & GPU compatibility

Reasoning models produce long chains of thought before answering, which means higher quality on math, science, and multi-step tasks, but also longer outputs and higher KV-cache VRAM use at long contexts. If you're running these locally, prioritize GPUs with more VRAM and high memory bandwidth to sustain token generation through lengthy reasoning traces.
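To see why long reasoning traces eat VRAM, it helps to put numbers on the KV cache. Here is a minimal back-of-the-envelope sketch, assuming a standard transformer with grouped-query attention; the function name and the example model shape (Llama-3-8B-like) are illustrative, not tied to any specific model on this list:

```python
def kv_cache_gib(layers, kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV-cache size: two tensors (K and V) per layer,
    each holding context_len x kv_heads x head_dim elements."""
    total_bytes = 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem
    return total_bytes / 1024**3

# Example: a Llama-3-8B-like shape (32 layers, 8 KV heads via GQA,
# head_dim 128) at a 32k-token reasoning trace, FP16 cache:
print(f"{kv_cache_gib(32, 8, 128, 32_768):.1f} GiB")  # -> 4.0 GiB
```

That 4 GiB sits on top of the model weights, and it scales linearly with context length, so a reasoning trace that runs twice as long costs twice the cache.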

Want to check your specific GPU? Use the homepage calculator to see which of these models fit your hardware, along with estimated tokens per second.
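If you want a rough intuition for where those speed estimates come from (this is not the site's calculator, just the common first-order heuristic): decoding is usually memory-bandwidth bound, since every generated token must stream the weights and KV cache from VRAM. The function name and the 0.7 efficiency factor below are assumptions for illustration:

```python
def est_decode_tps(weight_gib, kv_gib, bandwidth_gibs, efficiency=0.7):
    """Rough bandwidth-bound decode estimate: each token reads the
    weights plus the KV cache once; `efficiency` is an assumed discount
    for real-world overheads (kernel launches, attention compute)."""
    return efficiency * bandwidth_gibs / (weight_gib + kv_gib)

# Example: an 8B model quantized to ~4.5 GiB plus the 4 GiB KV cache
# from above, on a GPU with ~1000 GiB/s bandwidth (RTX 4090 class):
print(f"~{est_decode_tps(4.5, 4.0, 1000):.0f} tok/s")  # -> ~82 tok/s
```

The takeaway matches the advice above: for reasoning models, bandwidth sets your generation speed and VRAM sets how long a trace and context you can hold at all.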