
AnythingLLM

All-in-one local AI workspace with best-in-class RAG. Upload documents, chat with your files, build no-code AI agents, and connect 30+ LLM providers.

App type: Productivity, Self Hosted
Local models: Yes
OpenRouter: Yes
Ollama: Yes
GPU required: Only for local models
Best for: Chat with your documents (PDFs, codebases, documentation)
Setup difficulty: Easy
Platforms: macOS, Windows, Linux, Docker
Pricing: Open source (free)

AnythingLLM is an all-in-one local AI workspace with best-in-class RAG: upload documents, chat with your files, build no-code AI agents, and connect 30+ LLM providers. With 53K+ GitHub stars, it is one of the most complete local AI workspace tools available.

AnythingLLM can run entirely on your local hardware. It supports OpenRouter for unified access to 300+ models through a single API, and its Ollama integration lets you run models locally on your own GPU. AnythingLLM is open source (https://github.com/Mintplex-Labs/anything-llm), so you can inspect the code and self-host it.
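AnythingLLM's provider connections are configured through its UI, but the reason one workspace can swap between Ollama and OpenRouter so freely is that both expose an OpenAI-style chat API. Here is a minimal Python sketch of that shared interface, assuming the `openai` package is installed, Ollama is serving on its default port 11434 with a model already pulled, and an `OPENROUTER_API_KEY` environment variable is set (the model names are only examples):

```python
import os
from openai import OpenAI

# Local: Ollama exposes an OpenAI-compatible endpoint at /v1 (default port 11434).
# The API key argument is required by the client but ignored by Ollama.
local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Cloud: OpenRouter speaks the same API shape, with a real key.
cloud = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def ask(client: OpenAI, model: str, question: str) -> str:
    """Send a single chat turn and return the reply text."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

print(ask(local, "qwen2.5:7b", "Summarize RAG in one sentence."))
print(ask(cloud, "anthropic/claude-3.5-sonnet", "Summarize RAG in one sentence."))
```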

Can it run on my hardware?

Minimum

The desktop app runs on standard hardware with no GPU required. For local RAG with 7B models, 16 GB of system RAM is enough. The embedding model (nomic-embed-text) is lightweight.

Recommended

16 GB of system RAM plus 8 GB of VRAM for local RAG with 7B models; 24 GB of VRAM for 32B models with RAG. For production RAG with the best quality, use OpenRouter with Claude or GPT models; no GPU is needed.

Approximate VRAM needed for recommended local models at Q4 with 8K context:

Model                     | Params | Q4 VRAM  | Min GPU
Qwen3 32B                 | 32.8B  | ~22.2 GB | 24 GB
Llama 3.1 8B Instruct     | 8B     | ~6.3 GB  | 8 GB
Gemma 3 12B Instruct      | 12.2B  | ~8.9 GB  | 12 GB
Qwen3 14B                 | 14.8B  | ~10.8 GB | 12 GB
Mistral Nemo 12B Instruct | 12.2B  | ~9.2 GB  | 12 GB

Check your GPU against these models in the calculator →
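The figures in the table roughly follow a simple rule of thumb: at Q4, weights take about 0.57 GB per billion parameters, plus a few GB of KV cache and runtime buffers at 8K context. A back-of-envelope sketch in Python (the constants are approximations, not the calculator's formula, and larger models need somewhat more overhead than the flat allowance used here):

```python
def estimate_q4_vram_gb(params_billion: float, overhead_gb: float = 2.0) -> float:
    """Rough Q4 VRAM estimate: ~0.57 GB per billion parameters for weights
    (about 4.5 bits per weight), plus a flat allowance for KV cache and
    runtime buffers at ~8K context. Back-of-envelope only."""
    weights_gb = params_billion * 0.57
    return weights_gb + overhead_gb

for name, params in [("Llama 3.1 8B", 8.0), ("Qwen3 14B", 14.8), ("Qwen3 32B", 32.8)]:
    print(f"{name}: ~{estimate_q4_vram_gb(params):.1f} GB")
```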

App compatibility

Feature               | Supported
Local models          | Yes
OpenRouter            | Yes
OpenAI-compatible API | Yes
Ollama                | Yes
LM Studio             | Yes
Anthropic API         | Yes
Google API            | Yes
Mistral API           | No
Docker                | Yes
Works offline         | Yes
Needs GPU             | No


Local vs cloud: which should you use?

Use local models if

  • You want privacy — data never leaves your machine
  • You already have a GPU with sufficient VRAM
  • You want zero per-token API costs
  • You need offline access

Use cloud/API if

  • Your GPU has insufficient VRAM for the models you need
  • You want access to frontier model quality
  • You need maximum coding/reasoning performance
  • You don't want to manage local model downloads and updates
  • OpenRouter lets you switch between 300+ models with one API key

Setup overview

Setting up AnythingLLM is straightforward. It runs on macOS, Windows, Linux, and Docker. Full documentation is available at https://docs.anythingllm.com.
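If you plan to use local models, it helps to confirm that Ollama is reachable and see which models are already pulled before connecting AnythingLLM to it. A small Python check against Ollama's /api/tags endpoint (assumes the default port 11434 and the `requests` package):

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port; adjust if you changed it

try:
    # /api/tags lists the models already pulled into the local Ollama library.
    resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    if models:
        print("Ollama is running. Pulled models:", ", ".join(models))
    else:
        print("Ollama is running, but no models are pulled yet (try `ollama pull qwen2.5:7b`).")
except requests.ConnectionError:
    print("Could not reach Ollama. Is `ollama serve` or the desktop app running?")
```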

Limitations

  • Not the right tool for coding agent workflows (use Cline or Aider instead)
  • Overkill for simple chat (use Open WebUI if you don't need RAG)
  • The built-in LLM engine is basic; pair AnythingLLM with Ollama or OpenRouter for better model quality


Frequently asked questions

What is AnythingLLM?
AnythingLLM is an all-in-one local AI workspace with best-in-class RAG: upload documents, chat with your files, build no-code AI agents, and connect 30+ LLM providers. With 53K+ GitHub stars, it is one of the most complete local AI workspace tools available.
Does AnythingLLM need a GPU?
AnythingLLM itself does not require a GPU, but the local models you connect to it may. The desktop app runs on standard hardware; for local RAG with 7B models, 16 GB of system RAM is enough, and the embedding model (nomic-embed-text) is lightweight.
Can I run AnythingLLM on CPU only?
Yes — AnythingLLM supports CPU-only operation, but performance will be significantly slower (5-10x) compared to GPU inference. CPU-only works best for models under 7B parameters with at least 16 GB of system RAM.
Can AnythingLLM use OpenRouter?
Yes. AnythingLLM supports OpenRouter for accessing 300+ models through a single API. Configure OpenRouter as a provider in AnythingLLM's settings with your API key.
Can AnythingLLM use local models via Ollama?
Yes. AnythingLLM works with Ollama for running models locally. Install Ollama, pull your model (e.g., `ollama pull qwen2.5:7b`), and connect AnythingLLM to the local Ollama server. GPU requirements depend on the model you choose, not AnythingLLM itself.
Is AnythingLLM free and open source?
Yes. AnythingLLM is open source and completely free. You can find the source code on GitHub at https://github.com/Mintplex-Labs/anything-llm.