
SillyTavern

Self-hosted chat interface for AI roleplay and creative writing. Deep character creation, lorebooks, group chats, and first-class OpenRouter support.

App type

Chat Frontend, Roleplay

Local models

Yes

OpenRouter

Yes

Ollama

Yes

GPU required

Only for local models

Best for

AI character roleplay and interactive fiction

Setup difficulty

Medium

Platforms

Web, macOS, Linux, Windows

Pricing

Open source — free

SillyTavern is a self-hosted chat interface for AI roleplay and creative writing, with deep character creation, lorebooks, group chats, and first-class OpenRouter support. It is the community standard for AI roleplay, with 10K+ GitHub stars.

SillyTavern runs entirely on your local hardware. It supports OpenRouter for unified access to 300+ models through a single API, and its Ollama integration lets you run models locally on your own GPU. SillyTavern is open source (https://github.com/SillyTavern/SillyTavern), so you can inspect the code and self-host. The app itself has no GPU requirement; any VRAM demands come from the model backend you connect, as detailed in the minimum and recommended specs below.

Can it run on my hardware?

Minimum

SillyTavern itself has no GPU requirement; it runs on a Raspberry Pi 4 with 2 GB of RAM. All GPU requirements come from the model backend you connect. For local roleplay, 12 GB of VRAM is sufficient for good 13B-class models.

Recommended

12 GB VRAM (RTX 3060) for Mistral Nemo 12B or Gemma 3 12B at Q4. 24 GB VRAM (RTX 3090) for Qwen3-32B at Q4 or Qwen3-235B-A22B MoE at IQ4. Pair with KoboldCPP for the best local roleplay experience.

Approximate VRAM needed for recommended local models at Q4 with 8K context:

Model                     | Params | Q4 VRAM   | Min GPU
Qwen3 32B                 | 32.8B  | ~22.2 GB  | 24 GB
Mistral Nemo 12B Instruct | 12.2B  | ~9.2 GB   | 12 GB
Gemma 3 12B Instruct      | 12.2B  | ~8.9 GB   | 12 GB
Qwen3 235B-A22B (MoE)     | 235B   | ~149.9 GB | 48 GB+
Command-R 35B             | 35B    | ~34.1 GB  | 48 GB+

Check your GPU against these models in the calculator →
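The table's figures can be approximated with a back-of-the-envelope formula: Q4_K_M quantization averages roughly 4.85 bits per weight, plus a KV-cache allowance that grows with context length. A minimal sketch (the 4.85 bits-per-weight and 1.5 GB-per-8K-context constants are rough assumptions, not exact values):

```python
def q4_vram_gb(params_b: float, ctx: int = 8192,
               bits_per_weight: float = 4.85,
               kv_gb_per_8k: float = 1.5) -> float:
    """Rough VRAM estimate (GB) for a Q4-quantized model.

    params_b        -- parameter count in billions
    bits_per_weight -- ~4.85 for Q4_K_M (assumed average)
    kv_gb_per_8k    -- assumed KV-cache cost per 8K tokens of context
    """
    weights_gb = params_b * bits_per_weight / 8   # quantized weights
    kv_gb = kv_gb_per_8k * ctx / 8192             # KV cache scales with context
    return round(weights_gb + kv_gb, 1)

print(q4_vram_gb(12.2))   # Gemma 3 12B at 8K context  -> ~8.9 GB
print(q4_vram_gb(32.8))   # Qwen3 32B at 8K context    -> ~21.4 GB
```

The estimates land within a gigabyte or so of the table; actual usage also depends on the runtime's buffers and the exact quant variant.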

App compatibility

Feature               | Supported
Local models          | Yes
OpenRouter            | Yes
OpenAI-compatible API | Yes
Ollama                | Yes
LM Studio             | Yes
Anthropic API         | Yes
Google API            | Yes
Mistral API           | Yes
Docker                | No
Works offline         | Yes
Needs GPU             | No

Recommended models

Best local models

Best cloud/API models

Local vs cloud: which should you use?

Use local models if

  • You want privacy — data never leaves your machine
  • You already have a GPU with sufficient VRAM
  • You want zero per-token API costs
  • You need offline access

Use cloud/API if

  • Your GPU has insufficient VRAM for the models you need
  • You want access to frontier model quality
  • You need maximum coding/reasoning performance
  • You don't want to manage local model downloads and updates
  • OpenRouter lets you switch between 300+ models with one API key

Setup overview

Setting up SillyTavern is moderately complex. It runs on the web, macOS, Linux, and Windows. Full documentation is available at https://docs.sillytavern.app.

Limitations

SillyTavern is not the best fit for:

  • General productivity chat (use Open WebUI or LibreChat instead)
  • RAG or document Q&A (no built-in RAG)
  • Beginners — steep learning curve for advanced features

Related

Recommended GPUs

Compatible models

Related apps

Frequently asked questions

What is SillyTavern?
SillyTavern is a self-hosted chat interface for AI roleplay and creative writing, with deep character creation, lorebooks, group chats, and first-class OpenRouter support. It is the community standard for AI roleplay, with 10K+ GitHub stars.
Does SillyTavern need a GPU?
SillyTavern itself does not require a GPU; it runs on a Raspberry Pi 4 with 2 GB of RAM. All GPU requirements come from the model backend you connect. For local roleplay, 12 GB of VRAM is sufficient for good 13B-class models.
Can I run SillyTavern on CPU only?
Yes — SillyTavern supports CPU-only operation, but performance will be significantly slower (5-10x) compared to GPU inference. CPU-only works best for models under 7B parameters with at least 16 GB of system RAM.
Can SillyTavern use OpenRouter?
Yes. SillyTavern supports OpenRouter for accessing 300+ models through a single API. Configure OpenRouter as a provider in SillyTavern's settings with your API key.
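Under the hood, OpenRouter exposes an OpenAI-compatible chat-completions endpoint, which is what SillyTavern talks to. A minimal sketch of the request it builds (the model slug and key placeholder are illustrative; substitute your own):

```python
import json
import urllib.request

def build_openrouter_request(api_key: str, model: str, user_message: str):
    """Build (but do not send) an OpenAI-style chat request for OpenRouter."""
    payload = {
        "model": model,  # e.g. "mistralai/mistral-nemo" -- illustrative slug
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_openrouter_request("sk-or-...", "mistralai/mistral-nemo", "Hello!")
print(req.full_url)
```

Sending it with `urllib.request.urlopen(req)` returns a standard chat-completion JSON body, so the same shape works against any OpenAI-compatible backend by swapping the URL.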
Can SillyTavern use local models via Ollama?
Yes. SillyTavern works with Ollama for running models locally. Install Ollama, pull your model (e.g., `ollama pull qwen2.5:7b`), and connect SillyTavern to the local Ollama server. GPU requirements depend on the model you choose, not SillyTavern itself.
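Before pointing SillyTavern at Ollama, it can help to confirm the server is up and see which models are pulled. A small probe against Ollama's `/api/tags` endpoint (default port 11434; the function returns None if nothing is listening):

```python
import json
import urllib.error
import urllib.request

def list_ollama_models(base_url: str = "http://127.0.0.1:11434",
                       timeout: float = 2.0):
    """Return the names of locally pulled Ollama models, or None if the
    server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags",
                                    timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError):
        return None  # Ollama not running (or wrong host/port)
    return [m["name"] for m in data.get("models", [])]

models = list_ollama_models()
print(models if models is not None else "Ollama server not reachable")
```

If this prints a list containing your pulled model (e.g. `qwen2.5:7b`), SillyTavern should connect at the same address.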
What models work best with SillyTavern?
Models that work well with SillyTavern include: Qwen3 32B, Mistral Nemo 12B Instruct, Gemma 3 12B Instruct, Qwen3 235B-A22B (MoE), Command-R 35B, Gemma 4 26B (MoE). The best model depends on your GPU's VRAM and your use case.
Is SillyTavern free and open source?
Yes. SillyTavern is open source and completely free. You can find the source code on GitHub at https://github.com/SillyTavern/SillyTavern.