SillyTavern
Self-hosted chat interface for AI roleplay and creative writing. Deep character creation, lorebooks, group chats, and first-class OpenRouter support.
Category: Chat Frontend, Roleplay
Use case: AI character roleplay and interactive fiction
Setup difficulty: Medium
Platforms: Web, macOS, Linux, Windows
License: Open source (free)
Works offline: Only with local models
SillyTavern is a self-hosted chat interface for AI roleplay and creative writing, with deep character creation, lorebooks, group chats, and first-class OpenRouter support. It is the community standard for AI roleplay, with over 10K GitHub stars.
SillyTavern runs entirely on your local hardware. It supports OpenRouter for unified access to 300+ models through a single API, and its Ollama integration lets you run models locally on your own GPU. SillyTavern is open source (https://github.com/SillyTavern/SillyTavern), so you can inspect the code and self-host. The app itself is lightweight; all GPU requirements come from the model backend you connect, as detailed in the hardware section below.
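To make the "one API, 300+ models" point concrete, here is a minimal sketch of a request to OpenRouter's OpenAI-compatible chat completions endpoint, the same endpoint SillyTavern uses when OpenRouter is selected as a provider. The model slug and prompt are just examples; you supply your own API key.

```bash
# Minimal sketch: one key and one request shape cover every model
# OpenRouter hosts. The model slug below is an example.
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/mistral-nemo",
    "messages": [{"role": "user", "content": "Stay in character as a grumpy innkeeper greeting a traveler."}]
  }'
```

Switching models is a one-line change to the `model` field, which is why a single OpenRouter key is enough to drive everything SillyTavern can do.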
Can it run on my hardware?
Minimum
SillyTavern itself has no GPU requirement; it runs on a Raspberry Pi 4 with 2 GB of RAM. All GPU requirements come from the model backend you connect. For local roleplay, 12 GB of VRAM is sufficient for good 12B-13B-class models.
Recommended
12 GB VRAM (e.g., RTX 3060) runs Mistral Nemo 12B or Gemma 3 12B at Q4; 24 GB VRAM (e.g., RTX 3090) runs Qwen3 32B at Q4. Very large MoE models such as Qwen3-235B-A22B require a multi-GPU rig or heavy CPU offload even at aggressive quantizations like IQ4. Pair SillyTavern with KoboldCPP for the best local roleplay experience.
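If you take the KoboldCPP route, serving a quantized GGUF looks roughly like this. This is a sketch, not the official quick-start: the GGUF filename is a placeholder, and flag names can vary between KoboldCPP versions.

```bash
# Sketch: serve a Q4 GGUF with KoboldCPP, offloading all layers to
# the GPU, then point SillyTavern's Text Completion API at
# http://localhost:5001 (KoboldCPP's default port).
python koboldcpp.py \
  --model mistral-nemo-12b-instruct-q4_k_m.gguf \
  --contextsize 8192 \
  --usecublas \
  --gpulayers 99
```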
Approximate VRAM needed for recommended local models at Q4 with 8K context:
| Model | Params | Q4 VRAM | Min GPU |
|---|---|---|---|
| Qwen3 32B | 32.8B | ~22.2 GB | 24 GB |
| Mistral Nemo 12B Instruct | 12.2B | ~9.2 GB | 12 GB |
| Gemma 3 12B Instruct | 12.2B | ~8.9 GB | 12 GB |
| Qwen3 235B-A22B (MoE) | 235B | ~149.9 GB | Multi-GPU (160 GB+) |
| Command-R 35B | 35B | ~34.1 GB | 48 GB+ |
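These figures line up with a rough back-of-envelope rule (our approximation, not a number from the SillyTavern docs): Q4 quantization stores roughly 4.5-5 bits per parameter, so weights take about 0.6 GB per billion parameters, plus a few GB for the KV cache and runtime overhead at 8K context. For Qwen3 32B that gives 32.8 × 0.6 ≈ 19.7 GB of weights plus ~2.5 GB of cache and overhead, close to the ~22.2 GB in the table. Command-R is the outlier; the original 35B architecture lacks grouped-query attention, which likely makes its 8K KV cache several times larger than the others'.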
App compatibility
| Feature | Supported |
|---|---|
| Local models | Yes |
| OpenRouter | Yes |
| OpenAI-compatible API | Yes |
| Ollama | Yes |
| LM Studio | Yes |
| Anthropic API | Yes |
| Google API | Yes |
| Mistral API | Yes |
| Docker | Yes |
| Works offline | Yes (with local models) |
| Needs GPU | No |
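Backends such as Ollama and LM Studio expose local HTTP APIs that SillyTavern connects to. Before configuring a backend in SillyTavern, it is worth a quick sanity check that it responds; the sketch below assumes Ollama is running on its default port 11434 and the model has already been pulled.

```bash
# Sketch: confirm a local Ollama server answers before pointing
# SillyTavern at it. Assumes `ollama pull qwen2.5:7b` has been run.
curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5:7b",
  "messages": [{"role": "user", "content": "Say hello in character."}],
  "stream": false
}'
```

LM Studio serves an OpenAI-compatible API (on port 1234 by default), so the OpenRouter-style request shown earlier works against it with the base URL swapped.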
Local vs cloud: which should you use?
Use local models if
- You want privacy — data never leaves your machine
- You already have a GPU with sufficient VRAM
- You want zero per-token API costs
- You need offline access
Use cloud/API if
- Your GPU has insufficient VRAM for the models you need
- You want access to frontier model quality
- You need maximum coding/reasoning performance
- You don't want to manage local model downloads and updates
- You want to switch among 300+ models with a single OpenRouter API key
Setup overview
Setting up SillyTavern is moderately complex. It runs on the web, macOS, Linux, and Windows. Full documentation is available at https://docs.sillytavern.app.
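On Linux or macOS, a typical install follows the flow sketched below; this mirrors the documented steps at the time of writing, so check https://docs.sillytavern.app for current instructions (Windows users run Start.bat instead).

```bash
# Sketch: clone the stable release branch and launch.
# Requires Node.js; the launcher installs npm dependencies on
# first run, then serves the UI at http://localhost:8000.
git clone https://github.com/SillyTavern/SillyTavern -b release
cd SillyTavern
./start.sh
```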
Limitations
SillyTavern is not a good fit for:
- General productivity chat (use Open WebUI or LibreChat instead)
- RAG or document Q&A (no built-in RAG)
- Beginners (advanced features have a steep learning curve)
Frequently asked questions
- What is SillyTavern?
- SillyTavern is a self-hosted chat interface for AI roleplay and creative writing, with deep character creation, lorebooks, group chats, and first-class OpenRouter support. It is the community standard for AI roleplay, with over 10K GitHub stars.
- Does SillyTavern need a GPU?
- SillyTavern itself does not require a GPU; it runs on a Raspberry Pi 4 with 2 GB of RAM. All GPU requirements come from the model backend you connect. For local roleplay, 12 GB of VRAM is sufficient for good 12B-13B-class models.
- Can I run SillyTavern on CPU only?
- Yes. SillyTavern itself always runs on the CPU. Running your models on CPU also works, but inference is significantly slower (5-10x) than on a GPU; CPU-only inference is best suited to models under 7B parameters with at least 16 GB of system RAM.
- Can SillyTavern use OpenRouter?
- Yes. SillyTavern supports OpenRouter for accessing 300+ models through a single API. Configure OpenRouter as a provider in SillyTavern's settings with your API key.
- Can SillyTavern use local models via Ollama?
- Yes. SillyTavern works with Ollama for running models locally. Install Ollama, pull your model (e.g., `ollama pull qwen2.5:7b`), and connect SillyTavern to the local Ollama server. GPU requirements depend on the model you choose, not SillyTavern itself.
- What models work best with SillyTavern?
- Models that work well with SillyTavern include Qwen3 32B, Mistral Nemo 12B Instruct, Gemma 3 12B Instruct, Qwen3 235B-A22B (MoE), and Command-R 35B. The best model depends on your GPU's VRAM and your use case.
- Is SillyTavern free and open source?
- Yes. SillyTavern is open source and completely free. You can find the source code on GitHub at https://github.com/SillyTavern/SillyTavern.