
Continue

Open-source AI code assistant for VS Code and JetBrains. Tab autocomplete, chat, and agent mode with separate models per role — like a local Copilot.

App type: Coding Agent, Developer Tool
Local models: Yes
OpenRouter: Yes
Ollama: Yes
GPU required: No (only the local models you connect need one)
Best for: Copilot-like autocomplete with local models for privacy
Setup difficulty: Medium
Platforms: VS Code, JetBrains
Pricing: Open source — free

Continue (31K+ GitHub stars) is an open-source AI code assistant that integrates deeply into VS Code and JetBrains. It provides tab autocomplete, chat, and an agent mode, with a separate model configurable for each role — like a local Copilot.

Continue works with both local models and cloud APIs. It supports OpenRouter for unified access to 300+ models from a single API, and its Ollama integration lets you run models locally on your own GPU. Continue is open source (https://github.com/continuedev/continue), so you can inspect the code and self-host. Hardware requirements depend on the models you connect, not on Continue itself (see below).
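
As a concrete illustration, here is a minimal local-only setup. This is a sketch that assumes Continue's newer config.yaml format (older releases use config.json) with illustrative Ollama model tags; swap in whatever models you have pulled.

```yaml
# ~/.continue/config.yaml: sketch of a local-only setup (assumed format;
# older Continue releases use ~/.continue/config.json instead)
name: local-assistant
version: 0.0.1
schema: v1

models:
  # Chat/edit model served by the local Ollama daemon
  - name: Qwen3 14B (Ollama)
    provider: ollama
    model: qwen3:14b
    roles:
      - chat
      - edit

  # Small, fast model dedicated to tab autocomplete
  - name: Qwen2.5 Coder 1.5B (Ollama)
    provider: ollama
    model: qwen2.5-coder:1.5b
    roles:
      - autocomplete
```

Because each model declares its roles, every role (chat, edit, autocomplete, agent) can be served by a different model, which is the "separate models per role" design mentioned above.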

Can it run on my hardware?

Minimum

8 GB VRAM for 7B autocomplete/chat models; 16 GB for 14B agent mode. Agent mode with local models requires explicitly declaring the tool_use capability in the model's config entry.

Recommended

16 GB VRAM for Qwen3-14B at Q4 as the agent model; 24 GB+ for the Qwen3.5-35B-A3B MoE. Consider using a small local model for autocomplete and a cloud model via OpenRouter for agent tasks, as sketched below.
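
That mixed setup might look like the following sketch (same assumed config.yaml format as above; the OpenRouter model slug and placeholder API key are illustrative). Note the explicit tool_use capability on the local model, which the minimum requirements above call out for local agent mode.

```yaml
models:
  # Frontier agent/chat model routed through OpenRouter (one key, 300+ models)
  - name: Claude Sonnet (OpenRouter)
    provider: openrouter
    model: anthropic/claude-3.5-sonnet   # example slug; any OpenRouter model works
    apiKey: YOUR_OPENROUTER_KEY          # placeholder
    roles:
      - chat
      - edit

  # Local fallback for agent use; tool_use must be declared explicitly
  - name: Qwen3 14B (Ollama)
    provider: ollama
    model: qwen3:14b
    capabilities:
      - tool_use
    roles:
      - chat

  # Cheap local model for latency-sensitive tab autocomplete
  - name: Qwen2.5 Coder 1.5B (Ollama)
    provider: ollama
    model: qwen2.5-coder:1.5b
    roles:
      - autocomplete
```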

Approximate VRAM needed for recommended local models at Q4 with 8K context:

| Model | Params | Q4 VRAM | Min GPU |
|---|---|---|---|
| Qwen 3.5 35B-A3B (MoE) | 35B | ~23.0 GB | 24 GB |
| Qwen3 32B | 32.8B | ~22.2 GB | 24 GB |
| Gemma 4 26B (MoE) | 26B | ~18.0 GB | 24 GB |
| Qwen3 8B | 8B | ~6.4 GB | 8 GB |
| Qwen3 14B | 14.8B | ~10.8 GB | 12 GB |
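
Where do these estimates come from? A rough rule of thumb (an approximation, not the site's exact calculator): Q4 weights take about 0.55 GB per billion parameters, plus roughly 2-3 GB for KV cache and runtime overhead at 8K context. For Qwen3 14B that gives 14.8 × 0.55 ≈ 8.1 GB of weights plus ~2.7 GB of overhead, i.e. ~10.8 GB, which matches the table and explains the 12 GB minimum GPU.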

Check your GPU against these models in the calculator →

App compatibility

| Feature | Supported |
|---|---|
| Local models | Yes |
| OpenRouter | Yes |
| OpenAI-compatible API | Yes |
| Ollama | Yes |
| LM Studio | Yes |
| Anthropic API | Yes |
| Google API | Yes |
| Mistral API | No |
| Docker | No |
| Works offline | Yes (with local models) |
| Needs GPU | No |


Local vs cloud: which should you use?

Use local models if

  • You want privacy — data never leaves your machine
  • You already have a GPU with sufficient VRAM
  • You want zero per-token API costs
  • You need offline access

Use cloud/API if

  • Your GPU has insufficient VRAM for the models you need
  • You want access to frontier model quality
  • You need maximum coding/reasoning performance
  • You don't want to manage local model downloads and updates
  • OpenRouter lets you switch between 300+ models with one API key (see the example below)
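
To make that last point concrete: OpenRouter exposes an OpenAI-compatible endpoint, so switching models is just a matter of changing the model slug in the request. A sketch (the slug and prompt are placeholders):

```bash
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen/qwen3-14b",
        "messages": [{"role": "user", "content": "Explain what a KV cache is."}]
      }'
```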

Setup overview

Setting up Continue is moderately involved: you install the extension for VS Code or JetBrains, then edit its config file to register the models you want for each role. Full documentation is available at https://docs.continue.dev.
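
At a high level: install a model server if you want local models, install the extension, then register models in Continue's config file. A sketch for VS Code with Ollama (the model tags are examples):

```bash
# 1. Pull local models with Ollama (choose sizes that fit your VRAM)
ollama pull qwen2.5-coder:1.5b   # autocomplete
ollama pull qwen3:14b            # chat / agent

# 2. Install the Continue extension for VS Code
#    (JetBrains users install the Continue plugin from the IDE's marketplace)
code --install-extension Continue.continue

# 3. Register the models in Continue's config file
#    (e.g. ~/.continue/config.yaml, as sketched earlier on this page)
```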

Limitations

  • Not built for autonomous multi-step agentic coding (use Cline or Roo Code for that)
  • Agent mode with local models is still immature; for reliable local-model agents, use Aider instead
  • Not beginner-friendly: editing Continue's config file is required


Frequently asked questions

What is Continue?
Continue (31K+ GitHub stars) is an open-source AI code assistant for VS Code and JetBrains. It provides tab autocomplete, chat, and agent mode with separate models per role — like a local Copilot.
Does Continue need a GPU?
Continue itself does not require a GPU, but the models you connect to it do: roughly 8 GB of VRAM for 7B autocomplete/chat models and 16 GB for 14B agent mode. Agent mode with local models also requires explicitly declaring the tool_use capability in the model config.
Can I run Continue on CPU only?
Yes — Continue supports CPU-only operation, but performance will be significantly slower (5-10x) compared to GPU inference. CPU-only works best for models under 7B parameters with at least 16 GB of system RAM.
Can Continue use OpenRouter?
Yes. Continue supports OpenRouter for accessing 300+ models through a single API. Configure OpenRouter as a provider in Continue's settings with your API key.
Can Continue use local models via Ollama?
Yes. Continue works with Ollama for running models locally. Install Ollama, pull your model (e.g., `ollama pull qwen2.5:7b`), and connect Continue to the local Ollama server. GPU requirements depend on the model you choose, not Continue itself.
What is the best local model for Continue?
For Continue, the community-verified best local model is Qwen 3.5 35B-A3B (MoE), which needs 24 GB+ of VRAM at Q4; Qwen3-14B at Q4 fits as an agent model in 16 GB. Consider pairing a small local model for autocomplete with a cloud model via OpenRouter for agent tasks.
Can I run Continue on 12 GB VRAM?
12 GB VRAM is generally not sufficient for serious agentic coding with Continue. You can run smaller models (7B-14B at Q4) but tool-calling reliability and context handling will be limited. For the best experience, 24 GB VRAM (RTX 3090/4090) is the community-recommended minimum for local agentic coding.
Is Continue free and open source?
Yes. Continue is open source and completely free. You can find the source code on GitHub at https://github.com/continuedev/continue.