Continue is the most popular open-source AI coding extension, with 3M+ VS Code installs. It supports chat, tab autocomplete, and inline edits through a fully configurable provider system.
Continue can target any OpenAI-compatible endpoint via the `apiBase` field in its config. Point `apiBase` at the RelayPlane proxy and every model request is cost-tracked and intelligently routed.
Works with Continue's built-in provider config. No extension changes required.
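Because the endpoint is OpenAI-compatible, what Continue sends is a standard chat-completion payload. A minimal sketch of that request shape (the `build_request` helper is illustrative; Continue builds this internally from its config):

```python
import json

# Build the OpenAI-style chat-completion request Continue would send
# to the configured apiBase. (Illustrative helper, not Continue's code.)
def build_request(model: str, prompt: str) -> dict:
    return {
        "url": "http://localhost:4801/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,  # Continue streams responses token by token
        },
    }

req = build_request("relayplane:auto", "Explain this function")
print(json.dumps(req["body"], indent=2))
```

Any payload in this shape sent to the proxy's `/v1/chat/completions` route is routed and logged by RelayPlane.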
```json
{
  "models": [
    {
      "title": "RelayPlane Auto",
      "provider": "openai",
      "model": "relayplane:auto",
      "apiBase": "http://localhost:4801/v1",
      "apiKey": "your-api-key"
    }
  ]
}
```

Full config: `~/.continue/config.json`
```json
{
  "models": [
    {
      "title": "RelayPlane Auto",
      "provider": "openai",
      "model": "relayplane:auto",
      "apiBase": "http://localhost:4801/v1",
      "apiKey": "your-api-key"
    },
    {
      "title": "RelayPlane Fast",
      "provider": "openai",
      "model": "rp:fast",
      "apiBase": "http://localhost:4801/v1",
      "apiKey": "your-api-key"
    },
    {
      "title": "Claude Sonnet (via RelayPlane)",
      "provider": "openai",
      "model": "claude-sonnet-4-6",
      "apiBase": "http://localhost:4801/v1",
      "apiKey": "your-api-key"
    }
  ],
  "tabAutocompleteModel": {
    "title": "RelayPlane Fast",
    "provider": "openai",
    "model": "rp:fast",
    "apiBase": "http://localhost:4801/v1",
    "apiKey": "your-api-key"
  }
}
```

- `relayplane:auto`: Smart routing mode. Analyzes prompt complexity and routes to the optimal model automatically.
- `rp:fast`: Always routes to the fastest, cheapest model. Ideal for tab autocomplete, where latency matters.
- `tabAutocompleteModel`: Set this separately to use a fast model for autocomplete while keeping smarter routing for chat.
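The split above (fast model for completions, smart routing for chat) amounts to a per-task model choice. Sketched as a tiny selector (the function and task labels are illustrative; Continue does this via its config, not code):

```python
# Pick a RelayPlane model alias per task type, mirroring the config split:
# tab autocomplete favors latency, everything else favors routing quality.
def model_for_task(task: str) -> str:
    if task == "autocomplete":
        return "rp:fast"         # lowest latency, cheapest
    return "relayplane:auto"     # complexity-based routing

print(model_for_task("autocomplete"))  # rp:fast
print(model_for_task("chat"))          # relayplane:auto
```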
Use the `openai` provider in Continue. RelayPlane exposes a standard OpenAI-compatible API.
1. `POST /v1/chat/completions`: Continue sends requests to the configured `apiBase`. With RelayPlane set, calls go through the local proxy.
2. `-> claude-3-5-haiku`: RelayPlane analyzes prompt complexity. Tab autocomplete and simple questions go to Haiku; chat and refactors go to Sonnet or Opus.
3. `<- SSE stream`: RelayPlane forwards the request to the optimal model and streams the response back to Continue. Completely transparent to the extension.
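The routing step can be pictured as a complexity check on the prompt. The heuristic below is a simplified illustration of the idea, not RelayPlane's actual classifier (which is internal to the proxy):

```python
# Toy complexity heuristic: short single-line prompts route to Haiku,
# longer or multi-step prompts route to Sonnet. (Illustrative only.)
def route(prompt: str) -> str:
    is_complex = (
        len(prompt) > 200
        or "\n" in prompt
        or "refactor" in prompt.lower()
    )
    return "claude-sonnet-4-6" if is_complex else "claude-3-5-haiku"

print(route("def add(a, b):"))  # short tab completion -> haiku
```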
- `relayplane:auto`: Infers the task type from the prompt. Tab completions go to Haiku; chat and complex edits go to Sonnet or Opus. (`autocomplete -> haiku`, `chat -> sonnet`)
- `rp:fast`: Always routes to the lowest-latency model. Best for tab autocomplete. (`everything -> haiku`)
- `relayplane:quality`: Routes to the best model for each task. Maximum quality, higher cost. (`everything -> opus`)
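The three aliases above map to routing policies, and anything that is not an alias (such as `claude-sonnet-4-6`) passes through as a literal model name. A sketch of that resolution (descriptions summarize the modes above; actual resolution happens inside the proxy):

```python
# Alias -> routing policy, summarizing the modes above. (Sketch only.)
MODE_POLICY = {
    "relayplane:auto":    "route by task: autocomplete -> haiku, chat -> sonnet/opus",
    "rp:fast":            "always lowest-latency model (haiku)",
    "relayplane:quality": "always best model (opus)",
}

def describe(model: str) -> str:
    # Non-alias names are forwarded as-is to the named model.
    return MODE_POLICY.get(model, f"literal model name, passed through: {model}")

print(describe("rp:fast"))
print(describe("claude-sonnet-4-6"))
```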
RelayPlane adds intelligent routing and observability to every Continue request.
- `relayplane stats --days 7`: See exactly what each session costs.
- `relayplane stats --breakdown`: Cost by model, by task type, by hour.
- `model: "rp:fast"`: Dedicated fast model for tab completions.
- `model: "relayplane:auto"`: Smarter routing for chat messages.
- `~/.relayplane/data.db`: All logs stored locally in SQLite.
- `telemetry: off` (default): No data sent anywhere without opt-in.
- `claude-sonnet-4-6`: Use specific models by name via RelayPlane.
- `provider: openai` (standard): No special Continue plugin needed.
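Since the logs live in a local SQLite file, any SQLite client can inspect them. A minimal, schema-agnostic sketch that lists the tables in the database (the log schema is not documented here, so no table names are assumed):

```python
import sqlite3

# List the tables in a SQLite database file, e.g. ~/.relayplane/data.db.
# Schema-agnostic: queries sqlite_master rather than assuming table names.
def list_tables(db_path: str) -> list[str]:
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]

# Usage: list_tables(os.path.expanduser("~/.relayplane/data.db"))
```

From there, a plain `SELECT` against whichever tables the proxy creates gives you raw access to the same data the `relayplane stats` commands summarize.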
Start the proxy in verbose mode to see what is happening:

```shell
npx @relayplane/proxy --port 4801 -v
```
Check that `apiBase` is set correctly in `~/.continue/config.json`:

```json
"apiBase": "http://localhost:4801/v1"
```
Check the stats endpoint while Continue is active:

```shell
curl http://localhost:4801/control/stats
```
The `apiKey` in `config.json` is passed through to the proxy. Make sure it matches your provider key:

```shell
export ANTHROPIC_API_KEY="sk-ant-..."  # apiKey in config.json should match or be "any" (proxy handles auth)
```
Set `tabAutocompleteModel` to use the `rp:fast` mode for lower latency:

```json
"tabAutocompleteModel": {
  "model": "rp:fast",
  "apiBase": "http://localhost:4801/v1"
}
```