Semantic Routing
An optional local sidecar that classifies each prompt with a fine-tuned language model, routing to the right model with higher accuracy than regex patterns.
What It Is
By default, RelayPlane uses regex heuristics to classify prompts (chat, completion, code). The semantic router replaces that with a local ModernBERT-base model (Apache 2.0) that classifies task type and recommends the best model from your available pool. The proxy falls back to regex automatically if the sidecar is unreachable, times out, or returns low-confidence results.
The sidecar runs entirely on your machine. No prompts leave your network.
Requirements
- Node 18+ for the proxy (built-in `fetch` and `AbortController`)
- A running sidecar that exposes `POST /v1/route` and `GET /health`
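A minimal sketch of how a readiness probe against the sidecar could look, using only the Node 18+ built-ins mentioned above. The helper name `isSidecarHealthy`, the `FetchLike` type, and the rule that any 2xx response counts as healthy are assumptions made for illustration; only the `GET /health` path comes from this page.

```typescript
// Hypothetical helper, not part of RelayPlane: probes GET /health with a hard timeout.
// Assumption: any 2xx status means the sidecar is ready.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean }>;

async function isSidecarHealthy(
  baseUrl: string,
  timeoutMs: number,
  fetchImpl: FetchLike, // pass the global `fetch` in real use
): Promise<boolean> {
  const controller = new AbortController();
  // Abort the request if the sidecar has not answered within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const url = `${baseUrl.replace(/\/+$/, "")}/health`;
    const res = await fetchImpl(url, { signal: controller.signal });
    return res.ok;
  } catch {
    // Connection refused, DNS failure, or abort on timeout all count as unhealthy.
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```

In real use this might be called as `await isSidecarHealthy("http://localhost:8888", 200, fetch)`. Injecting the fetch implementation keeps the probe easy to exercise without a running sidecar.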
Start the Sidecar
The reference sidecar is built on ModernBERT-base. Start it locally:
Docker
```bash
docker run -p 8888:8888 relayplane/semantic-router-sidecar:latest
```
pip
```bash
pip install relayplane-sidecar
relayplane-sidecar --port 8888
```
Configure the Proxy
Set these environment variables before starting RelayPlane:
| Variable | Default | Description |
|---|---|---|
| `RELAYPLANE_SIDECAR_URL` | (unset) | Base URL of the sidecar, e.g. `http://localhost:8888`. If unset, the sidecar is disabled and regex classification is used. |
| `RELAYPLANE_SIDECAR_CONFIDENCE_THRESHOLD` | 0.65 | Minimum confidence (0-1) to accept a sidecar result. Results below this threshold fall back to regex. |
| `RELAYPLANE_SIDECAR_TIMEOUT_MS` | 200 | Request timeout in milliseconds. Clamped to [50, 2000]. |
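
The defaults and the timeout clamp in the table can be illustrated with a short sketch. The function `loadSidecarConfig` and the `SidecarConfig` shape are hypothetical, and how the proxy treats unparsable or out-of-range values is not stated on this page; reverting to the default in those cases is an assumption.

```typescript
// Hypothetical config loader illustrating the documented defaults and clamp.
interface SidecarConfig {
  url: string | null; // null means the sidecar is disabled (regex only)
  confidenceThreshold: number; // 0-1, default 0.65
  timeoutMs: number; // default 200, clamped to [50, 2000]
}

function parseNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  // Assumption: unparsable values revert to the default.
  return Number.isFinite(n) ? n : fallback;
}

function loadSidecarConfig(env: Record<string, string | undefined>): SidecarConfig {
  const rawUrl = env["RELAYPLANE_SIDECAR_URL"];
  const url = rawUrl !== undefined && rawUrl.trim() !== "" ? rawUrl.trim() : null;

  const threshold = parseNumber(env["RELAYPLANE_SIDECAR_CONFIDENCE_THRESHOLD"], 0.65);
  // Assumption: thresholds outside 0-1 revert to the default.
  const confidenceThreshold = threshold >= 0 && threshold <= 1 ? threshold : 0.65;

  const timeout = parseNumber(env["RELAYPLANE_SIDECAR_TIMEOUT_MS"], 200);
  const timeoutMs = Math.min(2000, Math.max(50, timeout));

  return { url, confidenceThreshold, timeoutMs };
}
```

Called as `loadSidecarConfig(process.env)`, this would yield a disabled sidecar with a 0.65 threshold and a 200 ms timeout when none of the variables are set.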
```bash
export RELAYPLANE_SIDECAR_URL=http://localhost:8888
export RELAYPLANE_SIDECAR_CONFIDENCE_THRESHOLD=0.70
export RELAYPLANE_SIDECAR_TIMEOUT_MS=150
relayplane start
```
Fallback Behavior
The proxy silently falls back to regex classification when:
- `RELAYPLANE_SIDECAR_URL` is not set
- The sidecar is unreachable at startup or request time
- The HTTP request times out
- The response is malformed or missing required fields
- The returned confidence is below `RELAYPLANE_SIDECAR_CONFIDENCE_THRESHOLD`
No configuration is required to enable fallback. It is always active.
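The fallback conditions above can be sketched as a single decision function. This is an illustration under stated assumptions, not the proxy's actual code: the reply fields `taskType`, `confidence`, and `recommendedModel` are hypothetical names for what `POST /v1/route` returns, and `chooseClassification` is an invented helper. The caller is assumed to pass `null` when the sidecar is disabled, unreachable, or timed out.

```typescript
// Hypothetical shape of a sidecar reply; the real field names are not documented here.
interface SidecarReply {
  taskType: string;
  confidence: number;
  recommendedModel: string;
}

interface Classification {
  taskType: string;
  classifierSource: "regex" | "sidecar";
  classifierConfidence?: number;
  classifierRecommendedModel?: string;
}

// Rejects malformed replies and replies missing a required field.
function isValidReply(value: unknown): value is SidecarReply {
  if (typeof value !== "object" || value === null) return false;
  const r = value as Record<string, unknown>;
  return (
    typeof r["taskType"] === "string" &&
    typeof r["recommendedModel"] === "string" &&
    typeof r["confidence"] === "number" &&
    Number.isFinite(r["confidence"])
  );
}

// `reply` is null when the sidecar was disabled, unreachable, or timed out.
function chooseClassification(
  reply: unknown,
  threshold: number,
  regexClassify: () => string,
): Classification {
  if (isValidReply(reply) && reply.confidence >= threshold) {
    return {
      taskType: reply.taskType,
      classifierSource: "sidecar",
      classifierConfidence: reply.confidence,
      classifierRecommendedModel: reply.recommendedModel,
    };
  }
  // Every other case falls back to the regex heuristics.
  return { taskType: regexClassify(), classifierSource: "regex" };
}
```

Passing the regex classifier as a callback means it only runs when the sidecar result is rejected.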
Observability
Each captured knowledge atom in ~/.relayplane/osmosis.db includes:
- `classifierSource`: `'regex'` or `'sidecar'`
- `classifierConfidence`: confidence score when the sidecar was used
- `classifierRecommendedModel`: model recommended by the sidecar
Compare routing quality over time by querying classifier_source in the database.
RELAYPLANE_SIDECAR_URL is set.
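
As a rough sketch of the comparison described above, the following summarizes atoms that have already been read from the database. It assumes the atoms are available as plain objects carrying the three classifier fields; the table layout of `osmosis.db` is not documented here, so loading the rows is left out, and `summarizeBySource` is a hypothetical helper.

```typescript
// Hypothetical in-memory view of the classifier fields on a captured atom.
interface AtomClassifierFields {
  classifierSource: "regex" | "sidecar";
  classifierConfidence?: number;
  classifierRecommendedModel?: string;
}

interface SourceSummary {
  regex: number; // atoms classified by regex
  sidecar: number; // atoms classified by the sidecar
  sidecarShare: number; // fraction of atoms classified by the sidecar
  meanSidecarConfidence: number | null; // null when no sidecar confidence is recorded
}

function summarizeBySource(atoms: AtomClassifierFields[]): SourceSummary {
  const counts = { regex: 0, sidecar: 0 };
  let confidenceSum = 0;
  let confidenceCount = 0;

  for (const atom of atoms) {
    counts[atom.classifierSource] += 1;
    if (atom.classifierSource === "sidecar" && typeof atom.classifierConfidence === "number") {
      confidenceSum += atom.classifierConfidence;
      confidenceCount += 1;
    }
  }

  return {
    regex: counts.regex,
    sidecar: counts.sidecar,
    sidecarShare: atoms.length === 0 ? 0 : counts.sidecar / atoms.length,
    meanSidecarConfidence: confidenceCount === 0 ? null : confidenceSum / confidenceCount,
  };
}
```

A falling `sidecarShare` over time would suggest more requests are hitting the fallback path, which may point to timeouts or a threshold set too high.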