Model Selection
How RelayPlane automatically selects the right model for each task to minimize costs without sacrificing quality.
Overview
RelayPlane analyzes each incoming request and classifies it by complexity. Simple tasks are routed to cheaper, faster models, while complex tasks use premium models. Routing happens transparently inside the proxy, so your tool doesn't need to change anything.
Task Classification
The proxy infers task type from the request content using pattern matching on the last user message. It recognizes 9 task types:
| Task Type | Description |
|---|---|
| code_generation | Writing new code (implement, refactor, build) |
| code_review | Reviewing or auditing code |
| summarization | Condensing long content |
| analysis | Analytical and evaluative tasks |
| creative_writing | Stories, essays, articles |
| data_extraction | Pulling structured data from text |
| translation | Language translation |
| question_answering | Direct Q&A |
| general | Everything else |
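
To make the classification step concrete, here is a minimal Python sketch of pattern matching on the last user message. The task type names come from the table above; the regular expressions, their ordering, and the helper names (`TASK_PATTERNS`, `last_user_message`, `classify_task`) are illustrative assumptions, not RelayPlane's actual patterns. It also assumes each message's `content` is a plain string.

```python
import re

# Hypothetical patterns: the proxy's real patterns are not documented here.
# Order matters in this sketch; the first pattern that matches wins.
TASK_PATTERNS = [
    ("code_review", re.compile(r"\b(review|audit)\b.*\b(code|pull request|diff)\b", re.I | re.S)),
    ("code_generation", re.compile(r"\b(implement|refactor|build)\b", re.I)),
    ("summarization", re.compile(r"\b(summari[sz]e|condense)\b", re.I)),
    ("translation", re.compile(r"\btranslate\b", re.I)),
    ("data_extraction", re.compile(r"\b(extract|parse)\b", re.I)),
    ("creative_writing", re.compile(r"\b(story|essay|article|poem)\b", re.I)),
    ("analysis", re.compile(r"\b(analy[sz]e|compare|evaluate)\b", re.I)),
    ("question_answering", re.compile(r"^\s*(what|who|when|where|why|how)\b|\?\s*$", re.I)),
]


def last_user_message(messages):
    """Return the content of the most recent message with role 'user'."""
    for message in reversed(messages):
        if message.get("role") == "user":
            return message.get("content", "")
    return ""


def classify_task(messages):
    """Infer a task type from the last user message only."""
    text = last_user_message(messages)
    for task_type, pattern in TASK_PATTERNS:
        if pattern.search(text):
            return task_type
    return "general"  # fallback when nothing matches
```

Because only the last user message is inspected, earlier turns and assistant replies do not influence the result in this sketch.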
Complexity Scoring
Each request is scored for complexity based on the last user message (not system prompts or conversation history). The scoring considers:
- Code indicators — code blocks, function/class definitions (+2)
- Analytical tasks — analyze, compare, evaluate, review (+2)
- Math/logic — calculate, solve, prove, derive (+2)
- Multi-step reasoning — step-by-step instructions (+2)
- Architecture/design — system design, distributed systems (+3)
- Implementation requests — implement, refactor, debug, optimize (+2)
- Token length — longer messages score higher (+1 to +4)
- Multiple requirements — many "and" conjunctions (+1 to +2)
The score maps to three complexity levels:
- Simple (score < 2) → cheapest model
- Moderate (score 2-3) → mid-tier model
- Complex (score ≥ 4) → premium model
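
The scoring rules and thresholds above can be sketched in Python as follows. The point values and the three level boundaries are taken from the lists above. The regular expressions, the characters-per-token estimate, and the cut-offs for message length and "and" counts are assumptions chosen for illustration, since the exact values are not documented here.

```python
import re

# (pattern, points) pairs. Points follow the documented weights;
# the patterns themselves are illustrative guesses.
SIGNALS = [
    (re.compile(r"```|\b(def|function|class)\s+\w+"), 2),                     # code indicators
    (re.compile(r"\b(analy[sz]e|compare|evaluate|review)\b", re.I), 2),      # analytical tasks
    (re.compile(r"\b(calculate|solve|prove|derive)\b", re.I), 2),            # math/logic
    (re.compile(r"\bstep[- ]by[- ]step\b", re.I), 2),                        # multi-step reasoning
    (re.compile(r"\b(system design|architecture|distributed systems?)\b", re.I), 3),
    (re.compile(r"\b(implement|refactor|debug|optimi[sz]e)\b", re.I), 2),    # implementation
]

# Assumed token thresholds: one extra point per threshold reached (+1 to +4).
LENGTH_THRESHOLDS = [100, 500, 1000, 2000]


def length_points(text):
    approx_tokens = len(text) // 4  # rough estimate: about 4 characters per token
    return sum(1 for threshold in LENGTH_THRESHOLDS if approx_tokens >= threshold)


def conjunction_points(text):
    count = len(re.findall(r"\band\b", text, re.I))
    if count >= 5:  # assumed cut-off for +2
        return 2
    if count >= 3:  # assumed cut-off for +1
        return 1
    return 0


def complexity_score(text):
    """Score the last user message; each signal counts at most once."""
    score = sum(points for pattern, points in SIGNALS if pattern.search(text))
    return score + length_points(text) + conjunction_points(text)


def complexity_level(score):
    """Map an integer score to the three documented levels."""
    if score < 2:
        return "simple"
    if score < 4:
        return "moderate"
    return "complex"
```

For example, under these assumed patterns "Compare these two options" scores 2 (moderate), while "Implement a rate limiter and analyze its behaviour" scores 4 (complex) because two separate signals fire.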
Default Model Mapping
The default complexity-to-model mapping (configurable in ~/.relayplane/config.json):
```json
{
  "routing": {
    "complexity": {
      "enabled": true,
      "simple": "claude-sonnet-4-6",
      "moderate": "claude-sonnet-4-6",
      "complex": "claude-opus-4-6"
    }
  }
}
```
Routing Modes
You can request a specific routing strategy using model aliases:
| Alias | Behavior |
|---|---|
| rp:best | Always use highest quality model |
| rp:fast | Use fastest/cheapest model |
| rp:cheap | Use cheapest model available |
| rp:balanced | Complexity-based routing (default auto) |
| relayplane:auto | Same as rp:balanced |
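
As a rough illustration of how these aliases could combine with the complexity-to-model mapping, the Python sketch below resolves a requested model name to a concrete model. The model names mirror the default mapping shown earlier. Treating `rp:best` as the "complex" tier and `rp:fast`/`rp:cheap` as the "simple" tier, and passing unknown names through unchanged, are assumptions made for this example; the proxy may resolve them differently.

```python
# Default complexity-to-model mapping, copied from the config example.
DEFAULT_COMPLEXITY_MODELS = {
    "simple": "claude-sonnet-4-6",
    "moderate": "claude-sonnet-4-6",
    "complex": "claude-opus-4-6",
}

# Assumption: fixed-strategy aliases pin the request to one tier.
FIXED_LEVEL_ALIASES = {
    "rp:best": "complex",
    "rp:fast": "simple",
    "rp:cheap": "simple",
}

# Aliases that defer to complexity-based routing.
AUTO_ALIASES = {"rp:balanced", "relayplane:auto"}


def resolve_model(requested_model, complexity_level, models=DEFAULT_COMPLEXITY_MODELS):
    """Pick a concrete model for a request.

    requested_model: the model name or alias sent by the client.
    complexity_level: "simple", "moderate", or "complex".
    """
    if requested_model in FIXED_LEVEL_ALIASES:
        return models[FIXED_LEVEL_ALIASES[requested_model]]
    if requested_model in AUTO_ALIASES:
        return models[complexity_level]
    # Assumption: any other name is treated as an explicit model choice.
    return requested_model
```

With the defaults, `resolve_model("rp:balanced", "complex")` yields `claude-opus-4-6`, while `resolve_model("rp:balanced", "simple")` yields `claude-sonnet-4-6`.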
Cascade Mode
When routing.mode is set to "cascade", the proxy starts with a cheaper model and escalates to a more expensive one if the response shows signs of uncertainty or refusal:
```json
{
  "routing": {
    "mode": "cascade",
    "cascade": {
      "enabled": true,
      "models": ["claude-sonnet-4-6", "claude-opus-4-6"],
      "escalateOn": "uncertainty",
      "maxEscalations": 1
    }
  }
}
```
Set routing.mode to "auto" to always route based on complexity, even when the client sends a specific model name. This is the recommended setting for maximum savings.
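
The cascade escalation described above can be sketched as a simple loop in Python. This is a minimal illustration under stated assumptions, not the proxy's implementation: `call_model` is a hypothetical callable supplied by the caller, and the uncertainty check is a guessed phrase list, since the signals the proxy actually uses for "uncertainty or refusal" are not documented here.

```python
import re

# Hypothetical phrases treated as signs of uncertainty or refusal.
UNCERTAINTY_PATTERN = re.compile(
    r"\b(i'?m not sure|i am not sure|i don'?t know|i cannot|i can'?t help)\b",
    re.I,
)


def looks_uncertain(response_text):
    return bool(UNCERTAINTY_PATTERN.search(response_text))


def run_cascade(prompt, call_model, models, max_escalations=1):
    """Try models from cheapest to most expensive.

    call_model(model, prompt) -> str is a caller-supplied function (hypothetical).
    At most max_escalations + 1 models are tried. Returns (model_used, response).
    """
    if not models:
        raise ValueError("models must contain at least one model name")
    attempts = min(len(models), max_escalations + 1)
    used, response = None, ""
    for model in models[:attempts]:
        used = model
        response = call_model(model, prompt)
        if not looks_uncertain(response):
            break  # confident answer: stop escalating
    return used, response
```

With `models=["claude-sonnet-4-6", "claude-opus-4-6"]` and `max_escalations=1`, the second model is called only when the first response matches the uncertainty check, so confident answers are served at the cheaper model's cost.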