How It Works

Technical overview of the RelayPlane proxy architecture.

Architecture

┌─────────────┐     ┌──────────────────┐     ┌─────────────┐
│   Your AI   │────▶│  RelayPlane      │────▶│   LLM API   │
│    Tool     │     │     Proxy        │     │  (Anthropic │
│  (OpenClaw) │◀────│  localhost:4801  │◀────│   OpenAI)   │
└─────────────┘     └──────────────────┘     └─────────────┘
                           │
                           ▼
                    ┌──────────────┐
                    │  Task Type   │
                    │  Detection   │
                    │      +       │
                    │ Model Router │
                    └──────────────┘

Request Flow

  1. Intercept: Your tool sends a request to the proxy (thinking it's the real API)
  2. Analyze: Proxy examines the request (token counts, tool calls, patterns)
  3. Classify: Determines task type (quick_task, code_review, generation, etc.)
  4. Route: Selects optimal model based on task complexity
  5. Forward: Sends request to the real API with selected model
  6. Stream: Streams response back to your tool
  7. Record: Logs anonymous telemetry (if enabled)
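The steps above can be sketched as a small pipeline. This is a hypothetical illustration, not RelayPlane's actual internals: the function names, the crude token estimate, and the model identifiers are all placeholders.

```typescript
// Illustrative sketch of steps 1-5: intercept -> analyze -> classify -> route.
interface RoutedRequest {
  body: string;
  model: string;
}

// 2. Analyze: a crude stand-in for real token counting
function analyze(body: string): { tokens: number } {
  return { tokens: Math.ceil(body.length / 4) };
}

// 3. Classify: determine a task type from request characteristics
function classify(tokens: number): string {
  return tokens < 500 ? 'quick_task' : 'general';
}

// 4. Route: pick a model tier for the task (placeholder model names)
function route(taskType: string): string {
  return taskType === 'quick_task' ? 'claude-haiku' : 'claude-sonnet';
}

// 1-5. Produce the request the proxy would forward to the real API
function prepare(body: string): RoutedRequest {
  const { tokens } = analyze(body);
  return { body, model: route(classify(tokens)) };
}
```

Steps 6 and 7 (streaming the response back and recording telemetry) happen after the forwarded call returns, so they are omitted from this sketch.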

Task Detection

Tasks are classified from request characteristics, never from prompt content:

// Simplified detection logic
function detectTaskType(request: Request): TaskType {
  const { tokensIn, tokensOut, hasTools } = analyze(request);

  if (hasTools && tokensOut < 500) return 'tool_use';
  if (tokensIn < 500 && tokensOut < 500) return 'quick_task';
  if (tokensIn > 10000) return 'long_context';
  if (tokensOut / tokensIn > 2) return 'generation';
  if (tokensOut / tokensIn < 0.3) return 'classification';

  return 'general';
}

Model Selection

Each task type maps to an optimal model tier:

Task Type                 Model    Reasoning
tool_use, quick_task      Haiku    Simple operations don't need deep reasoning
code_review, generation   Sonnet   Good balance of quality and cost
long_context, complex     Opus     Complex reasoning needs premium model
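The table above can be expressed as a simple lookup. This is a sketch of the mapping, not RelayPlane's actual code; the short model names and the fallback choice are assumptions.

```typescript
// Task-type -> model-tier mapping from the table above (placeholder names).
const MODEL_FOR_TASK: Record<string, string> = {
  tool_use: 'haiku',
  quick_task: 'haiku',
  code_review: 'sonnet',
  generation: 'sonnet',
  long_context: 'opus',
  complex: 'opus',
};

function selectModel(taskType: string): string {
  // Assumed fallback: unknown task types get the balanced middle tier
  return MODEL_FOR_TASK[taskType] ?? 'sonnet';
}
```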

API Compatibility

The proxy is fully compatible with both Anthropic and OpenAI APIs:

  • Anthropic Messages API (/v1/messages)
  • OpenAI Chat Completions API (/v1/chat/completions)
  • Streaming responses supported
  • Tool/function calling supported

Zero Config: Just point your tool at the proxy. It handles API compatibility automatically.
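As a minimal sketch, a client would build an OpenAI-style request against the proxy's local address. The endpoint path matches the compatibility list above; the request shape and the assumption that the router replaces the model field are illustrative.

```typescript
// Build a Chat Completions request aimed at the local proxy.
const PROXY_BASE = 'http://localhost:4801';

function chatCompletionRequest(prompt: string) {
  return {
    url: `${PROXY_BASE}/v1/chat/completions`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'gpt-4o-mini', // placeholder; the router selects the real model
      stream: true,         // streaming responses are supported
      messages: [{ role: 'user', content: prompt }],
    }),
  };
}
```

Passing the resulting `url`, `method`, `headers`, and `body` to `fetch` sends the request through the proxy instead of the real API.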

Local Storage

All data is stored locally in ~/.relayplane/:

  • config.json — Settings and API key
  • stats.json — Usage statistics
  • telemetry/ — Queued telemetry (if enabled)