How to Cut OpenClaw API Costs with RelayPlane Proxy
If you're running OpenClaw agents, you already know the drill: your API bill climbs every month, and most of the spend is wasteful. The average OpenClaw setup burns $100-300/mo on API calls — and the majority of those requests don't need an expensive model. Here's how to fix that.
What Is an AI Proxy (And Why Should You Care)?
An AI proxy sits between your application and your AI providers. Every API call passes through it. Instead of blindly forwarding requests to whatever model you configured, the proxy inspects the task, classifies its complexity, and routes it to the right model.
This isn't a load balancer. It's a cost intelligence layer.
When OpenClaw sends a request to ANTHROPIC_BASE_URL or OPENAI_BASE_URL, the proxy intercepts it. Simple task like reading a config file? Route to Haiku or GPT-4o-mini at a fraction of the cost. Complex debugging session? Send it to Opus. The routing is rule-based and deterministic — you define the complexity mappings, the proxy enforces them on every request.
Cost Breakdown: Before vs. After Proxy Routing
Here's a typical OpenClaw agent workload breakdown:
| Task Type | % of Requests | Examples | Without Proxy | With Proxy |
|---|---|---|---|---|
| Simple | ~60% | File reads, status checks, formatting, short rewrites | Opus/GPT-4o ($$$) | Haiku/GPT-4o-mini ($) |
| Medium | ~25% | Code review, summaries, data extraction | Opus/GPT-4o ($$$) | Sonnet/GPT-4o ($$) |
| Complex | ~15% | Architecture decisions, hard debugging, long-form creative | Opus/GPT-4o ($$$) | Opus/GPT-4o ($$$) |
The math: if you're spending $200/mo, roughly $120 of that is simple tasks hitting expensive models. Route those to Haiku-class models and that $120 drops to about $10-15. Medium tasks go from $50 to $30. Complex tasks stay the same. $200/mo becomes $55-75/mo. Your actual savings depend on your workload mix.
Setting Up RelayPlane as Your OpenClaw Proxy
This takes about 5 minutes. Four steps.
Install
npm install -g @relayplane/proxyRequires Node.js 18+. That's the only dependency.
Initialize Configuration
relayplane initThis creates your config file with sensible defaults. It'll ask which providers you use and generate the routing rules. The default setup uses complexity-based routing with three tiers:
routing:
mode: complexity
models:
simple: anthropic/claude-3-haiku
medium: anthropic/claude-3.5-sonnet
complex: anthropic/claude-3-opus
cascade:
enabled: true
strategy: cost # try cheapest first, fall back on failureStart the Proxy
relayplane startThe proxy starts on localhost:4100 by default. Now point OpenClaw at the proxy by setting these environment variables:
export ANTHROPIC_BASE_URL=http://localhost:4100/v1export OPENAI_BASE_URL=http://localhost:4100/v1Verify It's Working
relayplane statsThis shows you a breakdown of requests by model, provider, cost, and routing decisions. Run a few OpenClaw tasks and check the stats to confirm requests are being routed to different models based on complexity.
Advanced: Multi-Provider Routing
The real power shows up when you route different task types to entirely different providers:
routing:
mode: complexity
models:
simple: groq/llama-3-8b # blazing fast, nearly free
medium: anthropic/claude-3.5-sonnet # strong all-rounder
complex: openai/gpt-4-turbo # heavy reasoning
cascade:
enabled: true
strategy: cost
fallback:
- anthropic/claude-3-opus
- openai/gpt-4oSet budget caps with the budget command:
relayplane budget --daily 10 --action warnrelayplane budget --hourly 2 --action blockBudget enforcement supports four actions: block (hard stop), warn (log and continue), downgrade (force cheaper model), and alert (notification). RelayPlane also includes anomaly detection that catches cost spikes, token explosions, and runaway loops automatically.
Privacy and Telemetry
RelayPlane runs entirely on your machine. Your prompts, responses, and API keys never leave localhost. There is no cloud component in the free tier — the proxy is a local process, the dashboard is a local web server.
Telemetry is on by default. It sends anonymous aggregate data: token counts, latency measurements, and model usage patterns. No prompt content, no API keys, no personally identifiable information.
relayplane telemetry # see current statusrelayplane telemetry off # disable completelyResults and Next Steps
Once the proxy is running, check your dashboard after a day or two of normal usage. Typical results: 40-70% cost reduction depending on your workload mix. Agents that do a lot of simple tasks see the highest savings.
Start saving in 30 seconds
Free tier. No credit card. Unlimited requests. Your API keys stay on your machine.
npm install -g @relayplane/proxy