“I was spending $200/month on Claude API calls for my coding agent. After installing RelayPlane, it dropped to $87. Same quality output—it just stopped using Opus for simple tasks like file reads and status checks.”

Matt Turley
Founder, RelayPlane · 20 years building software
Yes, this is the founder's own story. Early user testimonials coming soon.
We use OpenClaw daily. This exists because we needed it.
Real-time visibility into every request, model selection, and dollar saved.

Track savings, monitor routing decisions, and optimize your agent fleet, all in one place.
Available on Pro and Max plans.
Running a team of agents burns through credits fast.
Every file read, status check, and simple task hits your Opus quota—even when a cheaper model would work just fine.
The Swarm learns from anonymized patterns across all users—task types, token counts, success rates—never your actual prompts. Think Waze for API routing.
Looks at task type, token count, and complexity signals — never your actual prompts.
What worked for similar requests across all users? “Code review under 3K tokens succeeds 94% on Haiku.”
Chooses the cheapest model that meets your quality threshold — if you need 95% success, it won't risk 87%.
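The three steps above can be sketched as a small selection function. This is a minimal illustration, not RelayPlane's actual implementation: the `ModelStats` shape, the relative costs, and the Sonnet and Opus success rates are assumptions made up for the example. Only the 94% Haiku figure comes from the text.

```typescript
// Hypothetical shape: observed results for one task bucket on one model.
interface ModelStats {
  model: string;        // model label, illustrative only
  costPerCall: number;  // ASSUMED relative cost, not real pricing
  successRate: number;  // fraction of similar requests that succeeded
}

// Pick the cheapest model whose success rate meets the threshold.
// Returns undefined when nothing qualifies, so the caller can use its default model.
function pickModel(stats: ModelStats[], minSuccess: number): ModelStats | undefined {
  return stats
    .filter((s) => s.successRate >= minSuccess)
    .sort((a, b) => a.costPerCall - b.costPerCall)[0];
}

// "Code review under 3K tokens" bucket; only the Haiku rate is from the text.
const codeReviewUnder3k: ModelStats[] = [
  { model: "haiku", costPerCall: 1, successRate: 0.94 },
  { model: "sonnet", costPerCall: 4, successRate: 0.97 },
  { model: "opus", costPerCall: 20, successRate: 0.99 },
];

const at90 = pickModel(codeReviewUnder3k, 0.9);   // haiku qualifies and is cheapest
const at95 = pickModel(codeReviewUnder3k, 0.95);  // haiku (94%) is rejected, sonnet wins
```

With a 95% threshold the 94% option is skipped even though it is cheaper, which is the behavior the last step describes.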
Network effect
Every user (free or paid) contributes anonymized success/fail data. More users = smarter routing for everyone. We're early—currently 50+ beta users contributing routing data.
This is the fear everyone has. Here's exactly what happens.
Worst case: You paid for both calls. Still cheaper than always using Opus.
Circuit breakers: If a model fails 5x in a row, we skip it automatically for 30 seconds.
The math still works: Even with occasional fallback calls, users save 50-80% because most simple tasks do succeed on cheaper models. You're not paying Opus prices for file reads.
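The circuit-breaker rule above can be pictured roughly like this. The 5-failure and 30-second values come from the text; the class name, method names, and the choice to reset the counter when the breaker trips are assumptions for illustration only.

```typescript
// Skip a model for a cool-down period after N consecutive failures.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private skipUntil = 0; // timestamp (ms) until which the model is skipped
  private readonly maxFailures: number;
  private readonly coolDownMs: number;

  constructor(maxFailures = 5, coolDownMs = 30000) {
    this.maxFailures = maxFailures;
    this.coolDownMs = coolDownMs;
  }

  // True when the model may be called at time `now` (ms).
  isAvailable(now: number): boolean {
    return now >= this.skipUntil;
  }

  // Any success breaks the failure streak.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  // Count a failure; trip the breaker once the streak reaches the limit.
  recordFailure(now: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.maxFailures) {
      this.skipUntil = now + this.coolDownMs;
      this.consecutiveFailures = 0; // ASSUMPTION: start a fresh streak after tripping
    }
  }
}

// Five failures in a row at t = 1000 ms trip the breaker for 30 seconds.
const breaker = new CircuitBreaker();
for (let i = 0; i < 5; i++) breaker.recordFailure(1000);
const duringCoolDown = breaker.isAvailable(1000 + 29999); // false
const afterCoolDown = breaker.isAvailable(1000 + 30000);  // true
```

Timestamps are passed in explicitly so the behavior is easy to check without waiting on a real clock.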
Your proxy gets smarter over time—learning from your usage patterns, stored entirely on your machine.
Every run, every decision, every outcome
Task patterns
Code review, file reads, complex reasoning
Success rates
Which model worked for which task type
Timing data
Latency, time-to-first-token, retries
Cost trends
Spend by model, task type, time period
Pattern recognition, local-first
100% local. Zero cloud dependency.
SQLite database on your machine. Works offline. Yours to export.
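To make the tracked data concrete, here is a sketch of what one stored run and a per-task success-rate summary might look like. The `RunRecord` field names are assumptions, and an in-memory array stands in for the SQLite database so the example stays self-contained.

```typescript
// Hypothetical record for one proxied call; field names are illustrative.
interface RunRecord {
  taskType: string;   // e.g. "code_review", "file_read"
  model: string;      // model that handled the call
  success: boolean;   // outcome of the call
  latencyMs: number;  // timing data
  costUsd: number;    // cost of the call
}

// Success rate for every "taskType/model" pair seen so far.
function successRates(runs: RunRecord[]): Record<string, number> {
  const totals: Record<string, { ok: number; all: number }> = {};
  for (const run of runs) {
    const key = `${run.taskType}/${run.model}`;
    const t = totals[key] ?? { ok: 0, all: 0 };
    t.all += 1;
    if (run.success) t.ok += 1;
    totals[key] = t;
  }
  const rates: Record<string, number> = {};
  for (const key of Object.keys(totals)) {
    rates[key] = totals[key].ok / totals[key].all;
  }
  return rates;
}

// Made-up sample data.
const sampleRuns: RunRecord[] = [
  { taskType: "file_read", model: "haiku", success: true, latencyMs: 420, costUsd: 0.001 },
  { taskType: "file_read", model: "haiku", success: true, latencyMs: 390, costUsd: 0.001 },
  { taskType: "code_review", model: "haiku", success: true, latencyMs: 900, costUsd: 0.004 },
  { taskType: "code_review", model: "haiku", success: false, latencyMs: 950, costUsd: 0.004 },
];

const rates = successRates(sampleRuns);
// rates["file_read/haiku"] is 1, rates["code_review/haiku"] is 0.5
```

A summary like this is the kind of signal that could move routing from fixed rules toward personalized choices over time.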
Week 1
Rule-based routing
Week 4
Personalized routing
50%+ savings
Free tier, no cloud needed
Three steps. Zero code changes.
`npm install -g @relayplane/proxy`
MIT licensed. Runs 100% locally.
`export ANTHROPIC_BASE_URL=http://localhost:3001`
One environment variable. No code changes.
`relayplane stats`
Real-time dashboard shows exactly what you're saving.
RelayPlane is a complete agent operations platform. Here's what's built and ready.
Complete control over which models can be used, when, and by whom. Set allowlists, blocklists, cost thresholds, and approval requirements. “Never use GPT-4 for customer data” — enforced automatically.
Included in Max tier
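A policy check along these lines could sit in front of every routed call. This is a hedged sketch: the `Policy` fields, the `Decision` values, and the order of the checks are assumptions, not the product's actual configuration format.

```typescript
// Hypothetical policy shape covering the controls described above.
interface Policy {
  allowlist?: string[];       // if set, only these models may be used
  blocklist?: string[];       // these models are never used
  maxCostUsd?: number;        // calls estimated above this are rejected
  approvalAboveUsd?: number;  // calls estimated above this need human approval
}

type Decision = "allow" | "deny" | "needs_approval";

// Evaluate one call against a policy. Denials take precedence over approvals.
function checkPolicy(policy: Policy, model: string, estimatedCostUsd: number): Decision {
  if (policy.blocklist?.includes(model)) return "deny";
  if (policy.allowlist && !policy.allowlist.includes(model)) return "deny";
  if (policy.maxCostUsd !== undefined && estimatedCostUsd > policy.maxCostUsd) return "deny";
  if (policy.approvalAboveUsd !== undefined && estimatedCostUsd > policy.approvalAboveUsd) {
    return "needs_approval";
  }
  return "allow";
}

// "Never use GPT-4 for customer data", expressed as a policy for that workload.
const customerDataPolicy: Policy = { blocklist: ["gpt-4"], approvalAboveUsd: 1 };

const blocked = checkPolicy(customerDataPolicy, "gpt-4", 0.02);  // "deny"
const pricey = checkPolicy(customerDataPolicy, "opus", 2.5);     // "needs_approval"
const routine = checkPolicy(customerDataPolicy, "haiku", 0.01);  // "allow"
```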
Know exactly why each call was routed where it was. Full explainability for compliance.
Require human approval for expensive operations. Multi-approver workflows.
Hard limits by day/week/month. Get alerts before you blow through spend.
Spots anomalies, cost spikes, and failure patterns. Suggests optimizations.
Multi-user access with role-based permissions. Shared learnings across team agents. Separate budgets per team.
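The hard-limit and alert behavior in the cards above might be pictured as a check before each call. The function name, the status values, and the 80% alert point are assumptions chosen for the example.

```typescript
type BudgetStatus = "ok" | "alert" | "blocked";

// Decide whether the next call fits inside the budget for the current period.
// alertFraction is an ASSUMED setting: warn once this share of the limit is used.
function budgetStatus(
  spentUsd: number,
  nextCallUsd: number,
  limitUsd: number,
  alertFraction = 0.8,
): BudgetStatus {
  const projected = spentUsd + nextCallUsd;
  if (projected > limitUsd) return "blocked";                // hard limit
  if (projected >= limitUsd * alertFraction) return "alert"; // early warning
  return "ok";
}

// With a $50 limit for the period:
const early = budgetStatus(10, 1, 50);    // "ok"
const close = budgetStatus(41, 1, 50);    // "alert"
const over = budgetStatus(49.5, 1, 50);   // "blocked"
```

Separate budgets per team would simply mean keeping one running total and one limit per team.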
Full dashboard UI included (53 components). This isn't vaporware.
See documentation →
7-day free trial on Pro, no credit card required.
A complete, production-ready tool. No strings attached.
Break-even math: Pro pays for itself at $60/mo API spend. If you're spending $100+/mo, you'll save $20-50/mo after the subscription.
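The break-even arithmetic can be reproduced in a few lines. One loud assumption: the subscription price is not stated in this text, so a $30/month figure is assumed here purely because, combined with the 50-80% savings range quoted earlier, it yields the same $60 break-even and $20-50 net savings.

```typescript
// Net monthly saving after paying the subscription.
function netSaving(apiSpendUsd: number, savingsRate: number, subscriptionUsd: number): number {
  return apiSpendUsd * savingsRate - subscriptionUsd;
}

// API spend at which routing savings exactly cover the subscription.
function breakEvenSpend(savingsRate: number, subscriptionUsd: number): number {
  return subscriptionUsd / savingsRate;
}

const assumedSubscriptionUsd = 30; // ASSUMPTION: price not given in the text

const breakEven = breakEvenSpend(0.5, assumedSubscriptionUsd); // 60
const lowEnd = netSaving(100, 0.5, assumedSubscriptionUsd);    // 20
const highEnd = netSaving(100, 0.8, assumedSubscriptionUsd);   // about 50
```

Plug in your own monthly spend and the actual plan price to see where you land.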
Your data stays on your machine. We only see anonymized metadata for routing decisions.
Anonymized metadata only
Zero access to your content
Verify it yourself
See exactly what gets sent
`relayplane-proxy --audit`
The TelemetryEvent interface shows exactly what's collected. Nothing else.
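For a rough idea of what metadata-only telemetry means, here is a hypothetical sketch built from the fields this page names (task type, token count, model used, success/fail). It is not the real `TelemetryEvent` definition. The field names and the `LocalRun` type are guesses, and the published source is the authority on what is actually sent.

```typescript
// HYPOTHETICAL shape, based only on the fields named on this page.
interface TelemetryEventSketch {
  taskType: string;
  tokenCount: number;
  model: string;
  success: boolean;
}

// What the proxy might hold locally for a run (illustrative).
interface LocalRun extends TelemetryEventSketch {
  prompt: string;  // never included in telemetry
  output: string;  // never included in telemetry
}

// Build the outgoing payload by copying fields explicitly,
// so prompts and outputs cannot ride along by accident.
function toTelemetry(run: LocalRun): TelemetryEventSketch {
  return {
    taskType: run.taskType,
    tokenCount: run.tokenCount,
    model: run.model,
    success: run.success,
  };
}

const localRun: LocalRun = {
  taskType: "code_review",
  tokenCount: 2400,
  model: "haiku",
  success: true,
  prompt: "review this secret internal function",
  output: "looks fine",
};

const payload = JSON.stringify(toTelemetry(localRun));
```

Copying an explicit list of fields, rather than deleting sensitive ones, means a newly added local field stays private by default.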
The Swarm only sees anonymized metadata: task type, token count, model used, success/fail. Never your prompts, code, or outputs. Think of it like Waze—everyone's anonymous driving data makes traffic predictions better for everyone. Your actual prompts never leave your machine.
RelayPlane automatically retries with a better model. You pay for both calls, but that's still cheaper than always using Opus. The Swarm learns from failures so routing improves over time.
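A quick expected-cost calculation shows why paying for both calls on a failure can still come out ahead. The relative costs below (1 for the cheap model, 15 for the strong one) are assumptions for illustration. The 94% success rate is the figure quoted earlier on this page.

```typescript
// Average cost per task when every task tries the cheap model first
// and failures are retried on the strong model (so both calls are paid).
function expectedCostWithFallback(
  cheapCost: number,
  strongCost: number,
  cheapSuccessRate: number,
): number {
  return cheapCost + (1 - cheapSuccessRate) * strongCost;
}

// Cheap-first wins whenever cheapCost + (1 - p) * strongCost < strongCost,
// which simplifies to p > cheapCost / strongCost.
function minSuccessRateToSave(cheapCost: number, strongCost: number): number {
  return cheapCost / strongCost;
}

const cheap = 1;    // ASSUMED relative cost of the cheap model
const strong = 15;  // ASSUMED relative cost of the strong model

const avgCost = expectedCostWithFallback(cheap, strong, 0.94); // about 1.9, versus 15 for always-strong
const threshold = minSuccessRateToSave(cheap, strong);         // about 0.067
```

Under these assumed prices, the cheap model only has to succeed more than about 7% of the time for cheap-first routing to cost less than always using the strong model.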
Local proxy: <5ms. Swarm API call: 50-100ms. For a 2-5 second LLM call, this is negligible. If latency matters more than cost, configure more aggressive model selection.
Proxy falls back to static rules automatically. Your agent keeps working. This is baked into the architecture—we assume network calls can fail.
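The fallback order can be sketched as follows. This is an illustration under assumptions: the `SwarmLookup` type, the rule table, and the model names are hypothetical, and the lookup is synchronous here for brevity where a real network call would be asynchronous.

```typescript
// A lookup that asks the network for a model and may throw when it is unreachable.
type SwarmLookup = (taskType: string) => string;

// ASSUMED static rules; the real defaults are not described in the text.
const staticRules: Record<string, string> = {
  file_read: "haiku",
  status_check: "haiku",
};

// Try the network first, then static rules, then the user's default model.
function routeModel(taskType: string, lookup: SwarmLookup, defaultModel: string): string {
  try {
    return lookup(taskType);
  } catch {
    return staticRules[taskType] ?? defaultModel;
  }
}

const offline: SwarmLookup = () => {
  throw new Error("network unreachable");
};

const knownTask = routeModel("file_read", offline, "opus");       // "haiku" from static rules
const unknownTask = routeModel("deep_refactor", offline, "opus"); // "opus", the default
```

The last branch is the pass-through case: when nothing else applies, the call goes to whatever model the agent would have used anyway.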
Yes. Run `relayplane-proxy --audit` to see every payload before it's sent. Or read the MIT-licensed source code on GitHub.
The local proxy works forever—it's MIT licensed software on your machine. Only the Swarm API would stop. You'd keep ~50% savings on static rules.
No. If RelayPlane can't route, it passes through to your default model. Worst case: you pay what you would have paid anyway.
Run `relayplane stats` or check the cloud dashboard. We show you exactly how much you've saved vs. what you would have spent.

Solo Founder · 20 years in software
I was spending over $200/month on LLM API credits running OpenClaw. RelayPlane cut that by 60%.
This is a real product from a real person, not a VC-funded growth hack. I've been independent for 16 years and I'll be around to fix bugs and answer questions.