“I was spending $200/month on Claude API calls for my coding agent. After installing RelayPlane, it dropped to $87. Same quality output—it just stopped using Opus for simple tasks like file reads and status checks.”

Matt Turley
Founder, RelayPlane · 20 years building software
Yes, this is the founder's own story. Early user testimonials coming soon.
Looks at task type, token count, and complexity signals — never your actual prompts.
What worked for similar requests across all users? “Code review under 3K tokens succeeds 94% on Haiku.”
Chooses the cheapest model that meets your quality threshold — if you need 95% success, it won't risk 87%.
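The selection rule described above can be sketched in a few lines. This is a minimal illustration, not RelayPlane's actual code: the `ModelStats` type, the `pick_model` function, the relative costs, and the Sonnet and Opus success rates are all hypothetical. Only the 87% and 95% figures echo the example in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelStats:
    name: str
    cost_per_call: float  # hypothetical relative cost, not real pricing
    success_rate: float   # assumed observed success rate for one task type, 0..1


def pick_model(candidates: list[ModelStats], quality_threshold: float) -> Optional[ModelStats]:
    """Return the cheapest candidate whose success rate meets the threshold."""
    eligible = [m for m in candidates if m.success_rate >= quality_threshold]
    if not eligible:
        return None  # caller would pass through to its default model
    return min(eligible, key=lambda m: m.cost_per_call)


# Illustrative numbers only.
candidates = [
    ModelStats("haiku", cost_per_call=1.0, success_rate=0.87),
    ModelStats("sonnet", cost_per_call=4.0, success_rate=0.96),
    ModelStats("opus", cost_per_call=20.0, success_rate=0.99),
]

print(pick_model(candidates, quality_threshold=0.95).name)  # sonnet: haiku's 0.87 is below 0.95
print(pick_model(candidates, quality_threshold=0.85).name)  # haiku: cheapest model that qualifies
```

With a 95% threshold the cheapest model is skipped because its 87% success rate falls short, which is the behavior the step describes.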
Network effect
Every user (free or paid) contributes anonymized success/fail data. More users = smarter routing for everyone. We're early—currently 50+ beta users contributing routing data.
This is the fear everyone has. Here's exactly what happens.
Worst case: You pay for both calls. Still cheaper than always using Opus.
Circuit breakers: If a model fails 5 times in a row, we skip it automatically for 30 seconds.
The math still works: Even with occasional fallback calls, users save 30-50% because most simple tasks do succeed on cheaper models. You're not paying Opus prices for file reads.
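For readers curious how a circuit breaker of this kind can work, here is a minimal sketch using the thresholds quoted above (5 consecutive failures, 30-second skip). The `CircuitBreaker` class and its method names are assumptions made for illustration, not RelayPlane's implementation.

```python
import time


class CircuitBreaker:
    """Skips a model for a cooldown period after too many consecutive failures."""

    def __init__(self, max_failures: int = 5, cooldown_seconds: float = 30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so the breaker can be tested without sleeping
        self._consecutive_failures = 0
        self._open_until = 0.0

    def allow(self) -> bool:
        """True if the model may be used right now."""
        return self._clock() >= self._open_until

    def record_success(self) -> None:
        # Any success resets the streak.
        self._consecutive_failures = 0

    def record_failure(self) -> None:
        self._consecutive_failures += 1
        if self._consecutive_failures >= self.max_failures:
            # Trip the breaker: skip this model until the cooldown has passed.
            self._open_until = self._clock() + self.cooldown_seconds
            self._consecutive_failures = 0


# One breaker per model; a router would check allow() before sending a request.
breakers = {"haiku": CircuitBreaker(), "sonnet": CircuitBreaker()}
```

A router built this way would check `breakers[model].allow()` before each call and move to the next candidate while the breaker is open.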
Three steps. Zero code changes.
npm install -g @relayplane/proxy
MIT licensed. Runs 100% locally.
export ANTHROPIC_BASE_URL=localhost:3001
One environment variable. No code changes.
relayplane stats
Real-time dashboard shows exactly what you're saving.
RelayPlane is a complete agent operations platform. Here's what's built and ready.
Complete control over which models can be used, when, and by whom. Set allowlists, blocklists, cost thresholds, and approval requirements. “Never use GPT-4 for customer data” — enforced automatically.
Included in Max tier
Know exactly why each call was routed where it was. Full explainability for compliance.
Require human approval for expensive operations. Multi-approver workflows.
Hard limits by day/week/month. Get alerts before you blow through spend.
Spots anomalies, cost spikes, and failure patterns. Suggests optimizations.
Multi-user access with role-based permissions. Shared learnings across team agents. Separate budgets per team.
Full dashboard UI included (53 components). This isn't vaporware.
See documentation →
7-day free trial on Pro, no credit card required.
For side projects and experimentation.
Break-even math: Pro pays for itself at $60/mo API spend. If you're spending $100+/mo, you'll save $20-50/mo after the subscription.
Your data stays on your machine. We only see anonymized metadata for routing decisions.
Anonymized metadata only
Zero access to your content
RelayPlane automatically retries with a better model. You pay for both calls, but that's still cheaper than always using Opus. The Swarm learns from failures so routing improves over time.
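The claim that paying for both calls is still cheaper can be checked with a short calculation. The sketch below assumes a failed cheap call is retried once on the expensive model and that the retry succeeds. The relative costs are placeholders, not real pricing, and the 94% success rate is borrowed from the code review example earlier on this page.

```python
def expected_cost_with_fallback(cheap_cost: float, expensive_cost: float, cheap_success_rate: float) -> float:
    """Expected cost per task: always pay for the cheap call, plus the
    expensive call for the fraction of tasks where the cheap one fails."""
    return cheap_cost + (1.0 - cheap_success_rate) * expensive_cost


# Hypothetical relative costs: the expensive model costs 20x the cheap one.
cheap, expensive, p = 1.0, 20.0, 0.94

routed = expected_cost_with_fallback(cheap, expensive, p)
print(f"routed: {routed:.2f} vs always expensive: {expensive:.2f}")

# Routing wins whenever cheap_cost < cheap_success_rate * expensive_cost.
print(cheap < p * expensive)
```

Under these assumed numbers the routed cost comes to about 2.2 units against 20 for always using the expensive model. That figure covers a single task type that routes well, so it is not comparable to overall savings, which depend on the mix of tasks.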
Local proxy: <5ms. Swarm API call: 50-100ms. For a 2-5 second LLM call, this is negligible. If latency matters more than cost, configure more aggressive model selection.
Proxy falls back to static rules automatically. Your agent keeps working. This is baked into the architecture—we assume network calls can fail.
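A fallback of this shape can be sketched as follows. Everything named here is hypothetical: the `route` function, the `STATIC_RULES` table, the task type labels, and the `swarm_lookup` callable stand in for whatever the proxy uses internally.

```python
from typing import Callable

# Hypothetical static rules; the real rule set is not documented here.
STATIC_RULES = {
    "file_read": "haiku",
    "status_check": "haiku",
    "code_review": "sonnet",
}
DEFAULT_MODEL = "opus"  # assumed default when no rule matches


def route(task_type: str, swarm_lookup: Callable[[str], str]) -> str:
    """Ask the network for a routing decision; on network failure,
    fall back to local static rules, then to the default model."""
    try:
        return swarm_lookup(task_type)
    except (ConnectionError, TimeoutError):
        return STATIC_RULES.get(task_type, DEFAULT_MODEL)


def swarm_down(task_type: str) -> str:
    # Simulates an unreachable routing service.
    raise ConnectionError("swarm unreachable")


print(route("file_read", swarm_down))     # haiku, from the static rules
print(route("unknown_task", swarm_down))  # opus, the default model
```

The point of the design is that a network failure only changes which table the decision comes from. The request itself still goes through.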
Yes. Run `relayplane-proxy --audit` to see every payload before it's sent. Or read the MIT-licensed source code on GitHub.
The local proxy works forever—it's MIT licensed software on your machine. Only the Swarm API would stop. You'd keep ~30% savings on static rules.
No. If RelayPlane can't route, it passes through to your default model. Worst case: you pay what you would have paid anyway.
Run `relayplane stats` or check the cloud dashboard. We show you exactly how much you've saved vs. what you would have spent.

Solo Founder · 20 years in software
I built RelayPlane because I was spending $200/mo on Claude API calls and realized most of them didn't need Opus. Now I spend $87.
This is a real product from a real person, not a VC-funded growth hack. I've been independent for 16 years and I'll be around to fix bugs and answer questions.