BUILT FOR OPENCLAW · CLAUDE CODE · CURSOR · ANY AI AGENT

Your agents are burning money.
Now you can see why.

RelayPlane sits between your OpenClaw agents and the LLM providers they call. It tracks every dollar, shows you exactly where it goes, and gives you the tools to cut waste.

Install Free Open Source on GitHub

OPEN SOURCE · MIT LICENSE · FREE FOREVER · 101 GITHUB STARS

🔒 relayplane.com/dashboard

Runs

Monitor and analyze agent runs in real-time

Live↻ Refresh

Total Runs

↗ +14.5% vs last week

Total Cost

↗ +11.2% vs last week

Avg Latency

0ms

↘ -12.8% vs last week

Success Rate

↗ +1.5% vs last week

#a3f7ccoder-mainclaude-opus · 5m ago$0.471.2s

#b8e2dcoder-mainclaude-sonnet · 12m ago$0.08623ms

#c4f9aci-agentgpt-4o · 25m ago$0.0430.0s

#d7e1bresearch-agentclaude-opus · 45m ago$0.000ms

#e2c8fcoder-mainclaude-opus · 30s ago$0.240ms

Analytics

Cost breakdown, cache savings, and savings analysis

Savings This Week

$168.40

saved vs. unoptimized routing

Cost by Model

Opus $56

Sonnet $41

Haiku $32

GPT-4o $18

Cost per Model (7 days)

claude-opus$56.20

claude-sonnet$41.10

claude-haiku$31.80

gpt-4o$18.13

Routing

Live routing decisions

Read file: src/utils/config.ts

Simple→Haiku→saved $0.14

2s ago

Generate unit tests for auth module

Medium→Sonnet→saved $0.08

18s ago

Refactor database schema for multi-tenancy

Complex→Opus→correct tier

34s ago

Format JSON output from API response

Simple→Haiku→saved $0.12

1m ago

Implement WebSocket reconnection logic

Medium→Sonnet→saved $0.11

2m ago

Policies

Active routing rules

if task = SIMPLE then route → haikuFile reads, small edits, simple questions

Active · matched 847 times today

if task = MEDIUM then route → sonnetCode generation, refactoring, test writing

Active · matched 312 times today

if context > 50K tokens then route → gemini-proLarge context window: cheaper per token

Active · matched 88 times today

if monthly_spend > $200 then alert via telegramBudget guardrail: notify before overspend

Active · budget: $147.23 / $200

THE PROBLEM

The real problem isn't access.
It's model mismatch.

Most teams are overpaying because simple work gets sent to premium models by default, especially when no routing policy is in place.

The core issue: expensive models are used for everything. Code formatting, file reads, simple Q&A, and other low-complexity tasks often do not need top-tier models.

73% of typical agent spend is on the wrong model for the task.

Before RelayPlane

Simple tasks (file reads, edits)Opus$142/moWaste

Medium tasks (code generation)Opus$38/moOverspend

Complex tasks (architecture)Opus$18/moCorrect

Retried failuresOpus$24/moPreventable

Without RelayPlane

$222/mo

→

With RelayPlane

$52/mo

You save

$170/mo

77% less

How it works

Simple tasksOpus → Haiku$142 → $12✓

Medium tasksOpus → Sonnet$38 → $22✓

Complex tasksOpus → Opus$18 → $18No change

Retried failuresEliminated$24 → $0Avoided

How much could you save?

Enter your monthly API spend

$/month

Free tier (configurable routing)

−$150

Contributor tier (mesh routing, coming soon)

−$300

Net savings: +$300/mo

HOW IT WORKS

Three pillars. One install.

Pillar 1 · Observe

See where every dollar goes

Every LLM request flows through RelayPlane. Cost per request, model used, task type, tokens consumed, latency: all tracked automatically. Cache-aware cost tracking shows the real impact of prompt caching. See cache hits, creation costs, and actual savings per request. Per-agent cost tracking identifies each agent by system prompt fingerprint so you know exactly which agent is burning money. Your first "aha" moment: seeing that 73% of your spend is on the wrong model.

Cost per request
Model breakdown by provider
Cache-aware cost tracking
Per-agent cost breakdown
Task classification
Latency tracking

→

Pillar 2 · Govern

Control costs before they spiral

A layered policy engine routes, limits, and protects every request. Task-aware routing uses heuristic classification to send simple tasks to cheap models and complex tasks to capable ones. Budget enforcement blocks or downgrades requests when daily/hourly limits are hit. Anomaly detection catches runaway loops and cost spikes in real time. Response caching skips API calls entirely for identical requests. Default behavior is passthrough. You configure each layer.

Heuristic task classification & cascade routing
Daily / hourly / per-request budget limits
Auto-downgrade to cheaper models at budget threshold
Anomaly detection: velocity spikes, repetition loops, token explosions
Cost alerts & webhook delivery
Exact-match response cache with gzipped disk persistence
Per-model rate limiting (Opus: 10 rpm, default: 60 rpm)

→

Pillar 3 · Learn

Every agent makes yours smarter

The mesh is on by default. Anonymized routing outcomes are shared with other agents on the network. Every success and failure becomes a signal. As the mesh grows, routing gets smarter for everyone, automatically, with no prompts or code.

Collective routing intelligence
Model effectiveness scores
Failure pattern detection
Network effect: more agents, better routing

GET STARTED

One command. Three minutes.

npm install -g @relayplane/proxy && relayplane init && relayplane start

Works with OpenClaw, Claude Code, Cursor, and any agent framework.
Point your agent at localhost:4100 via ANTHROPIC_BASE_URL or OPENAI_BASE_URL. No risk: if RelayPlane goes down, your agent keeps working.
Run relayplane stats in your terminal for a quick cost summary.

Step 1

Install

npm install

Step 2

Init config

relayplane init

Step 3

Start proxy

relayplane start

Step 4

See savings

localhost:4100

Supports: Anthropic · OpenAI · Google Gemini · xAI/Grok · OpenRouter · DeepSeek · Groq · Mistral · Together · Fireworks · Perplexity

ZERO RISK

If RelayPlane crashes,
your agent doesn't notice.

We learned this the hard way. Early versions hijacked provider URLs. One crash took everything down for 8 hours. Never again.

RelayPlane uses a circuit breaker architecture. After 3 failures, all traffic bypasses the proxy automatically. Your agent talks directly to the provider. When RelayPlane recovers, traffic resumes. No manual intervention.

Closed

Normal operation

→

Open

After 3 failures

→

Half-Open

Probe recovery

→

Closed

Recovered

THE NETWORK

Routing that gets smarter
from every user.

COMING SOON

By default, RelayPlane reports anonymized routing outcomes: "task type X + model Y = success/failure." No prompts. No code. No user data. Just operational signals. The data below is projected, not live.

Routing

Claude Haiku handles file read tasks with 99.2% success rate across 12,847 routing decisions · avg cost $0.002/request

confidence: 0.97 · 12,847 data points

Warning

GPT-4o-mini drops below 80% accuracy for structured JSON output above 30K context tokens

1,203 data points · detected 3 days ago

Pattern

Teams that route code review to Sonnet instead of Opus save 68% with no measurable quality difference

847 teams · 94% confidence

🔒 Never collected, even with telemetry on

Your prompts or conversationsAPI keys or credentialsFile contents or codePersonal informationBusiness logic or dataModel responses or outputs

Telemetry is on by default. Only anonymous metadata is collected: task type label, token counts, model, latency, cost. Disable: relayplane telemetry off

PRICING

Your agent is the customer.

Free

forever

Full proxy + local dashboard + all core intelligence. No gates, ever.

Full proxy (unlimited requests)
Local dashboard (localhost:4100)
Cloud dashboard (relayplane.com)
Budget enforcement
Anomaly detection
Response cache
Auto-downgrade
Cost alerts
7-day local history
Mesh network: provider health alerts
MIT license

Install Free

npm install -g @relayplane/proxy

Pro

$19/org/mo

Network intelligence from thousands of agents. Your agent learns from the collective.

Everything in Free
90-day cloud history
Network-enhanced pre-run estimation
Full mesh intelligence: routing from thousands of agents
Full REST API access
Data export (CSV/JSON)
Priority support

Get Pro

Network intelligence. Pays for itself fast.

Fleet

$49/org/mo

Fleet coordination: your agents learn from each other in a private mesh.

Everything in Pro
Private fleet mesh
Fleet-wide pre-run estimation
Per-agent spend limits
Approval flows
Audit logs
Unlimited history
Dedicated support

For organizations running agent fleets.

FeatureFreeProFleet

Full proxy (unlimited)✓✓✓

Local dashboard✓✓✓

Cloud dashboard✓✓✓

Budget enforcement✓✓✓

Anomaly detection✓✓✓

Response cache✓✓✓

Auto-downgrade✓✓✓

Cost alerts✓✓✓

History7 days90 daysUnlimited

Mesh network (contribute + provider health)✓✓✓

Mesh intelligence (routing, cost estimates)—✓✓

Data export—✓✓

REST API access—✓✓

Private fleet mesh——✓

Per-agent spend limits——✓

Approval flows & audit logs——✓

THE DATA

Built on pain we measured.

211K

OpenClaw ecosystem
GitHub stars

582K

Views on the
API cost crisis

$35–720

Monthly user
spend range

77%

Cost reduction
documented

"I was mass spending $200+/month running an agent swarm and had zero visibility into where the money was going. Turns out 73% of my requests were using Opus for tasks Haiku could handle."Matt Turley, Continuum

FAQ

Common questions

Will this break my OpenClaw setup?

No. RelayPlane uses a circuit breaker architecture. If the proxy fails for any reason, all traffic automatically bypasses it and goes directly to your LLM provider. Your agent doesn't even notice. If RelayPlane can't route, it passes through to your default model. Worst case: you pay what you would have paid anyway. We learned this lesson the hard way and built the safety model first.

What data do you collect?

Telemetry is on by default. You can disable it (relayplane telemetry off). We collect anonymized metadata: task type label, token count, model used, latency, estimated cost, and an anonymous device ID. Your prompts, code, and responses are never collected. They go directly to LLM providers.

How does task-aware routing work?

RelayPlane uses heuristic classification (token counts, keyword patterns, code block detection) to label requests by complexity: simple, moderate, or complex. When you enable routing, it maps these labels to models, e.g., simple to Haiku, moderate to Sonnet, complex to Opus. Default behavior is passthrough (no routing). You configure the rules.

Does it work with models other than Anthropic?

Yes. RelayPlane supports any OpenAI-compatible API: Claude, GPT-4o, Gemini, Mistral, open-source models via OpenRouter. The routing engine is model-agnostic.

How much will I actually save?

It depends on your usage pattern, but most users see 40-70% cost reduction. The biggest savings come from routing simple tasks (which are typically 60-70% of all requests) to cheaper models. You'll see exact numbers in your dashboard within the first hour.

How does RelayPlane compare to OpenRouter?

Different layer entirely. OpenRouter is a multi-provider gateway: you pick the model, it routes to the cheapest provider for that model. RelayPlane picks the right model for the task. RelayPlane is local-first (your machine, your data). OpenRouter is a cloud service you send all prompts through. They're complementary: you can use OpenRouter as a provider behind RelayPlane. RelayPlane adds cost tracking, task classification, and a local dashboard on top.

How does RelayPlane compare to LiteLLM?

LiteLLM is a unified API adapter: call any provider with one SDK. RelayPlane does that and adds configurable task-aware routing, cost tracking, and a dashboard. LiteLLM requires code changes (import litellm). RelayPlane is a proxy: you set ANTHROPIC_BASE_URL to point at it. LiteLLM is a library you integrate. RelayPlane is infrastructure you deploy.

How does the task classification work?

Two-layer pre-flight classification with zero latency overhead (no LLM calls). First, task type: regex pattern matching on the prompt across 9 categories (code generation, summarization, analysis, etc.), under 5ms. Second, complexity: structural signals like code blocks, token count, and multi-step instructions scored as simple, moderate, or complex. Routing rules map task type + complexity to a model tier. Cascade mode starts with the cheapest model and auto-escalates on uncertainty or refusal patterns.

Is my data sent anywhere?

Your prompts and responses go directly to LLM providers, never through RelayPlane servers. Telemetry (anonymous metadata: task type, token counts, model, cost) is on by default. Disable it with relayplane telemetry off. MIT licensed, fully auditable.

What happens if I stop using it?

Nothing. Remove the proxy, your agents talk directly to providers again. No lock-in, no migration, no data hostage. It's MIT licensed. You can fork it and run your own if you want.

What if a cheaper model returns bad results?

RelayPlane automatically retries with a better model. You pay for both calls, but that's still cheaper than always using Opus. Collective failure learning is on the roadmap. For now, the proxy logs these so you can adjust your routing rules.

What if RelayPlane shuts down?

The local proxy works forever. It's MIT licensed software on your machine. Only the mesh network would stop. You'd keep ~30% savings on static rules.

How do I know I'm actually saving money?

Run relayplane stats or check the dashboard. We show you exactly how much you've saved vs. what you would have spent.

Stop paying for your agent's mistakes.

Install RelayPlane. See your costs in 3 minutes.