OpenClaw API Costs: How to Stop Overpaying for AI in Your Agent
Running an OpenClaw-based agent and watching the API bill creep up? I've been there. I run a swarm of agents built on OpenClaw and the costs were adding up faster than I expected. Here's where the money goes and how I cut it without changing a line of application code.
1. Where OpenClaw API Costs Come From
OpenClaw makes far more API calls than you might expect. Every user message, every tool call, every subagent spawn, and every heartbeat check hits the Anthropic API. In a busy session with tool use and subagents, a single user conversation can generate dozens of API calls.
The default model in most OpenClaw setups skews toward capable (and expensive) models. That makes sense for the main orchestration logic, but many of those calls don't actually need Sonnet or Opus. Tool selection, routing decisions, simple confirmations, memory reads: these are tasks where a cheaper model gets the job done.
The problem is OpenClaw sends everything to the same endpoint with the same model unless you set up routing. By default, you're paying Sonnet or Opus prices for calls that could run on Haiku.
2. The Cost-Per-Request Breakdown: Haiku vs Sonnet vs Opus
This is Anthropic's published pricing as of early 2026 (always check the current rates at anthropic.com/pricing):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Haiku 3.5 | $0.80 | $4.00 |
| Claude Sonnet 3.7 | $3.00 | $15.00 |
| Claude Opus (latest) | $15.00 | $75.00 |
To put that in per-call terms, assume an average call has 1,000 input tokens and 500 output tokens:
- Haiku 3.5: $0.0008 + $0.002 = ~$0.003 per call
- Sonnet 3.7: $0.003 + $0.0075 = ~$0.011 per call
- Opus: $0.015 + $0.0375 = ~$0.053 per call
Run 1,000 agent calls per day through Opus and you're looking at ~$53/day. Run the same 1,000 calls through Haiku for the simple ones (let's say 700 of them are genuinely simple) and Sonnet for the rest: roughly $2.10 + $3.30 = $5.40/day.
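To sanity-check these numbers, the arithmetic can be sketched as a short script. The pricing constants are copied from the table above; `costPerCall` is just an illustration, not part of any SDK:

```typescript
// Published per-1M-token prices (USD) from the table above.
const PRICING = {
  haiku: { input: 0.80, output: 4.00 },
  sonnet: { input: 3.00, output: 15.00 },
  opus: { input: 15.00, output: 75.00 },
} as const;

type Model = keyof typeof PRICING;

// Cost of a single call given its token counts.
function costPerCall(model: Model, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model];
  return (inputTokens / 1_000_000) * p.input + (outputTokens / 1_000_000) * p.output;
}

// Average call: 1,000 input tokens, 500 output tokens.
const perCall = (m: Model) => costPerCall(m, 1000, 500);

// 1,000 calls/day, all on Opus, vs. 700 on Haiku + 300 on Sonnet.
const allOpus = 1000 * perCall("opus");
const routed = 700 * perCall("haiku") + 300 * perCall("sonnet");

console.log(allOpus.toFixed(2)); // 52.50
console.log(routed.toFixed(2));  // 5.11
```

The exact routed figure ($5.11/day) comes in slightly under the $5.40 quoted above because the per-call numbers there were rounded up.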
The math on routing is real. The question is how to do it without touching your application code.
3. How RelayPlane Routing Cuts Costs Without Changing Code
RelayPlane is a local proxy that sits between OpenClaw (or any SDK client) and the Anthropic API. You point your existing setup at localhost:4100 and the proxy handles model routing based on complexity.
How complexity routing works: the proxy evaluates each request (input length, message structure, context signals) and routes it to your configured model for that complexity tier. Simple requests go to Haiku. Moderate ones go to Sonnet. Complex ones go to Opus. You define the thresholds and model mapping via the dashboard or config.
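The tiering idea can be illustrated with a minimal sketch. To be clear about assumptions: the thresholds, signal names, and model identifiers below are invented for illustration; RelayPlane's actual scoring logic and config schema may differ.

```typescript
// Hypothetical complexity-tier routing. Thresholds and the tier-to-model
// map are invented for illustration only.
type Tier = "simple" | "moderate" | "complex";

interface RequestSignals {
  inputTokens: number;   // estimated prompt length
  messageCount: number;  // turns of context in the request
  hasToolResults: boolean;
}

const TIER_MODEL: Record<Tier, string> = {
  simple: "claude-haiku-3.5",
  moderate: "claude-sonnet-3.7",
  complex: "claude-opus-latest",
};

function classify(r: RequestSignals): Tier {
  if (r.inputTokens < 800 && r.messageCount <= 2 && !r.hasToolResults) return "simple";
  if (r.inputTokens < 8000) return "moderate";
  return "complex";
}

function route(r: RequestSignals): string {
  return TIER_MODEL[classify(r)];
}

// A short confirmation prompt stays cheap; a long multi-turn request
// with tool output escalates to the top tier.
console.log(route({ inputTokens: 300, messageCount: 1, hasToolResults: false }));  // claude-haiku-3.5
console.log(route({ inputTokens: 12000, messageCount: 9, hasToolResults: true })); // claude-opus-latest
```

The important design property is that the classifier runs on request metadata, not on a model call, so routing itself adds no API cost.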
What RelayPlane tracks per request:
- Input and output token counts
- Model used and provider
- Cost (calculated against current model pricing)
- Whether prompt caching saved tokens (Anthropic cache read vs creation costs)
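On that last point, the cache accounting is worth spelling out. Anthropic prices cache writes at 1.25x the base input rate (for the default 5-minute cache) and cache reads at 0.1x. Here is a sketch of the per-request calculation, using Sonnet's input rate from the table; `inputCost` is illustrative, not a RelayPlane API:

```typescript
// Prompt caching changes input pricing: cache writes cost 1.25x the
// base input rate (5-minute cache), cache reads cost 0.1x.
const SONNET_INPUT_PER_M = 3.00; // USD per 1M input tokens

function inputCost(uncachedTokens: number, cacheWriteTokens: number, cacheReadTokens: number): number {
  const rate = SONNET_INPUT_PER_M / 1_000_000;
  return uncachedTokens * rate + cacheWriteTokens * rate * 1.25 + cacheReadTokens * rate * 0.1;
}

// First call writes a 10k-token system prompt to the cache...
const firstCall = inputCost(1000, 10_000, 0);
// ...subsequent calls read it back at a tenth of the base price.
const laterCall = inputCost(1000, 0, 10_000);

console.log(firstCall.toFixed(4)); // 0.0405
console.log(laterCall.toFixed(4)); // 0.0060
```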
What it doesn't do (important to be clear): it doesn't track per-agent costs yet. Cost visibility is per-request, per-model, per-provider. For a swarm setup, that's still enough to see where the money is going.
The dashboard at http://localhost:4100 shows Model Breakdown, Provider Status, Recent Runs, and your routing configuration.
4. Setting Up ANTHROPIC_BASE_URL in 2 Minutes
Install the proxy:

```bash
npm install -g @relayplane/proxy
relayplane init
relayplane start
```

Set your Anthropic API key:

```bash
relayplane set-key anthropic YOUR_ANTHROPIC_KEY
```

Now point OpenClaw at the proxy instead of Anthropic directly:

```bash
openclaw config set models.providers.anthropic.baseUrl http://localhost:4100
```

That's it. Restart OpenClaw and every Anthropic call flows through the proxy.
If you're using the Anthropic SDK directly in your own code (not via OpenClaw):
```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "http://localhost:4100",
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

Or with the standard environment variable:

```bash
export ANTHROPIC_BASE_URL=http://localhost:4100
```

Any SDK that respects the base URL override works transparently. For xAI/Grok integration via OpenClaw, the proxy supports xAI as a direct provider:

```bash
relayplane set-key xai YOUR_XAI_KEY
```

Then configure a routing rule to send specific model names or task types to xAI/Grok instead of Anthropic. Same endpoint, different destination.
Set budget limits before you do anything else:
```bash
relayplane budget set --daily 20 --action warn
```

This adds a warning when daily spend hits $20. You can use `block` instead of `warn` to hard-stop requests at the limit.

Enable the autostart service so the proxy survives reboots:

```bash
relayplane autostart enable
```

5. Real Numbers: Before vs After
I can show you the math, but I can't promise your numbers will match mine exactly. Usage patterns vary too much. Here's what I saw in my own setup:
My agent swarm was configured to default to Sonnet for all main orchestrator calls. Looking at the dashboard after setting up RelayPlane with complexity routing, roughly 60% of the calls that had been going to Sonnet were simple enough for Haiku to handle. The routing moved those calls automatically.
At the model price differential between Haiku and Sonnet (about 3.7x on input, same ratio on output), the savings on that 60% shift are meaningful. The calls that matter (code generation, long-form writing, complex reasoning) still go to Sonnet or Opus as configured.
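As a rough check on that claim, here is the blended per-call saving from shifting 60% of Sonnet traffic down to Haiku, reusing the 1,000-in/500-out average call from the pricing section (my real traffic mix varied, so treat this as an estimate, not a measurement):

```typescript
// Per-call costs at the table rates, for a 1,000-in / 500-out call.
const sonnetCall = (1000 * 3.00) / 1e6 + (500 * 15.00) / 1e6; // $0.0105
const haikuCall  = (1000 * 0.80) / 1e6 + (500 * 4.00) / 1e6;  // $0.0028

const before = sonnetCall;                         // everything on Sonnet
const after  = 0.6 * haikuCall + 0.4 * sonnetCall; // 60% shifted to Haiku

const savings = 1 - after / before;
console.log((savings * 100).toFixed(0) + "%"); // 44%
```

So a 60% shift cuts the cost of that traffic by roughly 44%, even with the remaining 40% still paying full Sonnet rates.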
The real value I got wasn't just the cost reduction. It was visibility. Before the proxy, my AI spend was a black box from Anthropic: one number per month. After, I could see exactly which models were being called, how often, and what each cost. That's where you find the real waste.
To get started:
```bash
# Install
npm install -g @relayplane/proxy

# Configure and start
relayplane init
relayplane start

# Point OpenClaw at it
openclaw config set models.providers.anthropic.baseUrl http://localhost:4100
```

Check the dashboard at http://localhost:4100 after running a few conversations. You'll see immediately where the token spend is going.
RelayPlane is open source and free to use locally. The npm package is @relayplane/proxy. No account required to get started.