OpenClaw API Costs: How to Stop Overpaying for AI in Your Agent
Running an OpenClaw-based agent and watching the API bill creep up? I've been there. I run a swarm of agents built on OpenClaw and the costs were adding up faster than I expected. Here's where the money goes and how I cut it without changing a line of application code.
1. Where OpenClaw API Costs Come From
OpenClaw makes far more API calls than you might expect. Every user message, every tool call, every subagent spawn, and every heartbeat check hits the Anthropic API. In a busy session with tool use and subagents, a single user conversation can generate dozens of API calls.
The default model in most OpenClaw setups skews toward capable (and expensive) models. That makes sense for the main orchestration logic, but many of those calls don't actually need Sonnet or Opus. Tool selection, routing decisions, simple confirmations, memory reads: these are tasks where a cheaper model gets the job done.
The problem is OpenClaw sends everything to the same endpoint with the same model unless you set up routing. By default, you're paying Sonnet or Opus prices for calls that could run on Haiku.
2. The Cost-Per-Request Breakdown: Haiku vs Sonnet vs Opus
This is Anthropic's published pricing as of early 2026 (always check the current rates at anthropic.com/pricing):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Haiku 3.5 | $0.80 | $4.00 |
| Claude Sonnet 3.7 | $3.00 | $15.00 |
| Claude Opus (latest) | $15.00 | $75.00 |
To put that in per-call terms, assume an average call has 1,000 input tokens and 500 output tokens:
- Haiku 3.5: $0.0008 + $0.002 = ~$0.003 per call
- Sonnet 3.7: $0.003 + $0.0075 = ~$0.011 per call
- Opus: $0.015 + $0.0375 = ~$0.053 per call
Run 1,000 agent calls per day through Opus and you're looking at ~$53/day. Run the same 1,000 calls through Haiku for the simple ones (let's say 700 of them are genuinely simple) and Sonnet for the rest: roughly $2.10 + $3.30 = $5.40/day.
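To sanity-check these numbers, the arithmetic can be sketched as a short script. The pricing constants are copied from the table above; `costPerCall` is just an illustration, not part of any SDK:

```typescript
// Published per-1M-token prices (USD) from the table above.
const PRICING = {
  haiku: { input: 0.80, output: 4.00 },
  sonnet: { input: 3.00, output: 15.00 },
  opus: { input: 15.00, output: 75.00 },
} as const;

type Model = keyof typeof PRICING;

// Cost of a single call given its token counts.
function costPerCall(model: Model, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model];
  return (inputTokens / 1_000_000) * p.input + (outputTokens / 1_000_000) * p.output;
}

// Average call: 1,000 input tokens, 500 output tokens.
const perCall = (m: Model) => costPerCall(m, 1000, 500);

// 1,000 calls/day, all on Opus, vs. 700 on Haiku + 300 on Sonnet.
const allOpus = 1000 * perCall("opus");
const routed = 700 * perCall("haiku") + 300 * perCall("sonnet");

console.log(allOpus.toFixed(2)); // 52.50
console.log(routed.toFixed(2));  // 5.11
```

The exact routed figure ($5.11/day) comes in slightly under the $5.40 quoted above because the per-call numbers there were rounded up.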
The math on routing is real. The question is how to do it without touching your application code.
3. How RelayPlane Routing Cuts Costs Without Changing Code
RelayPlane is a local proxy that sits between OpenClaw (or any SDK client) and the Anthropic API. You point your existing setup at localhost:4100 and the proxy handles model routing based on complexity.
How complexity routing works: the proxy evaluates each request (input length, message structure, context signals) and routes it to your configured model for that complexity tier. Simple requests go to Haiku. Moderate ones go to Sonnet. Complex ones go to Opus. You define the thresholds and model mapping via the dashboard or config.
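The tiering idea can be illustrated with a minimal sketch. To be clear about assumptions: the thresholds, signal names, and model identifiers below are invented for illustration; RelayPlane's actual scoring logic and config schema may differ.

```typescript
// Hypothetical complexity-tier routing. Thresholds and the tier-to-model
// map are invented for illustration only.
type Tier = "simple" | "moderate" | "complex";

interface RequestSignals {
  inputTokens: number;   // estimated prompt length
  messageCount: number;  // turns of context in the request
  hasToolResults: boolean;
}

const TIER_MODEL: Record<Tier, string> = {
  simple: "claude-haiku-3.5",
  moderate: "claude-sonnet-3.7",
  complex: "claude-opus-latest",
};

function classify(r: RequestSignals): Tier {
  if (r.inputTokens < 800 && r.messageCount <= 2 && !r.hasToolResults) return "simple";
  if (r.inputTokens < 8000) return "moderate";
  return "complex";
}

function route(r: RequestSignals): string {
  return TIER_MODEL[classify(r)];
}

// A short confirmation prompt stays cheap; a long multi-turn request
// with tool output escalates to the top tier.
console.log(route({ inputTokens: 300, messageCount: 1, hasToolResults: false }));  // claude-haiku-3.5
console.log(route({ inputTokens: 12000, messageCount: 9, hasToolResults: true })); // claude-opus-latest
```

The important design property is that the classifier runs on request metadata, not on a model call, so routing itself adds no API cost.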
What RelayPlane tracks per request:
- Input and output token counts
- Model used and provider
- Cost (calculated against current model pricing)
- Whether prompt caching saved tokens (Anthropic cache read vs creation costs)
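On that last point, the cache accounting is worth spelling out. Anthropic prices cache writes at 1.25x the base input rate (for the default 5-minute cache) and cache reads at 0.1x. Here is a sketch of the per-request calculation, using Sonnet's input rate from the table; `inputCost` is illustrative, not a RelayPlane API:

```typescript
// Prompt caching changes input pricing: cache writes cost 1.25x the
// base input rate (5-minute cache), cache reads cost 0.1x.
const SONNET_INPUT_PER_M = 3.00; // USD per 1M input tokens

function inputCost(uncachedTokens: number, cacheWriteTokens: number, cacheReadTokens: number): number {
  const rate = SONNET_INPUT_PER_M / 1_000_000;
  return uncachedTokens * rate + cacheWriteTokens * rate * 1.25 + cacheReadTokens * rate * 0.1;
}

// First call writes a 10k-token system prompt to the cache...
const firstCall = inputCost(1000, 10_000, 0);
// ...subsequent calls read it back at a tenth of the base price.
const laterCall = inputCost(1000, 0, 10_000);

console.log(firstCall.toFixed(4)); // 0.0405
console.log(laterCall.toFixed(4)); // 0.0060
```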
What it doesn't do (important to be clear): it doesn't track per-agent costs yet. Cost visibility is per-request, per-model, per-provider. For a swarm setup, that's still enough to see where the money is going.
The dashboard at http://localhost:4100 shows Model Breakdown, Provider Status, Recent Runs, and your routing configuration.
4. Setting Up ANTHROPIC_BASE_URL in 2 Minutes
Install the proxy:

```bash
npm install -g @relayplane/proxy
relayplane init
relayplane start
```

Set your Anthropic API key:

```bash
relayplane set-key anthropic YOUR_ANTHROPIC_KEY
```

Now point OpenClaw at the proxy instead of Anthropic directly:

```bash
openclaw config set models.providers.anthropic.baseUrl http://localhost:4100
```

That's it. Restart OpenClaw and every Anthropic call flows through the proxy.
If you're using the Anthropic SDK directly in your own code (not via OpenClaw):
```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "http://localhost:4100",
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

Or with the standard environment variable:

```bash
export ANTHROPIC_BASE_URL=http://localhost:4100
```

Any SDK that respects the base URL override works transparently. For xAI/Grok integration via OpenClaw, the proxy supports xAI as a direct provider:

```bash
relayplane set-key xai YOUR_XAI_KEY
```

Then configure a routing rule to send specific model names or task types to xAI/Grok instead of Anthropic. Same endpoint, different destination.
Set budget limits before you do anything else:
```bash
relayplane budget set --daily 20 --action warn
```

This adds a warning when daily spend hits $20. You can use `block` instead of `warn` to hard-stop requests at the limit.

Enable the autostart service so the proxy survives reboots:

```bash
relayplane autostart enable
```

5. Real Numbers: Before vs After
I can show you the math, but I can't promise your numbers will match mine exactly. Usage patterns vary too much. Here's what I saw in my own setup:
My agent swarm was configured to default to Sonnet for all main orchestrator calls. Looking at the dashboard after setting up RelayPlane with complexity routing, roughly 60% of the calls that had been going to Sonnet were simple enough for Haiku to handle. The routing moved those calls automatically.
At the model price differential between Haiku and Sonnet (about 3.7x on input, same ratio on output), the savings on that 60% shift are meaningful. The calls that matter (code generation, long-form writing, complex reasoning) still go to Sonnet or Opus as configured.
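As a rough check on that claim, here is the blended per-call saving from shifting 60% of Sonnet traffic down to Haiku, reusing the 1,000-in/500-out average call from the pricing section (my real traffic mix varied, so treat this as an estimate, not a measurement):

```typescript
// Per-call costs at the table rates, for a 1,000-in / 500-out call.
const sonnetCall = (1000 * 3.00) / 1e6 + (500 * 15.00) / 1e6; // $0.0105
const haikuCall  = (1000 * 0.80) / 1e6 + (500 * 4.00) / 1e6;  // $0.0028

const before = sonnetCall;                         // everything on Sonnet
const after  = 0.6 * haikuCall + 0.4 * sonnetCall; // 60% shifted to Haiku

const savings = 1 - after / before;
console.log((savings * 100).toFixed(0) + "%"); // 44%
```

So a 60% shift cuts the cost of that traffic by roughly 44%, even with the remaining 40% still paying full Sonnet rates.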
The real value I got wasn't just the cost reduction. It was visibility. Before the proxy, my AI spend was a black box from Anthropic: one number per month. After, I could see exactly which models were being called, how often, and what each cost. That's where you find the real waste.
To get started:
```bash
# Install
npm install -g @relayplane/proxy

# Configure and start
relayplane init
relayplane start

# Point OpenClaw at it
openclaw config set models.providers.anthropic.baseUrl http://localhost:4100
```

Check the dashboard at http://localhost:4100 after running a few conversations. You'll see immediately where the token spend is going.
RelayPlane is open source and free to use locally. The npm package is @relayplane/proxy. No account required to get started.