OpenClaw API Costs Explained: What You're Paying and How to Track Every Token
I've been running OpenClaw in production for months now. The first month, my Anthropic bill was roughly what I expected. The second month, it was 3x higher. Not because pricing changed, but because I had zero visibility into where tokens were actually going.
If you're running OpenClaw and want to understand your OpenClaw API costs before they surprise you, this is the breakdown I wish I'd had on day one.
How OpenClaw Billing Actually Works
OpenClaw doesn't charge you for API access. There's no markup, no platform fee on tokens. Every API call your agents make passes straight through to Anthropic, and you pay Anthropic's per-token rates directly.
As of early 2026, that means:
- Claude Haiku 4.5: $1.00 / $5.00 per million tokens (input/output)
- Claude Sonnet 4.6: $3.00 / $15.00 per million tokens (input/output)
- Claude Opus 4.6: $5.00 / $25.00 per million tokens (input/output)
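To make the math concrete, here's a minimal per-request cost calculator using the rates above (hardcoded; adjust if Anthropic's pricing changes — the model keys are shorthand labels, not official API model IDs):

```python
# Per-million-token rates from the table above: (input, output) in USD.
RATES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6": (5.00, 25.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single API call at the listed rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A typical agent turn: 12K tokens of context in, 800 tokens out on Sonnet.
print(round(request_cost("sonnet-4.6", 12_000, 800), 4))  # → 0.048
```

A nickel per turn sounds cheap until you multiply by thousands of agent turns a day, which is exactly why attribution matters.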
The OpenClaw API price is Anthropic's price. Full stop. Your bill is determined by which models your agents hit, how many tokens they consume, and how often they run. OpenClaw is the orchestration layer, not the billing layer.
This sounds simple, but it creates a real problem: OpenClaw doesn't give you a per-request cost breakdown out of the box. You see a single Anthropic invoice at the end of the month with no way to attribute costs to specific agents, sessions, or tasks.
The Hidden Cost: Context Window Creep in Long Sessions
The biggest cost driver I've found isn't the model you pick. It's context window accumulation in long-running agent sessions.
Here's how it works. Every message in a session gets appended to the context window. By message 15, you might be sending 80K input tokens per request, because the full conversation history ships with every call. A session that starts at $0.02/request can quietly climb to $0.40/request by the end.
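The accumulation is easy to simulate. This sketch assumes Sonnet rates from the pricing list and roughly 5K tokens of new history per turn; the exact numbers are illustrative, but the shape of the curve is the point:

```python
# Sketch: per-request cost growth when the full history ships every call.
# Assumes Sonnet 4.6 rates ($3 in / $15 out per million tokens) and ~5K
# tokens of new history per turn; numbers are illustrative.
IN_RATE, OUT_RATE = 3.00, 15.00  # USD per million tokens

def session_costs(turns: int, tokens_per_turn: int = 5_000, output_tokens: int = 500):
    costs = []
    context = 2_000  # starting system prompt + first message
    for _ in range(turns):
        cost = (context * IN_RATE + output_tokens * OUT_RATE) / 1_000_000
        costs.append(round(cost, 4))
        context += tokens_per_turn + output_tokens  # history accumulates
    return costs

costs = session_costs(15)
print(costs[0], costs[-1])  # first turn ≈ $0.0135, turn 15 ≈ $0.2445
```

By turn 15 the context is near 80K tokens and each request costs roughly 18x the first one, even though the model and the task haven't changed.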
For autonomous agents running on cron jobs or heartbeat loops, this compounds fast. I had one agent running a 45-minute research session that burned through $12 in a single loop, mostly because the context window grew unchecked.
The fix is architectural: keep sessions short, split complex tasks into sub-agent spawns (each gets a fresh context), and be aggressive about what you include in system prompts.
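The sub-agent math is worth seeing directly. Because input overhead grows with the square of session length, splitting one long session into fresh-context sub-agents cuts total input tokens sharply (illustrative assumption: ~5K tokens of history added per turn):

```python
# Sketch: total re-sent input tokens for one long session vs. the same
# work split into sub-agents that each start with a fresh context.
# Assumes ~5K tokens of history per turn; purely illustrative.

def total_input_tokens(turns: int, per_turn: int = 5_000) -> int:
    # Full history is re-sent on each call: 0, 5K, 10K, ... of overhead.
    return sum(k * per_turn for k in range(turns))

one_session = total_input_tokens(16)   # one 16-turn session
split = 4 * total_input_tokens(4)      # four 4-turn sub-agent sessions
print(one_session, split)  # 600000 vs 120000 tokens: a 5x reduction
```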
Per-Request Cost Tracking: See Exactly Where Your Money Goes
This is the problem that pushed me to build RelayPlane. I needed per-request cost attribution, broken down by agent, session, and task.
RelayPlane sits between OpenClaw and Anthropic as a local proxy. Every request passes through it, and it logs the exact token counts, model used, and calculated cost. Setup takes about two minutes:
```
npm install -g @relayplane/proxy
```

Then point OpenClaw's Anthropic provider at the proxy:
```
openclaw config set models.providers.anthropic.baseUrl http://localhost:4100
```

That's it. Your existing API key passes through untouched. RelayPlane doesn't store keys or modify requests. It just observes and logs.
Now every request shows up in your local dashboard with:
- Input/output token counts
- Model used (and whether it was auto-routed)
- Calculated cost at current rates
- Agent ID and session ID for attribution
- Timestamp and latency
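Once you have records with those fields, attribution is a ten-line aggregation. The record layout below mirrors the dashboard list but is my own assumption, not RelayPlane's documented log schema:

```python
# Sketch: finding the agent that dominates your bill from per-request
# records. Field names are assumed, not RelayPlane's documented schema.
from collections import defaultdict

records = [
    {"agent_id": "researcher", "cost_usd": 0.31, "model": "opus-4.6"},
    {"agent_id": "researcher", "cost_usd": 0.28, "model": "opus-4.6"},
    {"agent_id": "summarizer", "cost_usd": 0.02, "model": "haiku-4.5"},
]

spend_by_agent = defaultdict(float)
for r in records:
    spend_by_agent[r["agent_id"]] += r["cost_usd"]

# Sort by spend to surface the agent worth optimizing first.
for agent, spend in sorted(spend_by_agent.items(), key=lambda kv: -kv[1]):
    print(f"{agent}: ${spend:.2f}")
```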
When I finally saw the per-request breakdown, I found that 60% of my monthly cost came from a single agent that was re-reading large files into context on every heartbeat cycle. Five-minute fix. Saved $80/month.
Model Routing: Automatic Haiku/Sonnet/Opus Selection by Complexity
OpenClaw supports model routing, where the system picks Haiku, Sonnet, or Opus based on task complexity. This is one of the best ways to reduce your OpenClaw API cost without sacrificing output quality.
The idea is straightforward: simple classification tasks and short responses don't need Opus. A well-configured routing setup sends maybe 70% of requests to Sonnet, 20% to Haiku, and only 10% to Opus for genuinely complex reasoning.
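A quick sanity check on that split: blending the rates from the pricing list at 70/20/10 lands at $2.80 in / $14.00 out per million tokens, roughly 44% below all-Opus, which lines up with the savings I measured:

```python
# Blended per-million-token rates for a 70/20/10 Sonnet/Haiku/Opus mix,
# compared with sending everything to Opus. Rates from the pricing list.
mix = {
    "sonnet": (0.70, 3.00, 15.00),  # (share, input rate, output rate)
    "haiku":  (0.20, 1.00, 5.00),
    "opus":   (0.10, 5.00, 25.00),
}

blended_in = sum(share * in_r for share, in_r, _ in mix.values())
blended_out = sum(share * out_r for share, _, out_r in mix.values())
print(round(blended_in, 2), round(blended_out, 2))  # 2.8 14.0
print(round(1 - blended_in / 5.00, 2))              # 0.44 below all-Opus
```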
RelayPlane's complexity-based routing does this automatically. It analyzes the prompt complexity before forwarding and picks the cheapest model that can handle it. In my setup, this cut costs by roughly 40% compared to running everything on Opus.
The tradeoff is real though. Haiku occasionally fumbles on multi-step tool use. Sonnet handles 90% of agent tasks perfectly. Opus is worth the premium for planning, code architecture, and anything requiring long chains of reasoning. The key is matching model to task, not defaulting to the most expensive option.
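If you want intuition for what a router is doing, here's a deliberately crude sketch. The length and keyword heuristics are mine for illustration; RelayPlane's actual complexity scoring is not documented here and is certainly more sophisticated:

```python
# Sketch of complexity-based routing: pick the cheapest model that can
# plausibly handle the prompt. This heuristic (keywords + length +
# tool-use flag) is an illustration, not RelayPlane's actual algorithm.

HARD_SIGNALS = ("architecture", "refactor", "plan", "prove", "multi-step")

def route(prompt: str, tool_use: bool = False) -> str:
    text = prompt.lower()
    if any(signal in text for signal in HARD_SIGNALS):
        return "opus-4.6"    # long-chain reasoning: worth the premium
    if tool_use or len(prompt) > 500:
        return "sonnet-4.6"  # handles most agent tasks well
    return "haiku-4.5"       # short classification/extraction tasks

print(route("Classify this ticket as bug or feature"))               # haiku-4.5
print(route("Use the search tool to find the file", tool_use=True))  # sonnet-4.6
print(route("Plan the migration architecture"))                      # opus-4.6
```

Even a crude router like this captures the core economics: most traffic never needs to touch the expensive model.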
Setting Budget Caps to Prevent Surprises
After my first surprise bill, I set hard budget caps. You should too.
At the agent level, if you're using Claude Code for coding tasks, the --max-budget-usd flag kills the session when it hits your limit:
```
claude --max-budget-usd 5.00 --message "implement the auth module"
```

At the proxy level, RelayPlane lets you set daily spending caps that reject requests once you've hit the threshold. This is your circuit breaker for runaway agent sessions.
The combination I run: $5 per coding session, $50 daily cap across all agents, with Slack alerts at 80% of each threshold. I haven't had a surprise bill since.
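The cap-plus-alert logic is simple enough to sketch end to end. This mirrors the circuit-breaker behavior described above in generic form; it is not RelayPlane's implementation:

```python
# Sketch of a daily spending cap with an alert threshold, mirroring the
# proxy-level circuit breaker described above. Illustrative only.
class BudgetGuard:
    def __init__(self, daily_cap_usd: float, alert_at: float = 0.80):
        self.cap = daily_cap_usd
        self.alert_at = alert_at
        self.spent = 0.0
        self.alerted = False

    def record(self, cost_usd: float) -> str:
        if self.spent + cost_usd > self.cap:
            return "reject"  # circuit breaker: refuse the request
        self.spent += cost_usd
        if not self.alerted and self.spent >= self.cap * self.alert_at:
            self.alerted = True
            return "alert"   # e.g. fire a Slack notification at 80%
        return "ok"

guard = BudgetGuard(daily_cap_usd=50.00)
print(guard.record(30.00))  # ok
print(guard.record(12.00))  # alert (crossed 80% of $50)
print(guard.record(10.00))  # reject (would exceed $50)
```

Checking the threshold before spending, rather than after, is the design choice that makes this a true circuit breaker instead of a lagging alarm.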
The Bottom Line
OpenClaw API costs are Anthropic costs. The platform passes through every token at face value. But without per-request tracking, you're flying blind on where those tokens go, and context window creep will inflate your bill faster than you expect.
My stack for cost control: short sessions, aggressive sub-agent splitting, complexity-based model routing, and RelayPlane for full per-request visibility. Total setup time is under 10 minutes, and it paid for itself in the first week.
Your agents are burning tokens right now. The question is whether you know where.
RelayPlane is open source and free to use locally. The npm package is @relayplane/proxy. No account required to get started.