RelayPlane is an npm-native Node.js LLM proxy. Install with npm install @relayplane/proxy, point your ANTHROPIC_BASE_URL or OPENAI_BASE_URL at http://localhost:4100, and get per-request cost tracking, complexity-based model routing, and Ollama local fallback, no Docker required.

How does RelayPlane compare to LiteLLM?

LiteLLM is a Python library with a proxy server option. RelayPlane is npm-native for Node.js developers, no Python, no Docker, just npm install. RelayPlane has per-request cost tracking built in and integrates directly with Claude Code, Cursor, and OpenClaw via a simple base URL override.

How does RelayPlane compare to Portkey or Helicone?

Portkey and Helicone offer cloud observability and are npm-compatible. RelayPlane runs entirely locally (no cloud required), is open source under MIT, and focuses on cost tracking plus complexity-based routing for Node.js developers using Claude Code, Cursor, and OpenClaw.

Does RelayPlane work with Claude Code and Cursor?

Yes. RelayPlane is designed for Claude Code, Cursor, and any tool that supports ANTHROPIC_BASE_URL or OPENAI_BASE_URL. Set ANTHROPIC_BASE_URL=http://localhost:4100 and RelayPlane intercepts every request, tracks costs, and routes to cheaper models for simple tasks.

The core RelayPlane proxy and local dashboard are free forever. Open source (MIT), runs locally on your machine, your keys and prompts never leave your box. Pro ($19/mo) adds network routing intelligence and 90-day history. Max ($99/mo) adds team coordination, audit logs, and approval gates. See relayplane.com/pricing for the full breakdown.

How do I install the RelayPlane npm LLM proxy?

Run: npm install -g @relayplane/proxy. Then: relayplane init && relayplane start. The local dashboard runs at http://localhost:4100. Set ANTHROPIC_BASE_URL=http://localhost:4100 in your environment and all Claude requests route through RelayPlane automatically.

anthropicbillingrelayplaneopenclawmodel-routingcost-optimization

Anthropic Just Made Model Routing a Billing Decision

Name: RelayPlane
Author: Matt Turley

Matt Turley·April 4, 2026·7 min read

Starting today at noon PT, Claude Pro and Max subscriptions no longer cover third-party tools. OpenClaw, Cline, aider, Roo Code, Cursor, any harness that isn't Anthropic's own Claude Code, is now outside the subscription boundary. You can still use these tools with your Claude account, but the usage runs on pay-as-you-go billing, stacked on top of your existing subscription fee.

Anthropic is softening the blow with a one-time credit equal to your monthly plan price (up to $200 for the $200/month Max tier) for Max subscribers, which expires April 17. If you're a heavy user, that credit will last you about a week at current consumption levels. Claude Code itself is exempt. It's first-party, so Anthropic carved it out. Everything else, including the tools most power users actually rely on, is now metered.

The Real Math

Here are the current API rates:

Model	Input (per 1M tokens)	Output (per 1M tokens)
Claude Opus 4.6	$5	$25
Claude Sonnet 4.6	$3	$15
Claude Haiku 4.5	$1	$5

Those numbers need some translation into real-world cost. A single session where you feed Opus a 200K context window, say a large codebase with relevant files, costs roughly $1 just for the input. That's before Opus writes a single line back.

A heavy OpenClaw week, running multiple coding sessions per day, easily lands between $50 and $200 depending on your workflow. If you're doing orchestration with long contexts and frequent turns, you're closer to the top of that range.

The problem isn't the pricing itself. The problem is the habit. Most people using Claude subscriptions defaulted to Opus because they had no reason not to. The model was included, the tier was flat-rate, so why use anything else? That habit is now a billing decision. Every Opus call costs 1.7x what a Sonnet call costs, and 5x what Haiku costs for input.

Three Paths Forward

Option A: Enable anthropic extra usage and buy discounted bundles.

Anthropic will let you pre-purchase usage credits at a slight discount. You flip on extra usage in your account settings, point your third-party tools at your existing account, and the overage bills against your credit balance. Simplest path forward operationally. Also the most expensive. Option A is the worst deal unless you're a casual user who occasionally fires up OpenClaw for one-off tasks.

Option B: Switch to a direct API key, select models manually.

You create an API key at console.anthropic.com, update your tool configuration to point at the key instead of your subscription account, and then pick your model per-task by hand. This works. If you're disciplined, you can keep costs extremely low by defaulting to Haiku and only reaching for Sonnet or Opus when the task actually requires it. The failure mode: you get lazy, forget to switch, run three hours of Opus sessions, and wake up to a $60 bill. This path has no guardrails.

Option C: API key plus a smart routing layer.

Same API key setup as Option B, but you add a local proxy that classifies each request by complexity and routes to the right model automatically. You never think about model selection. The router does it. This is the correct long-term answer.

Why Option C Is the Right Answer

Most of what OpenClaw does doesn't need Opus. Heartbeat checks, file summarization, research routing, draft generation, these are Haiku or Sonnet tasks. Opus is for reasoning under ambiguity: complex refactors, architectural decisions, novel problem-solving where the model actually needs to think hard.

The community has already figured this out. From r/openclaw, user mehdiweb:

“Moved 85% of our OpenClaw workflows to Haiku as default... The bill went from $140/mo to about $12.”

That's not a fluke. That's what happens when you match the model to the task instead of defaulting to the most powerful option available.

The mental shift is simple: stop asking “which model do I have access to” and start asking “which model does this task actually need.” A file read doesn't need Opus. A 3,000-line refactor might. Routing intelligently means you only pay Opus rates when Opus capability is actually required.

This is exactly what RelayPlane was built to do.

How RelayPlane Fits

RelayPlane is a local proxy that sits between your tools and the Anthropic API. It runs on your machine. Nothing leaves. No data sent to a third-party server.

Setup takes about 3 minutes:

npm install -g @relayplane/proxy
relayplane init

Point your OpenClaw (or Cline, or aider) at http://localhost:4100 as the API base URL, and you're done. RelayPlane handles model routing from there.

The routing logic is straightforward. Simple tasks like file reads, summarization, and short completions go to Sonnet. Complex reasoning tasks, long context analysis, and anything requiring careful multi-step thinking goes to Opus. You can configure the thresholds or let the defaults run.

What you get:

Works with your own Anthropic API key, so you're paying API rates, not subscription markup
Model routing cost optimization built in, no configuration required to get sensible defaults
Cost tracking per session, so you can see exactly what each workflow is costing you
Local-only, nothing phoning home

Full integration guide for OpenClaw at relayplane.com/integrations/openclaw.

The Bottom Line

This change was inevitable. Anthropic couldn't sustain a $200/month flat-rate plan that let heavy users run $2,000 worth of Opus calls. The math never worked. They held off as long as they could.

The good news: the people who route intelligently will spend less after this change than they did before. Not more. If you were paying $200/month and running Opus for everything, you'll pay more. If you were paying $200/month and running Haiku and Sonnet for 90% of your work, you'll pay less on a per-token basis and only reach for Opus when you genuinely need it.

The claude subscription third party exemption is gone. That's done. The question now is whether you adapt by adding a toggle and accepting higher bills, or whether you build a smarter setup that actually matches cost to value.

The toggle is faster. The smart setup is better.

Matt Turley is the founder of RelayPlane, a local model routing proxy for AI coding tools.

All posts