RelayPlane is an npm-native Node.js LLM proxy. Install with npm install @relayplane/proxy, point your ANTHROPIC_BASE_URL or OPENAI_BASE_URL at http://localhost:4100, and get per-request cost tracking, complexity-based model routing, and Ollama local fallback, no Docker required.

How does RelayPlane compare to LiteLLM?

LiteLLM is a Python library with a proxy server option. RelayPlane is npm-native for Node.js developers: no Python, no Docker, just npm install. It has per-request cost tracking built in and integrates directly with Claude Code, Cursor, and OpenClaw via a simple base URL override.

How does RelayPlane compare to Portkey or Helicone?

Portkey and Helicone offer cloud observability and are npm-compatible. RelayPlane runs entirely locally (no cloud required), is open source under MIT, and focuses on cost tracking plus complexity-based routing for Node.js developers using Claude Code, Cursor, and OpenClaw.

Does RelayPlane work with Claude Code and Cursor?

Yes. RelayPlane is designed for Claude Code, Cursor, and any tool that supports ANTHROPIC_BASE_URL or OPENAI_BASE_URL. Set ANTHROPIC_BASE_URL=http://localhost:4100 and RelayPlane intercepts every request, tracks costs, and routes to cheaper models for simple tasks.

Is RelayPlane free?

The core RelayPlane proxy and local dashboard are free forever. It is open source (MIT) and runs locally on your machine, so your keys and prompts never leave your box. Pro ($19/mo) adds network routing intelligence and 90-day history. Max ($99/mo) adds team coordination, audit logs, and approval gates. See relayplane.com/pricing for the full breakdown.

How do I install the RelayPlane npm LLM proxy?

Run: npm install -g @relayplane/proxy. Then: relayplane init && relayplane start. The local dashboard runs at http://localhost:4100. Set ANTHROPIC_BASE_URL=http://localhost:4100 in your environment and all Claude requests route through RelayPlane automatically.
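
As a rough illustration of the base URL override, the sketch below builds an environment for a child process with both variables pointed at the local proxy. The helper name withProxyBaseUrls is hypothetical and not part of RelayPlane; only the two variable names and the http://localhost:4100 address come from the text above.

```typescript
// Hypothetical helper, not a RelayPlane API: returns a copy of an environment
// with both provider base URLs pointed at a local proxy.
type Env = Record<string, string | undefined>;

// Address quoted in the text above; change it if your proxy listens elsewhere.
const DEFAULT_PROXY_URL = "http://localhost:4100";

function withProxyBaseUrls(env: Env, proxyUrl: string = DEFAULT_PROXY_URL): Env {
  // Fail early on an obviously malformed URL instead of inside the launched tool.
  if (!/^https?:\/\//.test(proxyUrl)) {
    throw new Error(`Proxy URL must start with http:// or https://, got: ${proxyUrl}`);
  }
  return {
    ...env, // keep PATH, API keys, and everything else untouched
    ANTHROPIC_BASE_URL: proxyUrl,
    OPENAI_BASE_URL: proxyUrl,
  };
}
```

In a Node.js script, the result can be passed as the env option of child_process.spawn, for example with withProxyBaseUrls(process.env), so that only the launched tool sees the override.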

Tags: anthropic, claude-max, extra-usage, api-costs, cost-optimization, relayplane

Anthropic Is Now Charging Per Token for Third-Party Tools on Max and Pro

Name: RelayPlane
Author: Matt Turley

Matt Turley · April 4, 2026 · 6 min read

At noon Pacific time on April 4, 2026, Anthropic changed how Max and Pro subscriptions handle third-party tools. If you use Cline, Roo Code, aider, OpenCode, OpenClaw, or any tool that isn't Claude Code or Claude.ai, usage on your subscription token now bills at per-token API rates, on top of the subscription fee.

Claude Code and Claude.ai are still included in the flat rate. Everything else is "extra usage," and extra usage costs money.

What Actually Changed

Before today, a Max subscriber could point any tool at their subscription token and pay nothing beyond the $200/mo. Cline, aider, custom scripts, third-party agents: all covered. That's over.

To use third-party tools now, you need to enable extra usage in your account settings. When it's on, every token from those tools bills at standard API rates. When it's off, third-party tools won't work at all.

There's one more constraint that doesn't get much attention: the Max and Pro subscription token path is two-tier only. You get Sonnet 4.6 for simpler requests and Opus 4.6 for complex ones. Haiku is not available on this path at all, regardless of extra usage settings. If you want Haiku, you need a full API key. The per-token rates on the subscription path are:

Model	Input (per 1M tokens)	Output (per 1M tokens)
Opus 4.6	$5	$25
Sonnet 4.6	$3	$15
Haiku	Not available	Not available

The Math for a Typical Developer

Let's use a realistic aider or Cline session. A moderate coding task, reading some file context and going back and forth a few times, runs around 50K input tokens and 10K output tokens. That's not a long session.

Model	Input cost	Output cost	Session total
Opus 4.6	$0.25	$0.25	$0.50
Sonnet 4.6	$0.15	$0.15	$0.30

Five sessions a day on Opus adds up to $2.50/day, around $75/month. On Sonnet, the same volume is $45/month. Neither figure includes the $200 Max subscription you're also paying.
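
To plug in your own numbers, here is a minimal sketch of the same arithmetic. The rates are the ones from the table above; the model labels, function names, and the 30-day month are assumptions made for illustration, and the labels are not API model IDs.

```typescript
// Per-million-token rates in USD, copied from the table above.
// The keys are illustrative labels, not API model identifiers.
type Model = "opus-4.6" | "sonnet-4.6";

interface Rate {
  inputPerM: number;
  outputPerM: number;
}

const RATES: Record<Model, Rate> = {
  "opus-4.6": { inputPerM: 5, outputPerM: 25 },
  "sonnet-4.6": { inputPerM: 3, outputPerM: 15 },
};

// Cost of one session in USD.
function sessionCost(model: Model, inputTokens: number, outputTokens: number): number {
  const rate = RATES[model];
  return (inputTokens / 1e6) * rate.inputPerM + (outputTokens / 1e6) * rate.outputPerM;
}

// Monthly cost in USD, assuming the same session repeats every day of a 30-day month.
function monthlyCost(
  model: Model,
  inputTokens: number,
  outputTokens: number,
  sessionsPerDay: number,
  daysPerMonth: number = 30,
): number {
  return sessionCost(model, inputTokens, outputTokens) * sessionsPerDay * daysPerMonth;
}

// The session from the text: 50K input tokens, 10K output tokens, five times a day.
const opusMonthly = monthlyCost("opus-4.6", 50_000, 10_000, 5);
const sonnetMonthly = monthlyCost("sonnet-4.6", 50_000, 10_000, 5);
```

With the figures from the text, opusMonthly comes to about $75 and sonnetMonthly to about $45, matching the numbers above.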

If your previous setup sent everything to Opus because "flat rate anyway," that habit just got expensive.

The Credit: Claim It Before April 17

Anthropic is giving existing subscribers a one-time credit: $100 for Pro, $200 for Max. You have until April 17, 2026. Head to console.anthropic.com, find the credit section, and redeem it. It takes under a minute and the deadline is real.

That credit buys you some runway to figure out your actual usage pattern before real bills accumulate. Use it.

What Smart Model Selection Looks Like Now

The two-tier path (Sonnet for simple, Opus for complex) used to be a theoretical optimization. Now it's a billing decision made on every request.

Simple requests: file reads, formatting checks, git status queries, docstring generation. These don't need Opus. They barely need Sonnet. On a full API key you'd route them to Haiku and pay almost nothing. On the subscription token path Sonnet is the floor, but at $3/$15 versus $5/$25 per million tokens it's still 40% cheaper per token than Opus.

Complex requests: architecture decisions, debugging multi-file issues, reasoning through an ambiguous spec. These benefit from Opus. Routing them to Sonnet to save money will cost you differently, in rework and wrong output.

The practical question isn't "which model is better." It's "which model is sufficient." That distinction now has a clear dollar value attached to it.
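
To make the "sufficient model" idea concrete, here is a deliberately naive sketch of a per-request tier picker. Everything in it is an assumption for illustration: the keyword list, the file-count threshold, and the names pickTier and RoutingInput. It is not how RelayPlane or any other router scores complexity.

```typescript
type Tier = "sonnet" | "opus";

// Hypothetical description of a request; a real router would see far more signal.
interface RoutingInput {
  prompt: string;
  filesInContext: number;
}

// Illustrative keywords loosely based on the examples of complex work in the text.
const COMPLEX_HINTS = ["architecture", "debug", "refactor", "ambiguous", "design"];

// Arbitrary cutoff: treat anything touching more than this many files as multi-file work.
const MULTI_FILE_THRESHOLD = 3;

function pickTier(req: RoutingInput): Tier {
  const text = req.prompt.toLowerCase();
  const mentionsComplexWork = COMPLEX_HINTS.some((hint) => text.indexOf(hint) !== -1);
  const spansManyFiles = req.filesInContext > MULTI_FILE_THRESHOLD;
  // Default to the cheaper tier and escalate only when a signal fires.
  return mentionsComplexWork || spansManyFiles ? "opus" : "sonnet";
}
```

The design choice worth copying is the default: start from the cheaper tier and require a reason to escalate, rather than starting from Opus and looking for reasons to downgrade.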

Where Complexity Routing Fits In

Someone on r/ClaudeAI ran the numbers on a 60/40 simple/complex workload and posted a 73% cost reduction from routing by complexity. That figure will vary based on your specific task mix, but the direction is right. Defaulting to Opus for everything was reasonable before today. It isn't anymore.
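
You can estimate your own number with a short calculation. The sketch below assumes simple and complex requests use the same number of tokens on average, which is rarely exactly true, and the function name routingSavings is made up for this example. Because Sonnet is 60% of the Opus price on both input and output in the table above, a single rate pair is enough.

```typescript
// Fraction of spend saved by moving `simpleShare` of traffic from the expensive
// model to the cheap one, compared with sending everything to the expensive model.
// Assumes every request uses the same number of tokens on average.
function routingSavings(simpleShare: number, cheapRate: number, expensiveRate: number): number {
  if (simpleShare < 0 || simpleShare > 1) {
    throw new RangeError("simpleShare must be between 0 and 1");
  }
  if (expensiveRate <= 0) {
    throw new RangeError("expensiveRate must be positive");
  }
  const routed = simpleShare * cheapRate + (1 - simpleShare) * expensiveRate;
  return 1 - routed / expensiveRate;
}

// 60/40 simple/complex split at the subscription-path input rates ($3 vs $5).
const sixtyForty = routingSavings(0.6, 3, 5); // ≈ 0.24
```

At the Sonnet and Opus rates in the table above, a 60/40 split comes out to roughly 24% under these assumptions. A larger figure implies a wider price gap (for example, a cheaper tier such as Haiku on a full API key) or simple requests that carry more than their share of tokens.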

When Anthropic made this change, I updated my own setup to route through RelayPlane, which does this automatically. It sits between your tool and the API, scores the complexity of each request, and picks Sonnet or Opus accordingly. One environment variable, and every third-party tool gets routed correctly without touching each tool's config.

You can also do this manually. Cline, aider, and Roo Code all let you set a default model. Set it to Sonnet and override for specific tasks. The tradeoff is a static choice instead of a per-request one, but it's better than nothing and takes two minutes.

What to Do Right Now

Claim your credit at console.anthropic.com before April 17. Then check what model your third-party tools default to and change it to Sonnet if it's Opus. For light usage, that manual change is probably enough. For heavier usage, per-request routing pays off faster, since every request that lands on Opus costs about 1.7x what it would on Sonnet.

The subscription isn't broken. Claude Code and Claude.ai still run at the flat rate. But if most of your usage came through third-party tools, you're working with a different cost model now. The sooner you get visibility into which requests are going where, the fewer surprises show up on next month's bill.

RelayPlane is open source: github.com/RelayPlane/proxy. Package: @relayplane/proxy on npm. Supports Anthropic, OpenAI, and 9 other providers. Last verified: 2026-04-04.
