Anthropic Just Made Model Routing a Billing Decision
Starting today at noon PT, Claude Pro and Max subscriptions no longer cover third-party tools. OpenClaw, Cline, aider, Roo Code, Cursor — any harness that isn't Anthropic's own Claude Code — is now outside the subscription boundary. You can still use these tools with your Claude account, but the usage runs on pay-as-you-go billing, stacked on top of your existing subscription fee.
Anthropic is softening the blow with a one-time credit equal to your monthly plan price (up to $200 for the $200/month Max tier) for Max subscribers, which expires April 17. If you're a heavy user, that credit will last you about a week at current consumption levels. Claude Code itself is exempt. It's first-party, so Anthropic carved it out. Everything else — including the tools most power users actually rely on — is now metered.
The Real Math
Here are the current API rates:
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Opus 4.6 | $5 | $25 |
| Claude Sonnet 4.6 | $3 | $15 |
| Claude Haiku 4.5 | $1 | $5 |
Those numbers need some translation into real-world cost. A single session where you feed Opus a 200K context window — say a large codebase with relevant files — costs roughly $1 just for the input. That's before Opus writes a single line back.
A heavy OpenClaw week, running multiple coding sessions per day, easily lands between $50 and $200 depending on your workflow. If you're doing orchestration with long contexts and frequent turns, you're closer to the top of that range.
The problem isn't the pricing itself. The problem is the habit. Most people using Claude subscriptions defaulted to Opus because they had no reason not to. The model was included, the tier was flat-rate, so why use anything else? That habit is now a billing decision. Every Opus call costs 1.7x what a Sonnet call costs, and 5x what Haiku costs for input.
Three Paths Forward
Option A: Enable anthropic extra usage and buy discounted bundles.
Anthropic will let you pre-purchase usage credits at a slight discount. You flip on extra usage in your account settings, point your third-party tools at your existing account, and the overage bills against your credit balance. Simplest path forward operationally. Also the most expensive. Option A is the worst deal unless you're a casual user who occasionally fires up OpenClaw for one-off tasks.
Option B: Switch to a direct API key, select models manually.
You create an API key at console.anthropic.com, update your tool configuration to point at the key instead of your subscription account, and then pick your model per-task by hand. This works. If you're disciplined, you can keep costs extremely low by defaulting to Haiku and only reaching for Sonnet or Opus when the task actually requires it. The failure mode: you get lazy, forget to switch, run three hours of Opus sessions, and wake up to a $60 bill. This path has no guardrails.
Option C: API key plus a smart routing layer.
Same API key setup as Option B, but you add a local proxy that classifies each request by complexity and routes to the right model automatically. You never think about model selection. The router does it. This is the correct long-term answer.
Why Option C Is the Right Answer
Most of what OpenClaw does doesn't need Opus. Heartbeat checks, file summarization, research routing, draft generation — these are Haiku or Sonnet tasks. Opus is for reasoning under ambiguity: complex refactors, architectural decisions, novel problem-solving where the model actually needs to think hard.
The community has already figured this out. From r/openclaw, user mehdiweb:
“Moved 85% of our OpenClaw workflows to Haiku as default... The bill went from $140/mo to about $12.”
That's not a fluke. That's what happens when you match the model to the task instead of defaulting to the most powerful option available.
The mental shift is simple: stop asking “which model do I have access to” and start asking “which model does this task actually need.” A file read doesn't need Opus. A 3,000-line refactor might. Routing intelligently means you only pay Opus rates when Opus capability is actually required.
This is exactly what RelayPlane was built to do.
How RelayPlane Fits
RelayPlane is a local proxy that sits between your tools and the Anthropic API. It runs on your machine. Nothing leaves. No data sent to a third-party server.
Setup takes about 3 minutes:
npm install -g @relayplane/proxy
relayplane initPoint your OpenClaw (or Cline, or aider) at http://localhost:4100 as the API base URL, and you're done. RelayPlane handles model routing from there.
The routing logic is straightforward. Simple tasks like file reads, summarization, and short completions go to Sonnet. Complex reasoning tasks, long context analysis, and anything requiring careful multi-step thinking goes to Opus. You can configure the thresholds or let the defaults run.
What you get:
- Works with your own Anthropic API key, so you're paying API rates, not subscription markup
- Model routing cost optimization built in — no configuration required to get sensible defaults
- Cost tracking per session, so you can see exactly what each workflow is costing you
- Local-only — nothing phoning home
Full integration guide for OpenClaw at relayplane.com/integrations/openclaw.
The Bottom Line
This change was inevitable. Anthropic couldn't sustain a $200/month flat-rate plan that let heavy users run $2,000 worth of Opus calls. The math never worked. They held off as long as they could.
The good news: the people who route intelligently will spend less after this change than they did before. Not more. If you were paying $200/month and running Opus for everything, you'll pay more. If you were paying $200/month and running Haiku and Sonnet for 90% of your work, you'll pay less on a per-token basis and only reach for Opus when you genuinely need it.
The claude subscription third party exemption is gone. That's done. The question now is whether you adapt by adding a toggle and accepting higher bills, or whether you build a smarter setup that actually matches cost to value.
The toggle is faster. The smart setup is better.
Matt Turley is the founder of RelayPlane, a local model routing proxy for AI coding tools.