# How to Route Claude API Requests Through a Local Proxy (With Automatic Cost Savings)
If you're running Claude in production, you're probably sending requests straight to api.anthropic.com. That works, but it means every request hits the same model at the same price, with no visibility into what you're spending per task.
A local proxy sits between your app and Anthropic's API. It intercepts requests, routes them to the right model based on complexity, tracks costs per request, and gives you a dashboard to see where your money goes. No code changes required on your end.
## Setting ANTHROPIC_BASE_URL
Claude's SDKs and tools all respect the ANTHROPIC_BASE_URL environment variable. Point it at your proxy and every request flows through it automatically.
```shell
export ANTHROPIC_BASE_URL=http://localhost:4100
```

That's it. Your existing code keeps calling `anthropic.messages.create()` exactly the same way. The SDK sends the request to your proxy instead of Anthropic directly.
This works with:
- Claude Code CLI (reads ANTHROPIC_BASE_URL from your shell)
- Python SDK (`anthropic.Anthropic()` picks it up automatically)
- TypeScript SDK (same behavior)
- Any OpenAI-compatible client (the proxy speaks both protocols)
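With the Python SDK, for example, the variable is read at client construction, so no code changes are needed. A minimal sketch (the fallback URL shown is the SDK's default endpoint):

```python
import os

# Point the SDK at a local proxy instead of api.anthropic.com.
os.environ["ANTHROPIC_BASE_URL"] = "http://localhost:4100"

# anthropic.Anthropic() picks up ANTHROPIC_BASE_URL automatically;
# you can also pass it explicitly:
#   client = anthropic.Anthropic(base_url=os.environ["ANTHROPIC_BASE_URL"])

base_url = os.environ.get("ANTHROPIC_BASE_URL", "https://api.anthropic.com")
print(base_url)  # → http://localhost:4100
```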
## What the Proxy Actually Does
A good proxy does three things that direct API calls don't give you:
1. Smart model routing. Not every request needs Opus. A proxy classifies incoming requests by complexity and routes simple tasks to Haiku, moderate tasks to Sonnet, and only sends genuinely complex work to Opus. You get the same quality output at a fraction of the cost.
2. Per-request cost tracking. Instead of one big Anthropic invoice at month end, you see exactly what each request cost, which model handled it, and how many tokens it consumed. This is the difference between "AI costs $2,000/month" and "the summarization pipeline costs $0.003 per document."
3. Budget enforcement. Set daily or weekly spending limits. The proxy blocks requests that would exceed your budget instead of letting costs run away while you sleep.
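The routing idea in point 1 can be sketched in a few lines. The keyword-and-length heuristic and the tier names below are illustrative assumptions, not RelayPlane's actual classifier; real deployments would map each tier to a concrete model ID:

```python
# Hypothetical complexity classifier: route by keywords and length.
def classify(prompt: str) -> str:
    hard_keywords = ("refactor", "prove", "analyze", "debug")
    if any(k in prompt.lower() for k in hard_keywords):
        return "complex"
    if len(prompt) > 500:
        return "moderate"
    return "simple"

# Tier -> model mapping (tiers stand in for concrete model IDs).
ROUTES = {
    "simple": "haiku",
    "moderate": "sonnet",
    "complex": "opus",
}

def route(prompt: str) -> str:
    """Pick a model tier for an incoming request."""
    return ROUTES[classify(prompt)]

print(route("Format this date: 2024-01-01"))  # → haiku
```

The key design point is that classification happens per request, so a single client can transparently fan out across all three models.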
## Setting Up RelayPlane in 3 Lines
RelayPlane is an open-source proxy built for this exact use case. It's a single npm package, no Docker, no infrastructure.
```shell
npm install -g @relayplane/proxy
relayplane init
export ANTHROPIC_API_KEY=sk-ant-...
relayplane start
```

The proxy starts on port 4100. Set `ANTHROPIC_BASE_URL=http://localhost:4100` and you're routing through it.
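To sanity-check the setup, you can send one request through the proxy by hand. This sketch assumes the proxy forwards standard Anthropic Messages API requests unchanged; the model name is illustrative:

```shell
export ANTHROPIC_BASE_URL=http://localhost:4100

# Only send the test request if the proxy is actually reachable.
if curl -sf -o /dev/null "$ANTHROPIC_BASE_URL"; then
  curl -s "$ANTHROPIC_BASE_URL/v1/messages" \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "content-type: application/json" \
    -d '{"model":"claude-3-5-haiku-latest","max_tokens":64,"messages":[{"role":"user","content":"ping"}]}'
else
  echo "proxy not reachable at $ANTHROPIC_BASE_URL"
fi
```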
The dashboard at http://localhost:4100 shows real-time cost tracking, model usage breakdown, and request history. No separate analytics tool needed.
## Real Cost Savings: Haiku vs Opus Routing
Here's what this looks like with real numbers. Say you're running an AI agent that makes 1,000 requests per day. Without routing, everything goes to Claude Sonnet:
| Scenario | Model | Cost per 1K requests (est.) |
|---|---|---|
| No proxy | All Sonnet | ~$15 |
| With routing | 70% Haiku, 25% Sonnet, 5% Opus | ~$4.50 |
That's roughly a 70% cost reduction without changing your prompts, your code, or your output quality. The simple requests (status checks, formatting, classification) go to Haiku where they belong. The hard requests (code generation, analysis, reasoning) still get Opus.
Over a month, that's the difference between a $450 API bill and a $135 one. Over a year, you're saving nearly $4,000 on a single workload.
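The monthly and annual figures follow directly from the per-day estimates in the table. A quick check of the arithmetic:

```python
# Back-of-envelope savings from the per-day estimates above.
daily_without = 15.00  # all-Sonnet, ~$15 per 1K requests/day
daily_with = 4.50      # 70/25/5 Haiku/Sonnet/Opus mix

monthly_without = daily_without * 30                     # $450
monthly_with = daily_with * 30                           # $135
annual_savings = (monthly_without - monthly_with) * 12   # $3,780

print(monthly_without, monthly_with, annual_savings)
```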
## When You Actually Need This
If you're making fewer than 50 Claude API calls per day, the direct API is fine. The overhead of running a proxy isn't worth it at that scale.
But if you're building AI agents, running batch processing, or serving multiple users through Claude, a local proxy pays for itself on day one. The cost visibility alone is worth the 3-minute setup, even before you factor in automatic model routing.
The proxy is open source (MIT), runs locally, and your API keys never leave your machine. Your data flows through localhost, not a third-party service.
```shell
npm install -g @relayplane/proxy
relayplane init
relayplane start
# That's it. Check http://localhost:4100 for your dashboard.
```