ai-agents · agent-ops · security · relayplane · build-in-public

Let Your Agents Cook

Matt Turley · 4 min read

Most people building with AI agents skip the boring part.

They wire up a swarm, let it produce output, and ship whatever comes out. No review. No verification. Just vibes.

I did that too. Then my security agent caught live API credentials committed to git. By an agent I built.

So now every agent output goes through a mandatory pipeline before it touches production. Three pipelines, no exceptions.

The Pipelines

Code: @coder builds, @sentinel reviews security, @verifier validates tests. Only then does it ship.

Content: @writer drafts, verification catches hallucinations and wrong URLs, then approval.

Research: @hunter/@scout gather signals, findings get distilled into the backlog. No raw research gets acted on.
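In data terms, each pipeline is just an ordered list of steps. Here's a minimal sketch; the dict layout and the `next_step` helper are illustrative, not the actual implementation, and the step names for content and research are paraphrased from the descriptions above:

```python
from typing import Optional

# Ordered step lists per pipeline. Agent names match the post;
# the structure itself is a hypothetical sketch.
PIPELINES = {
    "code":     ["coder", "sentinel", "verifier"],
    "content":  ["writer", "verification", "approval"],
    "research": ["hunter", "distill"],
}

def next_step(pipeline: str, completed: str) -> Optional[str]:
    """Return the step after `completed`, or None when the pipeline is done."""
    steps = PIPELINES[pipeline]
    i = steps.index(completed)
    return steps[i + 1] if i + 1 < len(steps) else None
```

Keeping the pipelines as data means the orchestrator never hard-codes "what comes after @coder"; it just looks it up.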

An orchestrator runs every 30 minutes. It checks completions, triggers the next pipeline step, and dispatches new work. When @sentinel fails a review, the fix gets auto-dispatched, re-reviewed, and ships only when clean. No human in the loop.

Why This Matters

Agents make the same classes of mistakes humans do. They commit secrets. They fail open on error paths. They default to permissive when they should default to restrictive.

Except agents do it faster and more confidently.

Today @sentinel flagged fail-open auth and a null-check bug that would've given users perpetual free access to RelayPlane. Caught. Fixed. Re-reviewed. Shipped clean.

The pipeline overhead is minimal compared to the cost of shipping a security hole. The security agent adds maybe 3 minutes per code task. It's caught fail-open auth logic, null checks that bypass tier enforcement, and a credential leak.

What We Learned

Specialization is key. The coding agent should not review its own code. The writing agent should not fact-check its own claims. Separate agents with separate system prompts for each role.

The self-correction loop matters more than the initial review. When @sentinel fails a review, the system automatically dispatches a fix, then re-reviews. Some tasks go through two @sentinel reviews before shipping clean. No human intervention needed.
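That loop can be sketched in a few lines. `review` and `dispatch_fix` here are hypothetical callables standing in for the real @sentinel review and the auto-dispatched fix; the three-round cap is my assumption, not something stated above:

```python
MAX_ROUNDS = 3  # assumed cap; the real system's limit isn't specified

def review_loop(task, review, dispatch_fix):
    """Re-review until clean: fail -> auto-dispatch a fix -> review again."""
    for round_num in range(1, MAX_ROUNDS + 1):
        result = review(task)
        if result["passed"]:
            return {"shipped": True, "rounds": round_num}
        # Review failed: dispatch a fix attempt and loop back for re-review.
        task = dispatch_fix(task, result["findings"])
    # Never ship dirty output; escalate after exhausting rounds.
    return {"shipped": False, "rounds": MAX_ROUNDS}
```

The key property: the only exit that ships is a passing review, so a failed round can never leak into production.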

Pipeline overhead is not the bottleneck. Shipping a security hole is the bottleneck. Review adds minutes, not hours.

The Credential Incident

Early on we let agents ship output directly. That ended when our security review agent caught live API credentials committed to git by the coding agent. Credentials for Reddit and Twitter APIs, sitting in git history with no .gitignore to prevent it.

We immediately added mandatory pipelines for everything. Not optional. Not “for important stuff.” Everything.

The agents do the work. The pipeline makes sure it's safe to ship. That's the deal.

Setup

If you're running agent infrastructure, the orchestrator pattern is straightforward:

  1. Agents complete tasks and report results to a shared log
  2. Orchestrator cron picks up completions and triggers the next pipeline step
  3. Each pipeline step runs in isolation with a fresh agent instance
  4. Only when all steps pass does output reach production
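The four steps above can be sketched as a single cron tick. The `Completion` record and the in-memory lists are stand-ins for the shared log and the real dispatch mechanism, and the pipeline definition is illustrative:

```python
from dataclasses import dataclass

# Illustrative pipeline; in practice this would cover code, content, and research.
PIPELINES = {"code": ["coder", "sentinel", "verifier"]}

@dataclass
class Completion:
    """One 'step finished' entry in the shared log."""
    pipeline: str
    step: str
    task: str
    processed: bool = False

def orchestrator_tick(log, dispatched, production):
    """One cron pass: pick up completions, trigger next steps, promote finished work."""
    for entry in log:
        if entry.processed:
            continue
        steps = PIPELINES[entry.pipeline]
        i = steps.index(entry.step)
        if i + 1 < len(steps):
            # Next step runs in isolation with a fresh agent instance.
            dispatched.append((steps[i + 1], entry.task))
        else:
            # All steps passed: only now does output reach production.
            production.append(entry.task)
        entry.processed = True
```

Because each tick is idempotent over unprocessed entries, the cron can run as often as you like without double-dispatching work.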

RelayPlane attaches cost visibility and budget limits to every agent call, so the orchestrator can also track spending per pipeline run. When a task would blow the budget, it blocks before any tokens are burned.
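A budget gate like that reduces to a pre-dispatch check. This is an illustrative sketch, not RelayPlane's actual API; `estimate_cost` and `run_agent` are hypothetical callables:

```python
def dispatch_with_budget(task, estimate_cost, spent, budget, run_agent):
    """Block a task before any tokens are burned if it would exceed the budget."""
    projected = spent + estimate_cost(task)
    if projected > budget:
        # Blocked pre-dispatch: nothing was sent to the provider.
        return {"status": "blocked", "spent": spent}
    result = run_agent(task)
    return {"status": "done", "spent": spent + result["cost"]}
```

The order matters: the estimate runs before the agent call, so an over-budget task costs zero rather than "whatever it burned before we noticed."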

If you're shipping raw agent output to production, the question isn't whether something will go wrong. It's when, and how bad.


Running agents at scale? RelayPlane handles budget enforcement and multi-provider routing so your agents can cook without burning the kitchen down.