Introduction

v0.1.0

RelayPlane is a production-grade AI model routing and optimization platform. Route calls across Claude, GPT-4, Gemini, and custom models with automatic fallback, cost optimization, and comprehensive observability.

Why RelayPlane?

Relay Optimize™

Intelligent fallback, latency routing, and cost ceiling controls with sub-100ms overhead.

Unified SDK

Single interface for all major AI providers. Works locally or hosted with one config change.

Open Source Core

SDK works without signup. Add RELAY_API_KEY for hosted optimization features.

Production Ready

Enterprise-grade logging, usage metering, and comprehensive error handling.

Use Cases

RelayPlane enables a wide range of applications where AI model reliability and cost control matter:

  • Fallback systems - Automatically retry failed calls with different models
  • Cost optimization - Route to cheapest model that meets quality requirements
  • Multi-agent workflows - Chain specialized models together for complex tasks
  • A/B testing - Compare model performance across providers (sketched below)
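
For instance, the A/B testing case can be approximated with two parallel relay() calls. This is a minimal sketch, assuming both providers accept the messages-style payloads used elsewhere in this guide and that gpt-4o is enabled on your account:

import { relay } from '@relayplane/sdk';

// Send the same prompt to two models in parallel and compare latency.
const prompt = [{ role: 'user', content: 'Summarize the plot of Hamlet in one sentence.' }];

const [claude, gpt] = await Promise.all([
  relay({ to: 'claude-sonnet-4-20250514', payload: { max_tokens: 200, messages: prompt } }),
  relay({ to: 'gpt-4o', payload: { max_tokens: 200, messages: prompt } }),
]);

console.log('Claude latency:', claude.latency_ms, 'ms');
console.log('GPT-4o latency:', gpt.latency_ms, 'ms');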

Quickstart

Get started with RelayPlane in minutes. This guide covers installation, basic usage, and upgrading to hosted features.

Prerequisites

  • Node.js 16 or higher
  • API keys for the AI models you want to use (Anthropic, OpenAI, etc.)
  • Optional: RelayPlane API key for hosted features

Installation

Install the RelayPlane SDK using npm:

npm install @relayplane/sdk

Or using yarn:

yarn add @relayplane/sdk

Basic Usage

Here's a simple example using RelayPlane to call Claude:

import { relay } from '@relayplane/sdk';

// Local mode - works without RelayPlane API key
const response = await relay({
  to: 'claude-3-sonnet',
  payload: {
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'What is the capital of France?' }
    ]
  }
});

console.log(response.body.content[0].text);
// → Paris is the capital of France.
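
Because the relay() interface is provider-agnostic, switching models is a one-line change. Here's a hedged sketch targeting an OpenAI model; it assumes gpt-4o is available on your OpenAI key and that response.body contains the provider's native response shape, as described in the Response Format section below:

import { relay } from '@relayplane/sdk';

// Same call shape, different target - the payload stays provider-specific.
const response = await relay({
  to: 'gpt-4o',
  payload: {
    model: 'gpt-4o',
    messages: [
      { role: 'user', content: 'What is the capital of France?' }
    ]
  }
});

// OpenAI-style responses expose the text under choices[0].message.content.
console.log(response.body.choices[0].message.content);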

Hosted Mode with Optimization

Upgrade to hosted mode for optimization features:

import { configure, optimize } from '@relayplane/sdk';

// Configure with RelayPlane API key
configure({ 
  apiKey: process.env.RELAY_API_KEY 
});

// Use optimization features
const response = await optimize({
  to: 'gpt-4',
  payload: {
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Complex analysis task' }]
  }
}, {
  strategy: 'fallback',
  fallbackChain: ['gpt-4', 'claude-3-sonnet', 'gpt-3.5-turbo'],
  maxCost: 0.50
});

Authentication

RelayPlane supports two authentication modes:

Local Mode (No API Key)

Works with your existing provider API keys:

# Set provider API keys in environment
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="AIza..."

// Use relay without a RelayPlane API key
import { relay } from '@relayplane/sdk';
const response = await relay({ to: 'claude-sonnet-4-20250514', payload: {...} });

Hosted Mode (RelayPlane API Key)

Get a RelayPlane API key for optimization features:

Bring Your Own Keys (BYOK)

RelayPlane uses a BYOK architecture - you bring your own provider keys (OpenAI, Anthropic, etc.) and RelayPlane routes requests intelligently. Your data never passes through our systems.

Get your API key: Sign up at https://relayplane.com/dashboard

# Set RelayPlane API key
export RELAY_API_KEY="relay-key-..."

// Configure the SDK
import { configure } from '@relayplane/sdk';
configure({ apiKey: process.env.RELAY_API_KEY });

API Key Headers

For HTTP requests, include your API key in the header:

curl -H "x-api-key: your-relay-api-key" https://api.relayplane.com/api/health
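
The same header works from application code. Here's a minimal TypeScript sketch of a health check using the endpoint from the curl example above (it assumes RELAY_API_KEY is set in your environment):

// Minimal health check against the hosted API.
const res = await fetch('https://api.relayplane.com/api/health', {
  headers: { 'x-api-key': process.env.RELAY_API_KEY! }
});

console.log(res.ok ? 'RelayPlane API reachable' : `Health check failed: ${res.status}`);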

Relay API

The core RelayPlane function routes your requests to AI models with optional optimization.

✨ New in v1.2.0: The SDK now automatically infers payload.model from the to field, eliminating the need to specify the model twice!

Basic Relay

import { relay } from '@relayplane/sdk';

const response = await relay({
  to: 'claude-sonnet-4-20250514',     // Target model (payload.model is inferred from this)
  payload: {                          // Model-specific payload  
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'Your prompt here' }
    ]
  },
  metadata: {                         // Optional tracking data
    user_id: 'user-123',
    session: 'session-456'
  }
});

console.log(response.relay_id);      // Unique request ID
console.log(response.latency_ms);    // Request latency
console.log(response.body);          // Model response

Supported Models

Anthropic (Latest)

  • claude-opus-4-20250514
  • claude-sonnet-4-20250514
  • claude-3-7-sonnet-20250219
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-20241022
  • + legacy models

OpenAI (Latest)

  • gpt-4.1
  • o4-mini
  • o3
  • o3-pro
  • o3-mini
  • gpt-4o
  • gpt-4o-mini
  • + legacy models

Google (Latest)

  • gemini-2.5-pro
  • gemini-2.5-flash
  • gemini-2.0-flash
  • gemini-1.5-pro
  • gemini-1.5-flash
  • + legacy models

Meta & Mistral

  • llama-4-maverick-17b-128e
  • llama-4-scout-17b-16e
  • llama-3.3-70b
  • mistral-large-2411
  • codestral-2501
  • + more models

Response Format

{
  "relay_id": "evt_123",           // Unique request identifier
  "status_code": 200,              // HTTP status from provider
  "latency_ms": 450,               // Total request latency
  "body": {...},                   // Provider response body
  "fallback_used": false           // Whether fallback was triggered
}
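
For TypeScript users, these fields can be captured in a small type. This is an illustrative sketch derived from the fields documented above, not an official type exported by the SDK:

// Illustrative shape of a relay response, based on the documented fields.
interface RelayResponse {
  relay_id: string;       // Unique request identifier, e.g. "evt_123"
  status_code: number;    // HTTP status returned by the provider
  latency_ms: number;     // Total request latency in milliseconds
  body: unknown;          // Provider-specific response body
  fallback_used: boolean; // Whether a fallback model handled the request
}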

Streaming

Stream responses token-by-token for real-time applications:

import { relay } from '@relayplane/sdk';

// Enable streaming in payload  
const response = await relay({
  to: 'claude-sonnet-4-20250514',
  payload: {
    max_tokens: 1000,
    stream: true,                    // Enable streaming
    messages: [
      { role: 'user', content: 'Write a poem about AI' }
    ]
  }
});

// Handle streaming response
for await (const chunk of response.stream) {
  process.stdout.write(chunk);
}
console.log('\nStreaming complete!');
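
As an alternative to printing each chunk, you can accumulate the stream into the full completion. A small variation on the loop above:

// Accumulate streamed chunks into the full completion text.
let fullText = '';
for await (const chunk of response.stream) {
  fullText += chunk;
}
console.log(`Received ${fullText.length} characters.`);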

HTTP Streaming

Use Server-Sent Events for web applications:

fetch('https://api.relayplane.com/api/relay', {
  method: 'POST',
  headers: {
    'x-api-key': 'your-relay-api-key',
    'Content-Type': 'application/json',
    'Accept': 'text/event-stream'
  },
  body: JSON.stringify({
    to: 'claude-sonnet-4-20250514',
    payload: { stream: true, /* ... */ }
  })
}).then(async (response) => {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  // Read streamed chunks until the connection closes
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(decoder.decode(value, { stream: true }));
  }
});

Error Handling

RelayPlane provides comprehensive error handling and retry logic with BYOK security:

Secure by Design

With BYOK architecture, your API keys are encrypted and stored securely. RelayPlane never sees your actual data - we only route requests and return responses.

import { relay, RelayError, RelayTimeoutError } from '@relayplane/sdk';

try {
  const response = await relay({
    to: 'claude-sonnet-4-20250514',
    payload: { /* ... */ }
  });
} catch (error) {
  if (error instanceof RelayTimeoutError) {
    console.log('Request timed out');
  } else if (error instanceof RelayError) {
    console.log('Relay error:', error.message);
    console.log('Error code:', error.code);
  } else {
    console.log('Unexpected error:', error);
  }
}

Common Error Codes

Code | Description  | Action
400  | Bad Request  | Check payload format
401  | Unauthorized | Verify API key
429  | Rate Limited | Implement exponential backoff
500  | Server Error | Retry with fallback
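
Following the table above, a 500 from the primary provider can be retried against a different model. This is a hand-rolled sketch that assumes error.code surfaces the numeric status code; the hosted fallback strategy shown earlier handles this for you automatically:

import { relay, RelayError } from '@relayplane/sdk';

// Manual fallback: retry on a provider server error, per the guidance above.
// Assumption: error.code carries the numeric status code from the table.
async function relayWithManualFallback(primary: string, backup: string, payload: object) {
  try {
    return await relay({ to: primary, payload });
  } catch (error) {
    if (error instanceof RelayError && error.code === 500) {
      return relay({ to: backup, payload });
    }
    throw error;
  }
}

const response = await relayWithManualFallback(
  'claude-sonnet-4-20250514',
  'gpt-4o',
  { max_tokens: 1000, messages: [{ role: 'user', content: 'Hello!' }] }
);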

Rate Limits

RelayPlane enforces rate limits based on your plan:

Plan       | Price      | Monthly Limit | Key Features
Developer  | $0/month   | 100K calls    | Basic routing, 1 provider key
Solo-Pro   | $29/month  | 500K calls    | Optimize™ & caching, 3 keys
Team       | $99/month  | 2M calls      | Team workspace, 10 keys
Growth     | $399/month | 20M calls     | SSO & analytics, 25 keys
Enterprise | $999/month | Unlimited     | SOC2 + Private deploy

BYOK Security

All plans include Bring Your Own Keys (BYOK) architecture. Your API keys and data stay with you - RelayPlane only routes requests and provides optimization.

Handling Rate Limits

import { relay, RelayRateLimitError } from '@relayplane/sdk';

async function relayWithBackoff(request, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await relay(request);
    } catch (error) {
      if (error instanceof RelayRateLimitError && attempt < maxRetries) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
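
The helper above can wrap any relay request, for example:

// Use the backoff helper in place of a direct relay() call.
const response = await relayWithBackoff({
  to: 'claude-sonnet-4-20250514',
  payload: {
    max_tokens: 1000,
    messages: [{ role: 'user', content: 'Hello!' }]
  }
});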

Logging

RelayPlane provides comprehensive logging and observability:

Request Logs

Every relay call is automatically logged with:

  • Unique request ID
  • Model used and latency
  • Fallback events (if applicable)
  • Cost estimates
  • Custom metadata

Dashboard

View request logs in the RelayPlane dashboard at https://relayplane.com/dashboard.

Next Steps

Now that you understand the core concepts, you're ready to integrate RelayPlane into your application and explore the hosted optimization features.

Need help?

If you have any questions or run into issues, check out the resources at https://relayplane.com.