Introduction

v0.1.0

RelayPlane is a production-grade AI model routing and optimization platform. Route calls across Claude, GPT-4, Gemini, and custom models with automatic fallback, cost optimization, and comprehensive observability.

Why RelayPlane?

Relay Optimize™

Intelligent fallback, latency routing, and cost ceiling controls with sub-100ms overhead.

Unified SDK

Single interface for all major AI providers. Works locally or hosted with one config change.

Open Source Core

SDK works without signup. Add RELAY_API_KEY for hosted optimization features.

Production Ready

Enterprise-grade logging, usage metering, and comprehensive error handling.

Use Cases

RelayPlane enables a wide range of applications where AI model reliability and cost control matter:

  • Fallback systems - Automatically retry failed calls with different models
  • Cost optimization - Route to cheapest model that meets quality requirements
  • Multi-agent workflows - Chain specialized models together for complex tasks
  • A/B testing - Compare model performance across providers (sketched below)
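
For instance, the A/B testing case can be approximated with two parallel relay() calls. This is a minimal sketch, assuming both providers accept the messages-style payloads used elsewhere in this guide and that gpt-4o is enabled on your account:

import { relay } from '@relayplane/sdk';

// Send the same prompt to two models in parallel and compare latency.
const prompt = [{ role: 'user', content: 'Summarize the plot of Hamlet in one sentence.' }];

const [claude, gpt] = await Promise.all([
  relay({ to: 'claude-sonnet-4-20250514', payload: { max_tokens: 200, messages: prompt } }),
  relay({ to: 'gpt-4o', payload: { max_tokens: 200, messages: prompt } }),
]);

console.log('Claude latency:', claude.latency_ms, 'ms');
console.log('GPT-4o latency:', gpt.latency_ms, 'ms');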

Quickstart

Get started with RelayPlane in minutes. This guide covers installation, basic usage, and upgrading to hosted features.

Prerequisites

  • Node.js 16 or higher
  • API keys for the AI models you want to use (Anthropic, OpenAI, etc.)
  • Optional: RelayPlane API key for hosted features

Installation

Install the RelayPlane SDK using npm:

npm install @relayplane/sdk

Or using yarn:

yarn add @relayplane/sdk

Basic Usage

Here's a simple example using RelayPlane to call Claude:

import { relay } from '@relayplane/sdk';

// Local mode - works without RelayPlane API key
const response = await relay({
  to: 'claude-3-sonnet',
  payload: {
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'What is the capital of France?' }
    ]
  }
});

console.log(response.body.content[0].text);
// → Paris is the capital of France.
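
Because the relay() interface is provider-agnostic, switching models is a one-line change. Here's a hedged sketch targeting an OpenAI model; it assumes gpt-4o is available on your OpenAI key and that response.body contains the provider's native response shape, as described in the Response Format section below:

import { relay } from '@relayplane/sdk';

// Same call shape, different target - the payload stays provider-specific.
const response = await relay({
  to: 'gpt-4o',
  payload: {
    model: 'gpt-4o',
    messages: [
      { role: 'user', content: 'What is the capital of France?' }
    ]
  }
});

// OpenAI-style responses expose the text under choices[0].message.content.
console.log(response.body.choices[0].message.content);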

Hosted Mode with Optimization

Upgrade to hosted mode for optimization features:

import { configure, optimize } from '@relayplane/sdk';

// Configure with RelayPlane API key
configure({ 
  apiKey: process.env.RELAY_API_KEY 
});

// Use optimization features
const response = await optimize({
  to: 'gpt-4',
  payload: {
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Complex analysis task' }]
  }
}, {
  strategy: 'fallback',
  fallbackChain: ['gpt-4', 'claude-3-sonnet', 'gpt-3.5-turbo'],
  maxCost: 0.50
});

Authentication

RelayPlane supports two authentication modes:

Local Mode (No API Key)

Works with your existing provider API keys:

# Set provider API keys in environment
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="AIza..."

// Use relay without a RelayPlane API key
import { relay } from '@relayplane/sdk';
const response = await relay({ to: 'claude-sonnet-4-20250514', payload: {...} });

Hosted Mode (RelayPlane API Key)

Get a RelayPlane API key for optimization features:

Bring Your Own Keys (BYOK)

RelayPlane uses a BYOK architecture - you bring your own provider keys (OpenAI, Anthropic, etc.) and RelayPlane routes requests intelligently. Your data never passes through our systems.

Get your API key: Sign up at https://relayplane.com/dashboard

# Set RelayPlane API key
export RELAY_API_KEY="relay-key-..."

// Configure the SDK
import { configure } from '@relayplane/sdk';
configure({ apiKey: process.env.RELAY_API_KEY });

API Key Headers

For HTTP requests, include your API key in the header:

curl -H "x-api-key: your-relay-api-key" https://api.relayplane.com/api/health
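
The same header works from application code. Here's a minimal TypeScript sketch of a health check using the endpoint from the curl example above (it assumes RELAY_API_KEY is set in your environment):

// Minimal health check against the hosted API.
const res = await fetch('https://api.relayplane.com/api/health', {
  headers: { 'x-api-key': process.env.RELAY_API_KEY! }
});

console.log(res.ok ? 'RelayPlane API reachable' : `Health check failed: ${res.status}`);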

Relay API

The core RelayPlane function routes your requests to AI models with optional optimization.

✨ New in v1.2.0: The SDK now automatically infers payload.model from the to field, eliminating the need to specify the model twice!

Basic Relay

import { relay } from '@relayplane/sdk';

const response = await relay({
  to: 'claude-sonnet-4-20250514',     // Target model (payload.model is inferred from this)
  payload: {                          // Model-specific payload  
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'Your prompt here' }
    ]
  },
  metadata: {                         // Optional tracking data
    user_id: 'user-123',
    session: 'session-456'
  }
});

console.log(response.relay_id);      // Unique request ID
console.log(response.latency_ms);    // Request latency
console.log(response.body);          // Model response

Supported Models

Anthropic (Latest)

  • claude-opus-4-20250514
  • claude-sonnet-4-20250514
  • claude-3-7-sonnet-20250219
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-20241022
  • + legacy models

OpenAI (Latest)

  • gpt-4.1
  • o4-mini
  • o3
  • o3-pro
  • o3-mini
  • gpt-4o
  • gpt-4o-mini
  • + legacy models

Google (Latest)

  • gemini-2.5-pro
  • gemini-2.5-flash
  • gemini-2.0-flash
  • gemini-1.5-pro
  • gemini-1.5-flash
  • + legacy models

Meta & Mistral

  • llama-4-maverick-17b-128e
  • llama-4-scout-17b-16e
  • llama-3.3-70b
  • mistral-large-2411
  • codestral-2501
  • + more models

Response Format

{
  "relay_id": "evt_123",           // Unique request identifier
  "status_code": 200,              // HTTP status from provider
  "latency_ms": 450,               // Total request latency
  "body": {...},                   // Provider response body
  "fallback_used": false           // Whether fallback was triggered
}
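
For TypeScript users, these fields can be captured in a small type. This is an illustrative sketch derived from the fields documented above, not an official type exported by the SDK:

// Illustrative shape of a relay response, based on the documented fields.
interface RelayResponse {
  relay_id: string;       // Unique request identifier, e.g. "evt_123"
  status_code: number;    // HTTP status returned by the provider
  latency_ms: number;     // Total request latency in milliseconds
  body: unknown;          // Provider-specific response body
  fallback_used: boolean; // Whether a fallback model handled the request
}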

Streaming

Stream responses token-by-token for real-time applications:

import { relay } from '@relayplane/sdk';

// Enable streaming in payload  
const response = await relay({
  to: 'claude-sonnet-4-20250514',
  payload: {
    max_tokens: 1000,
    stream: true,                    // Enable streaming
    messages: [
      { role: 'user', content: 'Write a poem about AI' }
    ]
  }
});

// Handle streaming response
for await (const chunk of response.stream) {
  process.stdout.write(chunk);
}
console.log('\nStreaming complete!');
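
As an alternative to printing each chunk, you can accumulate the stream into the full completion. A small variation on the loop above:

// Accumulate streamed chunks into the full completion text.
let fullText = '';
for await (const chunk of response.stream) {
  fullText += chunk;
}
console.log(`Received ${fullText.length} characters.`);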

HTTP Streaming

Use Server-Sent Events for web applications:

fetch('https://api.relayplane.com/api/relay', {
  method: 'POST',
  headers: {
    'x-api-key': 'your-relay-api-key',
    'Content-Type': 'application/json',
    'Accept': 'text/event-stream'
  },
  body: JSON.stringify({
    to: 'claude-sonnet-4-20250514',
    payload: { stream: true, /* ... */ }
  })
}).then(async (response) => {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  // Read streamed chunks until the connection closes
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(decoder.decode(value, { stream: true }));
  }
});

Error Handling

RelayPlane provides comprehensive error handling and retry logic with BYOK security:

Secure by Design

With BYOK architecture, your API keys are encrypted and stored securely. RelayPlane never sees your actual data - we only route requests and return responses.

import { relay, RelayError, RelayTimeoutError } from '@relayplane/sdk';

try {
  const response = await relay({
    to: 'claude-sonnet-4-20250514',
    payload: { /* ... */ }
  });
} catch (error) {
  if (error instanceof RelayTimeoutError) {
    console.log('Request timed out');
  } else if (error instanceof RelayError) {
    console.log('Relay error:', error.message);
    console.log('Error code:', error.code);
  } else {
    console.log('Unexpected error:', error);
  }
}

Common Error Codes

Code | Description  | Action
400  | Bad Request  | Check payload format
401  | Unauthorized | Verify API key
429  | Rate Limited | Implement exponential backoff
500  | Server Error | Retry with fallback
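
Following the table above, a 500 from the primary provider can be retried against a different model. This is a hand-rolled sketch that assumes error.code surfaces the numeric status code; the hosted fallback strategy shown earlier handles this for you automatically:

import { relay, RelayError } from '@relayplane/sdk';

// Manual fallback: retry on a provider server error, per the guidance above.
// Assumption: error.code carries the numeric status code from the table.
async function relayWithManualFallback(primary: string, backup: string, payload: object) {
  try {
    return await relay({ to: primary, payload });
  } catch (error) {
    if (error instanceof RelayError && error.code === 500) {
      return relay({ to: backup, payload });
    }
    throw error;
  }
}

const response = await relayWithManualFallback(
  'claude-sonnet-4-20250514',
  'gpt-4o',
  { max_tokens: 1000, messages: [{ role: 'user', content: 'Hello!' }] }
);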

Rate Limits

RelayPlane enforces rate limits based on your plan:

Plan       | Price      | Monthly Limit | Key Features
Developer  | $0/month   | 100K calls    | Basic routing, 1 provider key
Solo-Pro   | $29/month  | 500K calls    | Optimize™ & caching, 3 keys
Team       | $99/month  | 2M calls      | Team workspace, 10 keys
Growth     | $399/month | 20M calls     | SSO & analytics, 25 keys
Enterprise | $999/month | Unlimited     | SOC2 + Private deploy

BYOK Security

All plans include Bring Your Own Keys (BYOK) architecture. Your API keys and data stay with you - RelayPlane only routes requests and provides optimization.

Handling Rate Limits

import { relay, RelayRateLimitError } from '@relayplane/sdk';

async function relayWithBackoff(request, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await relay(request);
    } catch (error) {
      if (error instanceof RelayRateLimitError && attempt < maxRetries) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
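
The helper above can wrap any relay request, for example:

// Use the backoff helper in place of a direct relay() call.
const response = await relayWithBackoff({
  to: 'claude-sonnet-4-20250514',
  payload: {
    max_tokens: 1000,
    messages: [{ role: 'user', content: 'Hello!' }]
  }
});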

Logging

RelayPlane provides comprehensive logging and observability:

Request Logs

Every relay call is automatically logged with:

  • Unique request ID
  • Model used and latency
  • Fallback events (if applicable)
  • Cost estimates
  • Custom metadata

Dashboard

View request logs in the RelayPlane dashboard at https://relayplane.com/dashboard.

Next Steps

Now that you understand the core concepts, you're ready to integrate RelayPlane into your application and explore the hosted optimization features.

Need help?

If you have any questions or run into issues, check out the resources at https://relayplane.com.