Introduction

v0.1.0

RelayPlane is a production-grade AI model routing and optimization platform. Route calls across Claude, GPT-4, Gemini, and custom models with automatic fallback, cost optimization, and comprehensive observability.

Why RelayPlane?

Relay Optimize™

Intelligent fallback, latency routing, and cost ceiling controls with sub-100ms overhead.

Unified SDK

Single interface for all major AI providers. Works locally or hosted with one config change.

Open Source Core

SDK works without signup. Add RELAY_API_KEY for hosted optimization features.

Production Ready

Enterprise-grade logging, usage metering, and comprehensive error handling.

Use Cases

RelayPlane enables a wide range of applications where AI model reliability and cost control matter:

  • Fallback systems - Automatically retry failed calls with different models
  • Cost optimization - Route to the cheapest model that meets quality requirements
  • Multi-agent workflows - Chain specialized models together for complex tasks
  • A/B testing - Compare model performance across providers

Quickstart

Get started with RelayPlane in minutes. This guide covers installation, basic usage, and upgrading to hosted features.

Prerequisites

  • Node.js 16 or higher
  • API keys for the AI models you want to use (Anthropic, OpenAI, etc.)
  • Optional: RelayPlane API key for hosted features

Installation

Install the RelayPlane SDK using npm:

npm install @relayplane/sdk

Or using yarn:

yarn add @relayplane/sdk

Basic Usage

Here's a simple example using RelayPlane to call Claude:

import { relay } from '@relayplane/sdk';

// Local mode - works without RelayPlane API key
const response = await relay({
  to: 'claude-3-sonnet',
  payload: {
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'What is the capital of France?' }
    ]
  }
});

console.log(response.body.content[0].text);
// → Paris is the capital of France.

Built-in Examples & Smart Features

RelayPlane provides intelligent features even with BYOK:

import RelayPlane from '@relayplane/sdk';

// Zero-config magic - automatically picks best model
const result = await RelayPlane.ask("Explain quantum computing simply");
console.log(result.response.body);
console.log(result.reasoning.rationale); // See why this model was chosen

// Built-in examples for common tasks
const summary = await RelayPlane.examples.summarize(longText, { length: 'brief' });
const translation = await RelayPlane.examples.translate('Hello world', 'Spanish');
const review = await RelayPlane.examples.codeReview(myCode, { focus: 'security' });

// Smart retry with automatic model switching
const response = await RelayPlane.smartRetry({
  to: 'claude-3-7-sonnet-20250219',
  payload: { messages: [{ role: 'user', content: 'Complex task' }] }
}, {
  maxRetries: 3,
  confidenceThreshold: 0.9
});

Authentication

RelayPlane uses your existing AI provider API keys directly (Bring Your Own Keys - BYOK):

Provider API Keys Setup

Set up your provider API keys in environment variables:

# Set provider API keys in environment (choose what you need)
export ANTHROPIC_API_KEY="sk-ant-..."     # For Claude models
export OPENAI_API_KEY="sk-..."            # For GPT models
export GOOGLE_API_KEY="AIza..."           # For Gemini models

Use relay with your provider keys:

import { relay } from '@relayplane/sdk';

const response = await relay({
  to: 'claude-3-7-sonnet-20250219',
  payload: { messages: [{ role: 'user', content: 'Hello!' }] }
});

Bring Your Own Keys (BYOK)

RelayPlane uses a BYOK architecture - you bring your own provider keys (OpenAI, Anthropic, etc.) and RelayPlane provides intelligent routing and optimization. Your data never passes through our systems unnecessarily.

Benefits: Full control over your API keys, direct billing with providers, maximum security and transparency, zero vendor lock-in.

Platform Features: Sign up at RelayPlane Dashboard for analytics, secure key management, and team collaboration features.

Getting Provider API Keys

Anthropic (Claude)

Visit console.anthropic.com

API Keys → Create Key → Copy (sk-ant-...)

OpenAI (GPT)

Visit platform.openai.com

API Keys → Create secret key → Copy (sk-...)

Google (Gemini)

Visit makersuite.google.com

Get API Key → Create/use existing → Copy (AIza...)
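Once the keys are exported, a quick startup check can catch a missing key before the first request fails. The sketch below is an illustration only: `missingProviderKeys` and the `PROVIDER_ENV_VARS` map are hypothetical names, not part of the SDK; the environment variable names are the ones from the export lines above.

```javascript
// Hypothetical startup check: report which providers have no key configured.
// The provider labels (anthropic, openai, google) are illustrative.
const PROVIDER_ENV_VARS = {
  anthropic: 'ANTHROPIC_API_KEY',
  openai: 'OPENAI_API_KEY',
  google: 'GOOGLE_API_KEY',
};

function missingProviderKeys(providers, env = process.env) {
  return providers.filter((provider) => {
    const name = PROVIDER_ENV_VARS[provider];
    if (!name) throw new Error(`Unknown provider: ${provider}`);
    const value = env[name];
    // Treat unset and blank values the same way.
    return typeof value !== 'string' || value.trim() === '';
  });
}

// Example: fail fast if the Claude key is not configured.
// const missing = missingProviderKeys(['anthropic']);
// if (missing.length > 0) throw new Error(`Missing keys for: ${missing.join(', ')}`);
```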


Relay API

The core RelayPlane function routes your requests to AI models with optional optimization.

✨ New in v1.2.0: The SDK now automatically infers payload.model from the to field, eliminating the need to specify the model twice!
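Conceptually, the inference described in this note amounts to copying `to` into `payload.model` when the payload does not already set it. The function below is a sketch under that assumption; `withInferredModel` is a hypothetical name, and the SDK's real logic (for example, how it resolves simplified aliases) may differ.

```javascript
// Hypothetical illustration of inferring payload.model from `to`.
// An explicitly set payload.model is left untouched.
function withInferredModel(request) {
  const payload = request.payload ?? {};
  if (payload.model !== undefined) return request;
  // Return a copy so the caller's request object is not mutated.
  return { ...request, payload: { ...payload, model: request.to } };
}
```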

Basic Relay

import { relay } from '@relayplane/sdk';

const response = await relay({
  to: 'claude-3-7-sonnet-20250219',     // Target model (auto-inferred to payload)
  payload: {                          // Model-specific payload  
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'Your prompt here' }
    ]
  },
  metadata: {                         // Optional tracking data
    user_id: 'user-123',
    session: 'session-456'
  }
});

console.log(response.relay_id);      // Unique request ID
console.log(response.latency_ms);    // Request latency
console.log(response.body);          // Model response

Supported Models

Anthropic Claude

  • claude-3-7-sonnet-20250219
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-20241022
  • claude-3-opus-20240229
  • + simplified aliases

OpenAI GPT

  • gpt-4.1, gpt-4.1-mini
  • o3, o3-pro, o3-mini
  • o4-mini
  • gpt-4o, gpt-4o-mini
  • o1-preview, o1-mini
  • gpt-4, gpt-4-turbo
  • + legacy models

Google Gemini

  • gemini-2.5-pro
  • gemini-2.5-flash
  • gemini-2.0-flash
  • gemini-1.5-pro
  • gemini-1.5-flash
  • gemini-pro
  • + vision models

Response Format

{
  "relay_id": "evt_123",           // Unique request identifier
  "status_code": 200,              // HTTP status from provider
  "latency_ms": 450,               // Total request latency
  "body": {...},                   // Provider response body
  "fallback_used": false           // Whether fallback was triggered
}
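A small helper can turn these fields into a one-line summary for console output or alerts. This is a sketch only: `summarizeRelayResponse` is a hypothetical name, and it assumes a response object shaped exactly like the format shown above.

```javascript
// Hypothetical helper: summarize a relay response in one line.
// Field names follow the response format shown above.
function summarizeRelayResponse(response) {
  const ok = response.status_code >= 200 && response.status_code < 300;
  const parts = [
    response.relay_id,
    ok ? 'ok' : `error ${response.status_code}`,
    `${response.latency_ms}ms`,
  ];
  // Only mention fallback when it was actually triggered.
  if (response.fallback_used) parts.push('fallback');
  return { ok, summary: parts.join(' | ') };
}
```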

Streaming

Stream responses token-by-token for real-time applications:

import { relay } from '@relayplane/sdk';

// Enable streaming in payload  
const response = await relay({
  to: 'claude-3-7-sonnet-20250219',
  payload: {
    max_tokens: 1000,
    stream: true,                    // Enable streaming
    messages: [
      { role: 'user', content: 'Write a poem about AI' }
    ]
  }
});

// Handle streaming response
for await (const chunk of response.stream) {
  process.stdout.write(chunk);
}
console.log('\nStreaming complete!');

HTTP Streaming

Use Server-Sent Events for web applications:

fetch('https://api.relayplane.com/api/relay', {
  method: 'POST',
  headers: {
    'x-api-key': 'your-relay-api-key',
    'Content-Type': 'application/json',
    'Accept': 'text/event-stream'
  },
  body: JSON.stringify({
    to: 'claude-3-7-sonnet-20250219',
    payload: { stream: true, /* ... */ }
  })
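}).then(async (response) => {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  // Read and print raw chunks until the stream ends
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(decoder.decode(value, { stream: true }));
  }
});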
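Raw chunks from a Server-Sent Events response do not line up with event boundaries, so a client usually buffers text and splits it on blank lines. The sketch below parses the standard `data:` field of the SSE wire format; `parseSseBuffer` is a hypothetical name, and whether the RelayPlane endpoint emits events in exactly this shape, or what each event's payload contains, is an assumption not confirmed here.

```javascript
// Minimal Server-Sent Events parser: splits a text buffer into complete events
// and returns any trailing partial event so it can be prepended to the next chunk.
function parseSseBuffer(buffer) {
  const normalized = buffer.replace(/\r\n/g, '\n');
  const blocks = normalized.split('\n\n');
  const rest = blocks.pop(); // the last block may be incomplete
  const events = [];
  for (const block of blocks) {
    const dataLines = [];
    for (const line of block.split('\n')) {
      if (line.startsWith('data:')) {
        // The SSE format strips one optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ''));
      }
    }
    // Multiple data lines in one event are joined with newlines.
    if (dataLines.length > 0) events.push(dataLines.join('\n'));
  }
  return { events, rest };
}
```

In a read loop, append each decoded chunk to the previous `rest`, call `parseSseBuffer` on the result, and keep the new `rest` for the next iteration.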

Error Handling

RelayPlane provides comprehensive error handling and retry logic with BYOK security:

Secure by Design

With BYOK architecture, your API keys are encrypted and stored securely. RelayPlane never sees your actual data - we only route requests and return responses.

import { relay, RelayError, RelayTimeoutError } from '@relayplane/sdk';

try {
  const response = await relay({
    to: 'claude-3-7-sonnet-20250219',
    payload: { /* ... */ }
  });
} catch (error) {
  if (error instanceof RelayTimeoutError) {
    console.log('Request timed out');
  } else if (error instanceof RelayError) {
    console.log('Relay error:', error.message);
    console.log('Error code:', error.code);
  } else {
    console.log('Unexpected error:', error);
  }
}

Common Error Codes

Code   Description    Action
400    Bad Request    Check payload format
401    Unauthorized   Verify API key
429    Rate Limited   Implement exponential backoff
500    Server Error   Retry with fallback
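The table above can be encoded as a small lookup that tells calling code whether a retry is worthwhile. This is a sketch: `classifyStatus` is a hypothetical name, and treating only 429 and 500 as retryable is an assumption drawn from the "Action" column, not documented SDK behavior.

```javascript
// Hypothetical mapping from the status codes in the table above to a client-side decision.
function classifyStatus(statusCode) {
  switch (statusCode) {
    case 400:
      return { retry: false, action: 'Check payload format' };
    case 401:
      return { retry: false, action: 'Verify API key' };
    case 429:
      return { retry: true, action: 'Implement exponential backoff' };
    case 500:
      return { retry: true, action: 'Retry with fallback' };
    default:
      // Codes not listed in the table are left for the caller to inspect.
      return { retry: false, action: 'Unlisted status; inspect the response' };
  }
}
```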

Rate Limits

RelayPlane enforces rate limits based on your plan:

Plan        Price       Monthly Limit   Key Features
Developer   $0/month    100K calls      BYOK only, 3 provider keys, 7-day logs
Solo-Pro    $29/month   1M calls        Relay Optimize™, 5 keys, 30-day logs
Team        $99/month   5M calls        RBAC, 10 keys, A2A connectors
Growth      $399/month  20M calls       Workflow versioning, 25 keys, private marketplace
Enterprise  Custom      Custom          VPC/on-prem, policy engine, 99.9% SLA

BYOK Security

All plans include Bring Your Own Keys (BYOK) architecture. Your API keys and data stay with you - RelayPlane only routes requests and provides optimization.
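If you track your own call counts, the monthly limits from the plan table above can be checked client-side before a request is sent. This is only a sketch: `MONTHLY_CALL_LIMITS` and `remainingCalls` are hypothetical names, and how RelayPlane itself counts or resets usage is not specified here.

```javascript
// Monthly call limits copied from the plan table (Enterprise is custom, so it is omitted).
const MONTHLY_CALL_LIMITS = {
  'Developer': 100_000,
  'Solo-Pro': 1_000_000,
  'Team': 5_000_000,
  'Growth': 20_000_000,
};

// Hypothetical helper: how many calls remain this month for a given plan.
function remainingCalls(plan, callsUsed) {
  const limit = MONTHLY_CALL_LIMITS[plan];
  if (limit === undefined) throw new Error(`No fixed limit known for plan: ${plan}`);
  // Never report a negative remainder once the limit is exceeded.
  return Math.max(0, limit - callsUsed);
}
```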

Handling Rate Limits

import { relay, RelayRateLimitError } from '@relayplane/sdk';

async function relayWithBackoff(request, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await relay(request);
    } catch (error) {
      if (error instanceof RelayRateLimitError && attempt < maxRetries) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
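A fixed 2^attempt schedule makes many clients retry at the same moments after a shared rate-limit event. A common refinement is to cap the delay and add random jitter. The sketch below shows one such calculation ("full jitter"); `backoffDelayMs` and its option names are hypothetical, and the default base and cap values are illustrative rather than recommended by RelayPlane.

```javascript
// Hypothetical delay calculation: exponential growth, an upper cap, and
// "full jitter" (a random delay between 0 and the capped value).
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const exponential = baseMs * Math.pow(2, attempt);
  const capped = Math.min(capMs, exponential);
  // `random` is injectable so the function can be tested deterministically.
  return Math.floor(random() * capped);
}
```

To try it in the function above, replace the `delay` computation with `backoffDelayMs(attempt)`.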

Advanced Features

RelayPlane includes advanced features for enterprise and power users:

Agent-to-Agent (A2A) Communication

Enable agents to communicate with each other for complex multi-step workflows:

import { relay } from '@relayplane/sdk';

// Agent workflow with A2A communication
const workflow = await relay({
  to: 'claude-3-7-sonnet-20250219',
  payload: {
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'Analyze this data and pass results to the summarizer agent' }
    ]
  },
  metadata: {
    workflow_id: 'analysis-pipeline',
    next_agent: 'summarizer-v2'  // A2A routing
  }
});

Global Optimization Settings

Configure advanced optimization settings across your entire application:

import RelayPlane from '@relayplane/sdk';

const relayplane = new RelayPlane({
  apiKey: 'your-api-key',
  globalOptimize: {
    cacheStrategy: {
      type: 'hybrid',
      maxSize: '500MB',
      evictionPolicy: 'lru',
      compression: true
    },
    batchProcessing: {
      enabled: true,
      defaultConcurrency: 5,
      maxBatchSize: 100
    },
    timeoutSettings: {
      requestTimeoutMs: 30000,
      connectionTimeoutMs: 5000,
      retryTimeoutMs: 60000
    }
  }
});
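For readers unfamiliar with the `evictionPolicy: 'lru'` setting, the sketch below shows what least-recently-used eviction means in general. It is a generic illustration, not RelayPlane's cache: `LruCache` is a hypothetical class, and it bounds the cache by entry count, whereas the configuration above expresses `maxSize` in bytes.

```javascript
// Minimal LRU cache built on Map, which iterates keys in insertion order.
class LruCache {
  constructor(maxEntries) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError('maxEntries must be a positive integer');
    }
    this.maxEntries = maxEntries;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```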

CLI Power User Commands

Use the RelayPlane CLI for powerful one-liner operations:

# Transform data files with AI operations
npx @relayplane/cli transform -i data.csv -o "clean,analyze,summarize"

# Intelligent code review
npx @relayplane/cli review -f src/ -a security,performance,style

# Batch process multiple files  
npx @relayplane/cli batch -p "*.md" -o translate -c 3

# Execute complex workflows
npx @relayplane/cli workflow -t data-pipeline -i input.json

Enterprise Features

Availability of these advanced features depends on your plan:

  • A2A Communication: Available on Team+ plans
  • Global Optimization: Available on Solo-Pro+ plans
  • CLI Power Commands: Available on all plans
  • MCP Protocol Support: Available on Growth+ plans

Logging

RelayPlane provides comprehensive logging and observability:

Request Logs

Every relay call is automatically logged with:

  • Unique request ID
  • Model used and latency
  • Fallback events (if applicable)
  • Cost estimates
  • Custom metadata
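As an illustration of how the fields listed above might fit together in one record, the sketch below builds a log entry from a relay response. Everything here is an assumption: `buildLogEntry`, the `usage` shape, and the cost formula (token counts multiplied by caller-supplied per-million-token prices) are hypothetical, and RelayPlane's actual log schema is not documented in this section.

```javascript
// Hypothetical log record combining the fields listed above.
// Prices are supplied by the caller; none are built in.
function buildLogEntry({ response, model, usage, pricePerMillionTokens, metadata = {} }) {
  const estimatedCostUsd =
    (usage.inputTokens * pricePerMillionTokens.input +
      usage.outputTokens * pricePerMillionTokens.output) / 1_000_000;
  return {
    relay_id: response.relay_id,
    model,
    latency_ms: response.latency_ms,
    fallback_used: Boolean(response.fallback_used),
    estimated_cost_usd: estimatedCostUsd,
    metadata,
  };
}
```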

Dashboard

View logs in the RelayPlane dashboard:

Next Steps

Now that you understand the core concepts, you can:

Need help?

If you have any questions or run into issues, check out our resources: