Introduction
RelayPlane is a production-grade AI model routing and optimization platform. Route calls across Claude, GPT-4, Gemini, and custom models with automatic fallback, cost optimization, and comprehensive observability.
Why RelayPlane?
Relay Optimize™
Intelligent fallback, latency routing, and cost ceiling controls with sub-100ms overhead.
Unified SDK
Single interface for all major AI providers. Works locally or hosted with one config change.
Open Source Core
SDK works without signup. Add RELAY_API_KEY for hosted optimization features.
Production Ready
Enterprise-grade logging, usage metering, and comprehensive error handling.
Use Cases
RelayPlane enables a wide range of applications where AI model reliability and cost control matter:
- Fallback systems - Automatically retry failed calls with different models
- Cost optimization - Route to the cheapest model that meets quality requirements
- Multi-agent workflows - Chain specialized models together for complex tasks
- A/B testing - Compare model performance across providers
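The fallback use case can be pictured with a minimal sketch. This is not RelayPlane's implementation: `callWithFallback` and its `send` parameter are hypothetical names, and `send` stands in for whatever function actually performs the call (for example a thin wrapper around `relay`).

```javascript
// Try each model in order and return the first successful result.
// `send` is a caller-supplied async function (hypothetical); in real code it
// could wrap a call such as relay({ to: model, payload }).
async function callWithFallback(models, payload, send) {
  const errors = [];
  for (const model of models) {
    try {
      const response = await send({ to: model, payload });
      return { model, response, attempts: errors.length + 1 };
    } catch (error) {
      errors.push({ model, error }); // remember the failure, try the next model
    }
  }
  const failure = new Error(`All ${models.length} models failed`);
  failure.attempts = errors; // expose per-model errors to the caller
  throw failure;
}
```

Passing `send` in as a parameter keeps the ordering logic independent of any one provider and makes it easy to exercise with a stub.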
Quickstart
Get started with RelayPlane in minutes. This guide covers installation, basic usage, and upgrading to hosted features.
Prerequisites
- Node.js 16 or higher
- API keys for the AI models you want to use (Anthropic, OpenAI, etc.)
- Optional: RelayPlane API key for hosted features
Installation
Install the RelayPlane SDK using npm:
npm install @relayplane/sdk
Or using yarn:
yarn add @relayplane/sdk
Basic Usage
Here's a simple example using RelayPlane to call Claude:
import { relay } from '@relayplane/sdk';
// Local mode - works without RelayPlane API key
const response = await relay({
  to: 'claude-3-sonnet',
  payload: {
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'What is the capital of France?' }
    ]
  }
});
console.log(response.body.content[0].text);
// → Paris is the capital of France.
Built-in Examples & Smart Features
RelayPlane provides intelligent features even with BYOK:
import RelayPlane from '@relayplane/sdk';
// Zero-config magic - automatically picks best model
const result = await RelayPlane.ask("Explain quantum computing simply");
console.log(result.response.body);
console.log(result.reasoning.rationale); // See why this model was chosen
// Built-in examples for common tasks
const summary = await RelayPlane.examples.summarize(longText, { length: 'brief' });
const translation = await RelayPlane.examples.translate('Hello world', 'Spanish');
const review = await RelayPlane.examples.codeReview(myCode, { focus: 'security' });
// Smart retry with automatic model switching
const response = await RelayPlane.smartRetry({
  to: 'claude-3-7-sonnet-20250219',
  payload: { messages: [{ role: 'user', content: 'Complex task' }] }
}, {
  maxRetries: 3,
  confidenceThreshold: 0.9
});
Authentication
RelayPlane uses your existing AI provider API keys directly (Bring Your Own Keys - BYOK):
Provider API Keys Setup
Set up your provider API keys in environment variables:
# Set provider API keys in environment (choose what you need)
export ANTHROPIC_API_KEY="sk-ant-..." # For Claude models
export OPENAI_API_KEY="sk-..." # For GPT models
export GOOGLE_API_KEY="AIza..." # For Gemini models
// Use relay with your provider keys
import { relay } from '@relayplane/sdk';
const response = await relay({
  to: 'claude-3-7-sonnet-20250219',
  payload: { messages: [{ role: 'user', content: 'Hello!' }] }
});
RelayPlane uses a BYOK architecture - you bring your own provider keys (OpenAI, Anthropic, etc.) and RelayPlane provides intelligent routing and optimization. Your data never passes through our systems unnecessarily.
Benefits: Full control over your API keys, direct billing with providers, maximum security and transparency, zero vendor lock-in.
Platform Features: Sign up at RelayPlane Dashboard for analytics, secure key management, and team collaboration features.
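Since the provider keys are read from environment variables, a quick check at startup can catch a missing key before the first request. The sketch below uses the variable names from the setup above; the helper `missingProviderKeys` and the provider labels are illustrative and not part of the SDK.

```javascript
// Environment variable names are taken from the setup above.
const PROVIDER_ENV_VARS = {
  anthropic: 'ANTHROPIC_API_KEY',
  openai: 'OPENAI_API_KEY',
  google: 'GOOGLE_API_KEY',
};

// Return the env var names that are unset or empty for the given providers.
// `env` defaults to process.env but can be replaced for testing.
function missingProviderKeys(providers, env = process.env) {
  const missing = [];
  for (const provider of providers) {
    const name = PROVIDER_ENV_VARS[provider];
    if (!name) throw new Error(`Unknown provider: ${provider}`);
    if (!env[name]) missing.push(name);
  }
  return missing;
}
```

A typical use is to call `missingProviderKeys(['anthropic', 'openai'])` once at startup and stop with a clear message if the returned list is not empty.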
Getting Provider API Keys
Anthropic (Claude)
Visit console.anthropic.com
API Keys → Create Key → Copy (sk-ant-...)
OpenAI (GPT)
Visit platform.openai.com
API Keys → Create secret key → Copy (sk-...)
Google (Gemini)
Visit makersuite.google.com
Get API Key → Create/use existing → Copy (AIza...)
Relay API
The core RelayPlane function routes your requests to AI models with optional optimization.
✨ New in v1.2.0: The SDK now automatically infers payload.model from the 'to' field, eliminating the need to specify the model twice!
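The inference can be pictured as a small normalization step applied to the request before it is sent. This is a sketch of the idea only, not the SDK's code: `withInferredModel` is a hypothetical name, it assumes an explicit payload.model takes precedence, and it does not attempt to resolve aliases.

```javascript
// Copy `to` into payload.model when the payload does not name a model.
// Assumption: an explicit payload.model is left untouched.
function withInferredModel(request) {
  const payload = request.payload ?? {};
  if (payload.model !== undefined) return request;
  // Return a new object rather than mutating the caller's request.
  return { ...request, payload: { ...payload, model: request.to } };
}
```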
Basic Relay
import { relay } from '@relayplane/sdk';
const response = await relay({
  to: 'claude-3-7-sonnet-20250219', // Target model (auto-inferred to payload)
  payload: {                        // Model-specific payload
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'Your prompt here' }
    ]
  },
  metadata: {                       // Optional tracking data
    user_id: 'user-123',
    session: 'session-456'
  }
});
console.log(response.relay_id); // Unique request ID
console.log(response.latency_ms); // Request latency
console.log(response.body); // Model response
Supported Models
Anthropic Claude
claude-3-7-sonnet-20250219
claude-3-5-sonnet-20241022
claude-3-5-haiku-20241022
claude-3-opus-20240229
+ simplified aliases
OpenAI GPT
gpt-4.1, gpt-4.1-mini
o3, o3-pro, o3-mini
o4-mini
gpt-4o, gpt-4o-mini
o1-preview, o1-mini
gpt-4, gpt-4-turbo
+ legacy models
Google Gemini
gemini-2.5-pro
gemini-2.5-flash
gemini-2.0-flash
gemini-1.5-pro
gemini-1.5-flash
gemini-pro
+ vision models
Response Format
{
  "relay_id": "evt_123",    // Unique request identifier
  "status_code": 200,       // HTTP status from provider
  "latency_ms": 450,        // Total request latency
  "body": {...},            // Provider response body
  "fallback_used": false    // Whether fallback was triggered
}
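As a rough illustration of how these fields might be consumed, the helper below turns a response into a one-line summary. The field names come from the format above; the function name `describeRelayResponse`, the 2xx success check, and the summary wording are assumptions made for this example.

```javascript
// Summarize a relay response using the documented top-level fields.
function describeRelayResponse(response) {
  // Treat any 2xx provider status as success (an assumption of this sketch).
  const ok = response.status_code >= 200 && response.status_code < 300;
  const parts = [
    `relay_id=${response.relay_id}`,
    `status=${response.status_code}`,
    `latency=${response.latency_ms}ms`,
  ];
  if (response.fallback_used) parts.push('fallback=yes');
  return { ok, summary: parts.join(' ') };
}
```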
Streaming
Stream responses token-by-token for real-time applications:
import { relay } from '@relayplane/sdk';
// Enable streaming in payload
const response = await relay({
  to: 'claude-3-7-sonnet-20250219',
  payload: {
    max_tokens: 1000,
    stream: true, // Enable streaming
    messages: [
      { role: 'user', content: 'Write a poem about AI' }
    ]
  }
});
// Handle streaming response
for await (const chunk of response.stream) {
  process.stdout.write(chunk);
}
console.log('\nStreaming complete!');
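If you also need the complete text once streaming finishes, the chunks can be accumulated while they are forwarded. The sketch below works with any async iterable; `collectStream` is a hypothetical helper, and it assumes each chunk is a string (or converts cleanly to one), which this document does not specify.

```javascript
// Forward each chunk to `onChunk` and return the concatenated text.
async function collectStream(stream, onChunk = () => {}) {
  let text = '';
  for await (const chunk of stream) {
    const piece = String(chunk); // assumes chunks are text
    onChunk(piece);
    text += piece;
  }
  return text;
}
```

With the example above, this could be called as `await collectStream(response.stream, (piece) => process.stdout.write(piece))`.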
HTTP Streaming
Use Server-Sent Events for web applications:
fetch('https://api.relayplane.com/api/relay', {
  method: 'POST',
  headers: {
    'x-api-key': 'your-relay-api-key',
    'Content-Type': 'application/json',
    'Accept': 'text/event-stream'
  },
  body: JSON.stringify({
    to: 'claude-3-7-sonnet-20250219',
    payload: { stream: true, /* ... */ }
  })
}).then(response => {
  const reader = response.body.getReader();
  // Handle streamed chunks
});
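Chunks read from the response body do not necessarily line up with event boundaries, so the text has to be buffered and split. The parser below is a minimal sketch of generic Server-Sent Events handling (events separated by a blank line, payload carried on `data:` lines). `createSseParser` is a hypothetical helper, and the shape of the payload RelayPlane puts inside each event is not specified in this document.

```javascript
// Incremental SSE parser: feed it decoded text, get one callback per event.
function createSseParser(onData) {
  let buffer = '';
  return function push(text) {
    // Normalize CRLF after appending so a split "\r\n" pair is still handled.
    buffer = (buffer + text).replace(/\r\n/g, '\n');
    let boundary;
    while ((boundary = buffer.indexOf('\n\n')) !== -1) {
      const rawEvent = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      const data = rawEvent
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).replace(/^ /, '')) // drop "data:" and one optional space
        .join('\n');
      if (data !== '') onData(data); // skip comment-only or empty events
    }
  };
}
```

Inside a read loop, each `value` returned by `reader.read()` would be decoded with `new TextDecoder().decode(value, { stream: true })` and passed to `push`.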
Error Handling
With BYOK architecture, your API keys are encrypted and stored securely. RelayPlane never sees your actual data - we only route requests and return responses.
RelayPlane provides comprehensive error handling and retry logic. The SDK exports typed errors so you can tell timeouts apart from other relay failures:
import { relay, RelayError, RelayTimeoutError } from '@relayplane/sdk';
try {
  const response = await relay({
    to: 'claude-3-7-sonnet-20250219',
    payload: { /* ... */ }
  });
} catch (error) {
  if (error instanceof RelayTimeoutError) {
    console.log('Request timed out');
  } else if (error instanceof RelayError) {
    console.log('Relay error:', error.message);
    console.log('Error code:', error.code);
  } else {
    console.log('Unexpected error:', error);
  }
}
Common Error Codes
Code | Description | Action |
---|---|---|
400 | Bad Request | Check payload format |
401 | Unauthorized | Verify API key |
429 | Rate Limited | Implement exponential backoff |
500 | Server Error | Retry with fallback |
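The table can be encoded as a small lookup so that error handling stays in one place. In this sketch the codes and action text mirror the table; the `retry` flag is an interpretation added for illustration (429 and 500 are treated as retryable), and whether `error.code` carries the HTTP status is an assumption.

```javascript
// Status codes and actions mirror the table above; `retry` is illustrative.
const ERROR_ACTIONS = {
  400: { retry: false, action: 'Check payload format' },
  401: { retry: false, action: 'Verify API key' },
  429: { retry: true, action: 'Implement exponential backoff' },
  500: { retry: true, action: 'Retry with fallback' },
};

// Look up the suggested action, with a conservative default for other codes.
function actionForStatus(statusCode) {
  return ERROR_ACTIONS[statusCode] ?? { retry: false, action: 'Unhandled status; inspect the error' };
}
```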
Rate Limits
RelayPlane enforces rate limits based on your plan:
Plan | Price | Monthly Limit | Key Features |
---|---|---|---|
Developer | $0/month | 100K calls | BYOK only, 3 provider keys, 7-day logs |
Solo-Pro | $29/month | 1M calls | Relay Optimize™, 5 keys, 30-day logs |
Team | $99/month | 5M calls | RBAC, 10 keys, A2A connectors |
Growth | $399/month | 20M calls | Workflow versioning, 25 keys, private marketplace |
Enterprise | Custom | Custom | VPC/on-prem, policy engine, 99.9% SLA |
All plans include Bring Your Own Keys (BYOK) architecture. Your API keys and data stay with you - RelayPlane only routes requests and provides optimization.
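To compare expected traffic against the limits in the table, the figures can be written down as data. The numbers below are copied from the table; the helper name `smallestPlanFor` is hypothetical, and Enterprise is used as the fallback because its limit is listed as custom.

```javascript
// Monthly limits and prices copied from the plan table above.
const PLANS = [
  { name: 'Developer', pricePerMonth: 0, monthlyCalls: 100_000 },
  { name: 'Solo-Pro', pricePerMonth: 29, monthlyCalls: 1_000_000 },
  { name: 'Team', pricePerMonth: 99, monthlyCalls: 5_000_000 },
  { name: 'Growth', pricePerMonth: 399, monthlyCalls: 20_000_000 },
];

// Return the first (smallest) plan whose monthly limit covers the expected volume.
function smallestPlanFor(expectedMonthlyCalls) {
  const plan = PLANS.find((p) => expectedMonthlyCalls <= p.monthlyCalls);
  return plan ? plan.name : 'Enterprise';
}
```

For example, an application expecting 250,000 calls per month exceeds the Developer limit and fits within Solo-Pro. This only considers call volume; features such as RBAC or log retention may call for a higher tier.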
Handling Rate Limits
import { relay, RelayRateLimitError } from '@relayplane/sdk';
async function relayWithBackoff(request, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await relay(request);
    } catch (error) {
      if (error instanceof RelayRateLimitError && attempt < maxRetries) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
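With the default of three attempts, the function above waits 2 seconds and then 4 seconds before giving up. When many clients hit a limit at once, fixed delays can make them retry in lockstep. A common variation, shown here as an optional sketch and not something the SDK is documented to do, caps the delay and adds random jitter. The name `backoffDelayMs` and its options are hypothetical.

```javascript
// Delay before retry number `attempt` (1-based): baseMs * 2^attempt, capped at capMs,
// then scaled by a random factor ("full jitter") to spread retries out.
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * exponential);
}
```

To use it, the `delay` computed in the loop above would be replaced with `backoffDelayMs(attempt)`. The `random` option exists so the function can be tested deterministically.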
Advanced Features
RelayPlane includes advanced features for enterprise and power users:
Agent-to-Agent (A2A) Communication
Enable agents to communicate with each other for complex multi-step workflows:
import { relay } from '@relayplane/sdk';
// Agent workflow with A2A communication
const workflow = await relay({
  to: 'claude-3-7-sonnet-20250219',
  payload: {
    max_tokens: 1000,
    messages: [
      { role: 'user', content: 'Analyze this data and pass results to the summarizer agent' }
    ]
  },
  metadata: {
    workflow_id: 'analysis-pipeline',
    next_agent: 'summarizer-v2' // A2A routing
  }
});
Global Optimization Settings
Configure advanced optimization settings across your entire application:
import RelayPlane from '@relayplane/sdk';
const relayplane = new RelayPlane({
  apiKey: 'your-api-key',
  globalOptimize: {
    cacheStrategy: {
      type: 'hybrid',
      maxSize: '500MB',
      evictionPolicy: 'lru',
      compression: true
    },
    batchProcessing: {
      enabled: true,
      defaultConcurrency: 5,
      maxBatchSize: 100
    },
    timeoutSettings: {
      requestTimeoutMs: 30000,
      connectionTimeoutMs: 5000,
      retryTimeoutMs: 60000
    }
  }
});
CLI Power User Commands
Use the RelayPlane CLI for powerful one-liner operations:
# Transform data files with AI operations
npx @relayplane/cli transform -i data.csv -o "clean,analyze,summarize"
# Intelligent code review
npx @relayplane/cli review -f src/ -a security,performance,style
# Batch process multiple files
npx @relayplane/cli batch -p "*.md" -o translate -c 3
# Execute complex workflows
npx @relayplane/cli workflow -t data-pipeline -i input.json
Enterprise Features
Availability of advanced features depends on your plan:
- A2A Communication: Available on Team+ plans
- Global Optimization: Available on Solo-Pro+ plans
- CLI Power Commands: Available on all plans
- MCP Protocol Support: Available on Growth+ plans
Logging
RelayPlane provides comprehensive logging and observability:
Request Logs
Every relay call is automatically logged with:
- Unique request ID
- Model used and latency
- Fallback events (if applicable)
- Cost estimates
- Custom metadata
Dashboard
View logs in the RelayPlane dashboard:
Access your logs: https://relayplane.com/dashboard
Next Steps
Now that you understand the core concepts, you can:
- Learn the JavaScript/TypeScript SDK
- Explore the complete API reference
- Get help with troubleshooting
- Try the interactive playground
- Get your API key and start building
Need help?
If you have any questions or run into issues, check out our resources: