Providers Guide
Configure and integrate multiple AI providers in your RelayPlane workflows. Mix and match models from OpenAI, Anthropic, Google, xAI, and local deployments.
OpenAI
OpenAI provides the GPT family of models, including the powerful GPT-4o series. These models excel at general-purpose tasks, reasoning, and multimodal capabilities.
Available Models
- `gpt-4o` - Latest flagship model with vision, best overall performance
- `gpt-4o-mini` - Cost-effective, fast responses for simpler tasks
- `gpt-4o-vision` - Optimized for image analysis and visual reasoning
Configuration
```typescript
import { relay } from '@relayplane/sdk'

const workflow = relay
  .workflow('openai-example')
  .step('analyze', {
    systemPrompt: 'Analyze the provided text and extract key insights.'
  })
  .with('openai:gpt-4o')

const result = await workflow.run({
  apiKeys: {
    openai: process.env.OPENAI_API_KEY
  },
  input: {
    text: 'Your content here...'
  }
})
```
Rate Limits and Best Practices
- Default rate limits vary by tier (Free: 3 RPM, Tier 1: 500 RPM, Tier 5: 10,000 RPM)
- Use `gpt-4o-mini` for high-volume, cost-sensitive tasks
- Implement retry logic for transient errors (RelayPlane handles this automatically)
- Cache responses for repeated queries to reduce API calls
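The caching advice above can be sketched as a thin memoizing wrapper around any async model call. This is a minimal illustration, not part of the SDK; `callModel` and the prompt-keyed scheme are our own assumptions, and a production cache would add a TTL and a size bound (e.g. LRU).

```typescript
// Minimal in-memory response cache keyed by prompt text.
const responseCache = new Map<string, string>()

async function cachedCall(
  prompt: string,
  callModel: (p: string) => Promise<string>
): Promise<string> {
  const hit = responseCache.get(prompt)
  if (hit !== undefined) return hit // repeated query: no API call made
  const result = await callModel(prompt)
  responseCache.set(prompt, result)
  return result
}
```

Keying on the raw prompt only makes sense for deterministic, repeated queries; include the model name and temperature in the key if those vary.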
Best Use Cases
- General Purpose: Content generation, summarization, Q&A
- Vision Tasks: Image analysis, document processing, visual Q&A
- Code Generation: Writing, debugging, and explaining code
Anthropic
Anthropic's Claude models are known for their strong reasoning, analysis capabilities, and industry-leading context windows up to 200,000 tokens.
Available Models
- `claude-3-5-sonnet` - Best balance of intelligence and speed (200k context)
- `claude-3-opus` - Most capable for complex reasoning tasks (200k context)
- `claude-3-haiku` - Fastest and most cost-effective (200k context)
Configuration
```typescript
import { relay } from '@relayplane/sdk'

const workflow = relay
  .workflow('anthropic-example')
  .step('deep-analysis', {
    systemPrompt: `You are an expert analyst. Thoroughly analyze
the following document and provide detailed insights with
supporting evidence from the text.`
  })
  .with('anthropic:claude-3-5-sonnet')

const result = await workflow.run({
  apiKeys: {
    anthropic: process.env.ANTHROPIC_API_KEY
  },
  input: {
    document: longDocument // Can be up to 200k tokens
  }
})
```
Context Windows
All Claude 3 models support up to 200,000 tokens of context - equivalent to approximately 150,000 words or a 500-page book. This makes Claude ideal for:
- Analyzing entire codebases
- Processing lengthy legal or financial documents
- Maintaining long conversation histories
- Synthesizing information from multiple sources
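As a rough planning aid, you can check whether a document is likely to fit in the 200k window before sending it. This sketch uses the common heuristic of ~4 characters per token for English text; it is an approximation, not an exact tokenizer.

```typescript
// Rough token estimate: ~4 characters per token for English prose.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// True if the text likely fits within the given context window.
function fitsInContext(text: string, contextWindow = 200_000): boolean {
  return estimateTokens(text) <= contextWindow
}
```

For precise counts, use the provider's own tokenizer; the heuristic can be off by 2x for code or non-English text.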
Best Use Cases
- Deep Analysis: Research papers, legal contracts, financial reports
- Long Context: Book summarization, codebase analysis, document comparison
- Complex Reasoning: Multi-step problem solving, strategic planning
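One way to apply a tiered model strategy is a small helper that routes each step to a Claude model by its role. This is an illustrative sketch; the stage names and the mapping are our own conventions, not SDK features.

```typescript
type ClaudeModel =
  | 'anthropic:claude-3-haiku'
  | 'anthropic:claude-3-5-sonnet'
  | 'anthropic:claude-3-opus'

// Route cheap preprocessing to Haiku and reserve Opus for the
// hardest final-stage reasoning, balancing cost against quality.
function pickClaude(stage: 'preprocess' | 'analyze' | 'final'): ClaudeModel {
  switch (stage) {
    case 'preprocess': return 'anthropic:claude-3-haiku'
    case 'analyze': return 'anthropic:claude-3-5-sonnet'
    case 'final': return 'anthropic:claude-3-opus'
  }
}
```

The returned string can be passed straight to `.with()` in a workflow step.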
Tip: Use `claude-3-haiku` for initial processing steps, then pass results to `claude-3-5-sonnet` or `claude-3-opus` for final analysis. This optimizes cost while maintaining quality.
Google (Gemini)
Google's Gemini models offer strong multimodal capabilities and integration with Google Cloud services.
Available Models
- `gemini-pro` - Best for text-based tasks (32k context)
- `gemini-pro-vision` - Multimodal with image understanding
Configuration
```typescript
import { relay } from '@relayplane/sdk'

const workflow = relay
  .workflow('google-example')
  .step('process', {
    systemPrompt: 'Process the input and provide a structured response.'
  })
  .with('google:gemini-pro')

const result = await workflow.run({
  apiKeys: {
    google: process.env.GOOGLE_API_KEY
  },
  input: {
    content: 'Your content here...'
  }
})
```
Best Use Cases
- Multimodal Tasks: Image and text combined processing
- Google Cloud Integration: Workflows that leverage GCP services
- Search Enhancement: Content that benefits from Google's knowledge
xAI (Grok)
xAI's Grok models provide real-time knowledge and a distinctive approach to AI reasoning with fewer content restrictions.
Available Models
- `grok-beta` - Latest Grok model with real-time knowledge access
Configuration
```typescript
import { relay } from '@relayplane/sdk'

const workflow = relay
  .workflow('xai-example')
  .step('research', {
    systemPrompt: 'Research the topic and provide current information.'
  })
  .with('xai:grok-beta')

const result = await workflow.run({
  apiKeys: {
    xai: process.env.XAI_API_KEY
  },
  input: {
    topic: 'Latest developments in AI'
  }
})
```
Best Use Cases
- Real-time Information: Current events, trending topics, live data
- Research Tasks: Gathering and synthesizing current information
- Creative Content: Less restrictive content generation
Local (Ollama)
Run models locally using Ollama for privacy-sensitive data, offline operation, or cost optimization on high-volume tasks.
Setup Instructions
First, install Ollama and pull your desired model:
```bash
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull models
ollama pull llama3
ollama pull mistral
ollama pull codellama

# Start Ollama server (runs on http://localhost:11434)
ollama serve
```
Available Models
- `llama3` - Meta's latest open model, excellent general performance
- `mistral` - Fast and efficient for many tasks
- `codellama` - Optimized for code generation and analysis
- `mixtral` - Mixture of experts, good balance of speed/quality
Configuration
```typescript
import { relay } from '@relayplane/sdk'

const workflow = relay
  .workflow('local-example')
  .step('process', {
    systemPrompt: 'Process the sensitive data locally.'
  })
  .with('local:llama3')

// No API key needed for local models
const result = await workflow.run({
  input: {
    data: sensitiveData
  },
  // Optional: customize Ollama endpoint
  localEndpoint: 'http://localhost:11434'
})
```
Performance Considerations
- GPU Acceleration: An NVIDIA or AMD GPU significantly improves inference speed
- Memory Requirements: 7B models need ~8GB RAM, 13B ~16GB, 70B ~64GB
- Quantization: Use quantized models (Q4, Q5) for faster inference with minimal quality loss
- Concurrent Requests: Local models handle one request at a time by default
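The memory figures above follow from parameter count times bytes per weight, plus runtime overhead. Here is a back-of-the-envelope sketch; the 20% overhead factor is our assumption, and real requirements vary with context length, KV cache size, and the runtime.

```typescript
// Approximate RAM needed to load a model, in GB.
// bitsPerWeight: 16 for fp16, ~4.5 for Q4 quantization, ~5.5 for Q5.
function estimateModelRamGb(paramsBillions: number, bitsPerWeight: number): number {
  const weightBytes = paramsBillions * 1e9 * (bitsPerWeight / 8)
  const overhead = 1.2 // ~20% for KV cache and runtime buffers (assumption)
  return (weightBytes * overhead) / 1e9
}
```

For example, a 7B model at fp16 lands near 17GB, while Q4 quantization brings it under 5GB, which is why quantized builds are the default for consumer hardware.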
Multi-Provider Configuration
RelayPlane makes it easy to use multiple providers in a single workflow, allowing you to leverage the strengths of each model.
relay.configure() API
Set up default configuration for all workflows:
```typescript
import { relay } from '@relayplane/sdk'

// Configure defaults for all workflows
relay.configure({
  apiKeys: {
    openai: process.env.OPENAI_API_KEY,
    anthropic: process.env.ANTHROPIC_API_KEY,
    google: process.env.GOOGLE_API_KEY,
    xai: process.env.XAI_API_KEY
  },
  defaults: {
    retries: 3,
    timeout: 30000,
    temperature: 0.7
  },
  localEndpoint: 'http://localhost:11434'
})
```
Environment Variables
Store your API keys securely in environment variables:
```bash
# .env file
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
XAI_API_KEY=xai-...

# Optional: Custom Ollama endpoint
OLLAMA_ENDPOINT=http://localhost:11434
```
Per-Run API Key Overrides
Override configured API keys for specific workflow runs:
```typescript
const result = await workflow.run({
  // Override specific keys for this run
  apiKeys: {
    openai: customerApiKey, // Use customer's own key
    anthropic: process.env.ANTHROPIC_API_KEY // Use default
  },
  input: {
    content: 'Process this...'
  }
})
```
Multi-Provider Workflow Example
```typescript
import { relay } from '@relayplane/sdk'

const multiProviderWorkflow = relay
  .workflow('multi-provider-analysis')

  // Step 1: Fast initial processing with GPT-4o-mini
  .step('extract', {
    systemPrompt: 'Extract key data points from the document.'
  })
  .with('openai:gpt-4o-mini')

  // Step 2: Deep analysis with Claude
  .step('analyze', {
    systemPrompt: `Perform deep analysis on the extracted data:
    {{extract.output}}`
  })
  .with('anthropic:claude-3-5-sonnet')
  .depends('extract')

  // Step 3: Generate report with GPT-4o
  .step('report', {
    systemPrompt: `Generate a professional report based on:
    {{analyze.output}}`
  })
  .with('openai:gpt-4o')
  .depends('analyze')

  // Step 4: Privacy-sensitive summary with local model
  .step('internal-summary', {
    systemPrompt: 'Create an internal summary with confidential notes.'
  })
  .with('local:llama3')
  .depends('analyze')

const result = await multiProviderWorkflow.run({
  apiKeys: {
    openai: process.env.OPENAI_API_KEY,
    anthropic: process.env.ANTHROPIC_API_KEY
  },
  input: {
    document: documentContent
  }
})
```
Cost Comparison
Understanding pricing helps you optimize workflows for cost-effectiveness. Prices are per 1M tokens (as of early 2025).
Token Pricing by Provider
| Provider | Model | Input ($/1M) | Output ($/1M) |
|---|---|---|---|
| OpenAI | gpt-4o | $2.50 | $10.00 |
| OpenAI | gpt-4o-mini | $0.15 | $0.60 |
| Anthropic | claude-3-5-sonnet | $3.00 | $15.00 |
| Anthropic | claude-3-opus | $15.00 | $75.00 |
| Anthropic | claude-3-haiku | $0.25 | $1.25 |
| Google | gemini-pro | $0.50 | $1.50 |
| xAI | grok-beta | $5.00 | $15.00 |
| Local | llama3/mistral | $0.00 | $0.00 |
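The table translates directly into a per-run estimate: cost = (input tokens × input price + output tokens × output price) / 1M. A sketch with a few rates hard-coded from the table above; verify against current provider pricing before relying on it.

```typescript
// Prices in USD per 1M tokens, taken from the table above (early 2025).
const pricing: Record<string, { input: number; output: number }> = {
  'openai:gpt-4o': { input: 2.5, output: 10 },
  'openai:gpt-4o-mini': { input: 0.15, output: 0.6 },
  'anthropic:claude-3-5-sonnet': { input: 3, output: 15 },
  'anthropic:claude-3-haiku': { input: 0.25, output: 1.25 },
}

// Estimated cost in USD for one call with the given token counts.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricing[model]
  if (!p) throw new Error(`no pricing entry for ${model}`)
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000
}
```

For example, a gpt-4o-mini call with 100k input and 10k output tokens costs about $0.021, roughly 17x cheaper than the same call on gpt-4o.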
Model Capability Matrix
| Model | Context Window | Vision | Speed |
|---|---|---|---|
| gpt-4o | 128k | Yes | Fast |
| gpt-4o-mini | 128k | Yes | Very Fast |
| claude-3-5-sonnet | 200k | Yes | Fast |
| claude-3-opus | 200k | Yes | Moderate |
| claude-3-haiku | 200k | Yes | Very Fast |
| gemini-pro | 32k | No | Fast |
| gemini-pro-vision | 16k | Yes | Fast |
| grok-beta | 128k | No | Fast |
| llama3 | 8k | No | Varies |
Next Steps
- Quickstart Guide - Build your first workflow
- API Reference - Complete SDK documentation
- Example Workflows - Production-ready templates
- Core Concepts - DAG validation, error handling, retries