Providers Guide

Configure and integrate multiple AI providers in your RelayPlane workflows. Mix and match models from OpenAI, Anthropic, Google, xAI, Perplexity, and local deployments.

RelayPlane supports multi-provider workflows out of the box. You can use different providers for different steps, optimizing for cost, speed, or capability.

OpenAI

OpenAI provides the GPT-5 series with configurable reasoning, GPT-4.1, and the o-series reasoning models. The GPT-5 family features 1M-token context windows and strong reasoning for coding and agentic tasks.

Available Models

  • gpt-5.2 - Latest flagship with enhanced reasoning and 1M context
  • gpt-5.2-pro - Smarter and more precise responses for complex tasks
  • gpt-5.1 - Best model for coding and agentic tasks with configurable reasoning
  • gpt-5-mini - Faster, cost-efficient version of GPT-5 for well-defined tasks
  • gpt-5-nano - Fastest, most cost-efficient version of GPT-5
  • gpt-5 - Previous-generation reasoning model
  • gpt-4.1 - Smartest non-reasoning model with 1M context
  • o3 - Advanced reasoning model
  • o4-mini - Fast reasoning model

Configuration

openai-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY! }
  }
})

const workflow = relay
  .workflow('openai-example')
  .step('analyze')
  .with('openai:gpt-5.2')
  .prompt('Analyze the provided text and extract key insights: {{input.text}}')

const result = await workflow.run({ text: 'Your content here...' })

Best Use Cases

  • GPT-5.2: Latest flagship for complex reasoning, coding, and agentic tasks
  • GPT-5.2 Pro: Most precise responses, research, complex analysis
  • GPT-5 Nano: High-volume classification, simple transformations
  • GPT-4.1: General purpose, coding without reasoning overhead

Anthropic

Anthropic's Claude 4 series features hybrid thinking modes and industry-leading coding performance. Claude Opus 4.5 is the most intelligent, while Sonnet 4.5 offers the best balance of capability and speed.

Available Models

  • claude-opus-4-5-20251101 - Most intelligent, effort parameter for complex tasks
  • claude-sonnet-4-5-20250929 - Best coding model, strongest for agents
  • claude-haiku-4-5-20251001 - Fast and affordable for high-volume tasks
  • claude-opus-4-1-20250805 - Superior precision for agentic tasks
  • claude-sonnet-4-20250514 - Upgraded reasoning with hybrid thinking
  • claude-3-7-sonnet-20250219 - Extended thinking capabilities
  • claude-3-5-haiku-20241022 - Fast and affordable (legacy)

Configuration

anthropic-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY! }
  }
})

const analysisPrompt = `You are an expert analyst. Thoroughly analyze
the following document and provide detailed insights with
supporting evidence from the text.

Document: {{input.document}}`

const workflow = relay
  .workflow('anthropic-example')
  .step('deep-analysis')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .prompt(analysisPrompt)

// longDocument can be up to 200k tokens
const result = await workflow.run({ document: longDocument })

Best Use Cases

  • Opus 4.5: Most complex reasoning, research, strategic analysis
  • Sonnet 4.5: Coding, agentic workflows, computer use
  • Haiku 4.5: Fast preprocessing, high-volume tasks

Use claude-haiku-4-5-20251001 for initial processing steps, then pass results to claude-sonnet-4-5-20250929 for final analysis, as sketched below. This optimizes cost while maintaining quality.
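
Here's a minimal sketch of that pattern using the same chaining API as the examples in this guide; the step names and prompts are illustrative:

haiku-then-sonnet.ts
import { relay } from '@relayplane/sdk'

// Assumes anthropic was already configured via relay.configure (see above)
const pipeline = relay
  .workflow('haiku-then-sonnet')

  // Cheap, fast first pass with Haiku 4.5
  .step('preprocess')
  .with('anthropic:claude-haiku-4-5-20251001')
  .prompt('Extract the key claims and figures from: {{input.document}}')

  // Higher-quality final pass with Sonnet 4.5
  .step('final-analysis')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .depends('preprocess')
  .prompt('Analyze these extracted points in depth: {{preprocess.output}}')

// longDocument is a placeholder for your input text
const result = await pipeline.run({ document: longDocument })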

Google (Gemini)

Google's Gemini 3 and 2.5 series offer massive 1M+ token context windows and multimodal capabilities including text, image, video, audio, and PDF processing.

Available Models

  • gemini-3-pro - Most powerful Gemini with text, image, video, audio, PDF
  • gemini-2.5-pro - Advanced multimodal with 1M context
  • gemini-2.5-flash - Fast multimodal for general tasks
  • gemini-2.5-flash-lite - Ultra-efficient for high-frequency tasks
  • gemini-2.0-flash - Cost-effective multimodal

Configuration

google-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    google: { apiKey: process.env.GOOGLE_API_KEY! }
  }
})

const workflow = relay
  .workflow('google-example')
  .step('process')
  .with('google:gemini-2.5-flash')
  .prompt('Process the input and provide a structured response: {{input.content}}')

const result = await workflow.run({ content: 'Your content here...' })

Best Use Cases

  • Video/Audio Processing: Gemini 3 Pro supports native video and audio
  • Long Documents: 1M+ context for entire codebases or books (see the sketch below)
  • Cost Optimization: Gemini 2.5 Flash-Lite at $0.02/1M input tokens
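
For the long-document use case above, a minimal sketch; the API is the same as the configuration example, only the model and input size change (entireCodebase is a placeholder variable):

gemini-long-context.ts
import { relay } from '@relayplane/sdk'

// Assumes google was already configured via relay.configure (see above)
const workflow = relay
  .workflow('long-document-summary')
  .step('summarize')
  .with('google:gemini-2.5-pro') // 1M-token context window
  .prompt('Summarize the architecture and key modules of this codebase: {{input.code}}')

// entireCodebase is a placeholder for a very large string (up to ~1M tokens)
const result = await workflow.run({ code: entireCodebase })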

xAI (Grok)

xAI's Grok 4 models feature 256K context windows and advanced reasoning with planning capabilities. Grok 4 excels at real-time knowledge tasks.

Available Models

  • grok-4 - Latest flagship with reasoning and planning (256K context)
  • grok-4-fast - Faster variant of Grok 4
  • grok-3 - Previous generation flagship
  • grok-3-mini - Smaller, faster Grok for quick tasks

Configuration

xai-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    xai: { apiKey: process.env.XAI_API_KEY! }
  }
})

const workflow = relay
  .workflow('xai-example')
  .step('research')
  .with('xai:grok-4')
  .prompt('Research the topic and provide current information: {{input.topic}}')

const result = await workflow.run({ topic: 'Latest developments in AI' })

Best Use Cases

  • Real-time Information: Current events, trending topics, live data
  • Reasoning Tasks: Complex planning and analysis
  • Creative Content: Less restrictive content generation

Perplexity

Perplexity's Sonar models combine language understanding with real-time web search capabilities, making them ideal for research and information retrieval tasks.

Available Models

  • sonar-pro - Most capable Sonar model with advanced search
  • sonar - Balanced performance and cost
  • sonar-reasoning-pro - Enhanced reasoning with web search
  • sonar-reasoning - Reasoning capabilities with search
  • sonar-deep-research - Comprehensive research mode

Configuration

perplexity-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    perplexity: { apiKey: process.env.PERPLEXITY_API_KEY! }
  }
})

const workflow = relay
  .workflow('perplexity-example')
  .step('research')
  .with('perplexity:sonar-pro')
  .prompt('Research the following topic with current information: {{input.topic}}')

const result = await workflow.run({ topic: 'Latest AI developments 2025' })

Best Use Cases

  • Real-time Research: Topics requiring up-to-date web information
  • Fact Checking: Verify claims against current sources
  • Competitive Analysis: Market research with citations

Perplexity models include citations in their responses, making them excellent for research tasks where source verification matters.
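
A minimal fact-checking sketch: since the exact shape of citation metadata on the result object may vary by SDK version, this example simply asks the model to list its sources inline:

fact-check.ts
import { relay } from '@relayplane/sdk'

// Assumes perplexity was already configured via relay.configure (see above)
const factCheck = relay
  .workflow('fact-check')
  .step('verify')
  .with('perplexity:sonar-reasoning-pro')
  .prompt(
    'Verify the following claim against current web sources. ' +
    'State whether it is supported and list the source URLs you relied on: {{input.claim}}'
  )

const result = await factCheck.run({ claim: 'Company X shipped feature Y in Q3 2025' })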

Local (Ollama)

Run models locally using Ollama for privacy-sensitive data, offline operation, or cost optimization on high-volume tasks.

Use the local: prefix for all Ollama models; it routes requests to the Ollama server running on your machine. Examples: local:llama3.3, local:mistral

Setup Instructions

First, install Ollama and pull your desired model:

terminal
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull models
ollama pull llama3.3
ollama pull qwen2.5
ollama pull deepseek-r1
ollama pull mistral

# Start the Ollama server (runs on http://localhost:11434)
ollama serve
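
To confirm the models downloaded correctly, list what's installed:

terminal
ollama list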

Available Models

  • llama3.3 - Meta's 70B open model (text-only)
  • llama3.2 - Multimodal open-source model
  • qwen2.5 - Alibaba's powerful open model
  • deepseek-r1 - Advanced reasoning model
  • mistral - Fast and efficient for many tasks

Configuration

local-workflow.ts
import { relay } from '@relayplane/sdk'

// No API key needed for local models, but you can set a custom endpoint
relay.configure({
  providers: {
    local: { baseUrl: 'http://localhost:11434' } // Optional: customize the Ollama endpoint
  }
})

const workflow = relay
  .workflow('local-example')
  .step('process')
  .with('local:llama3.3')
  .prompt('Process the sensitive data locally: {{input.data}}')

const result = await workflow.run({ data: sensitiveData })

Local models are typically slower than cloud APIs. For production workloads requiring low latency, consider cloud providers or dedicated GPU infrastructure.
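
A common compromise is to pick the model per environment: local for development or sensitive data, a fast cloud model where latency matters. A minimal sketch (the NODE_ENV convention here is illustrative, not required by the SDK):

model-per-environment.ts
import { relay } from '@relayplane/sdk'

// Route to Ollama in development, to a fast cloud model in production
const model = process.env.NODE_ENV === 'production'
  ? 'anthropic:claude-haiku-4-5-20251001'
  : 'local:llama3.3'

const workflow = relay
  .workflow('env-aware-example')
  .step('process')
  .with(model)
  .prompt('Process the data: {{input.data}}')

const result = await workflow.run({ data: 'example input' })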

Multi-Provider Configuration

RelayPlane makes it easy to use multiple providers in a single workflow, allowing you to leverage the strengths of each model.

Multi-Provider Workflow Example

multi-provider-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure all providers once at app startup
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY! },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY! }
  }
})

const multiProviderWorkflow = relay
  .workflow('multi-provider-analysis')

  // Step 1: Fast initial processing with GPT-5-nano
  .step('extract')
  .with('openai:gpt-5-nano')
  .prompt('Extract key data points from the document: {{input.document}}')

  // Step 2: Deep analysis with Claude Sonnet 4.5
  .step('analyze')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .depends('extract')
  .prompt('Perform deep analysis on the extracted data: {{extract.output}}')

  // Step 3: Generate report with GPT-5.2
  .step('report')
  .with('openai:gpt-5.2')
  .depends('analyze')
  .prompt('Generate a professional report based on: {{analyze.output}}')

  // Step 4: Privacy-sensitive summary with local model
  .step('internal-summary')
  .with('local:llama3.3')
  .depends('analyze')
  .prompt('Create an internal summary with confidential notes based on: {{analyze.output}}')

const result = await multiProviderWorkflow.run({ document: documentContent })

Combine providers strategically: use fast/cheap models for preprocessing, powerful models for complex analysis, and local models for sensitive data.

Cost Comparison

Understanding pricing helps you optimize workflows for cost-effectiveness. Prices are per 1M tokens (as of December 2025).

Token Pricing by Provider

Provider    Model                 Input ($/1M)   Output ($/1M)
OpenAI      gpt-5.2               $5.00          $20.00
OpenAI      gpt-5.2-pro           $10.00         $40.00
OpenAI      gpt-5-mini            $1.00          $4.00
OpenAI      gpt-5-nano            $0.25          $1.00
OpenAI      gpt-4.1               $2.00          $8.00
Anthropic   claude-opus-4.5       $15.00         $75.00
Anthropic   claude-sonnet-4.5     $3.00          $15.00
Anthropic   claude-3.5-haiku      $0.80          $4.00
Google      gemini-3-pro          $1.25          $5.00
Google      gemini-2.5-flash      $0.075         $0.30
xAI         grok-4                $3.00          $15.00
Local       llama3.3 / qwen2.5    $0.00          $0.00
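
To see how these numbers play out, here is a rough back-of-the-envelope estimate for one run of a three-step workflow (nano extraction, Sonnet analysis, GPT-5.2 report); the ~10k input / ~2k output tokens per step are hypothetical counts:

cost-estimate.ts
// Prices per 1M tokens, taken from the table above
const pricePerMTok = {
  'openai:gpt-5-nano': { input: 0.25, output: 1.0 },
  'anthropic:claude-sonnet-4.5': { input: 3.0, output: 15.0 },
  'openai:gpt-5.2': { input: 5.0, output: 20.0 },
} as const

// Hypothetical token counts: ~10k in / ~2k out per step
const steps = [
  { model: 'openai:gpt-5-nano', inTok: 10_000, outTok: 2_000 },
  { model: 'anthropic:claude-sonnet-4.5', inTok: 10_000, outTok: 2_000 },
  { model: 'openai:gpt-5.2', inTok: 10_000, outTok: 2_000 },
] as const

const total = steps.reduce((sum, s) => {
  const p = pricePerMTok[s.model]
  return sum + (s.inTok / 1e6) * p.input + (s.outTok / 1e6) * p.output
}, 0)

console.log(`~$${total.toFixed(4)} per run`) // ≈ $0.1545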

Model Capability Matrix

Model               Context Window   Vision   Speed
gpt-5.2             1M               Yes      Fast
gpt-5.2-pro         1M               Yes      Medium
gpt-4.1             1M               Yes      Fast
claude-sonnet-4.5   200k             Yes      Fast
claude-opus-4.5     200k             Yes      Slow
gemini-3-pro        1M               Yes      Medium
gemini-2.5-flash    1M               Yes      Fast
grok-4              256k             Yes      Fast
llama3.3            128k             No       Varies

Prices are subject to change. Check provider websites for current pricing. Local models have no API costs but require compute resources.
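
The context-window column is often the deciding factor when routing. A quick sketch of picking a model by estimated input size (the chars/4 token estimate is a rough heuristic, not an exact tokenizer, and the thresholds come from the matrix above):

pick-by-context.ts
// Pick a model whose context window fits the input
function pickModel(text: string): string {
  const approxTokens = Math.ceil(text.length / 4) // rough heuristic
  if (approxTokens > 256_000) return 'google:gemini-2.5-flash' // 1M context
  if (approxTokens > 200_000) return 'xai:grok-4' // 256k context
  return 'anthropic:claude-sonnet-4-5-20250929' // 200k context
}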

Next Steps