Providers Guide
Configure and integrate multiple AI providers in your RelayPlane workflows. Mix and match models from OpenAI, Anthropic, Google, xAI, Perplexity, and local deployments.
OpenAI
OpenAI provides the GPT-5 series with configurable reasoning, GPT-4.1, and o-series reasoning models. The GPT-5 family features 1M token context windows and breakthrough reasoning capabilities for coding and agentic tasks.
Available Models
- `gpt-5.2`: New! Latest flagship with enhanced reasoning and 1M context
- `gpt-5.2-pro`: Smarter and more precise responses for complex tasks
- `gpt-5.1`: Best model for coding and agentic tasks with configurable reasoning
- `gpt-5-mini`: Faster, cost-efficient version of GPT-5 for well-defined tasks
- `gpt-5-nano`: Fastest, most cost-efficient version of GPT-5
- `gpt-5`: Previous intelligent reasoning model
- `gpt-4.1`: Smartest non-reasoning model with 1M context
- `o3`: Advanced reasoning model
- `o4-mini`: Fast reasoning model
Configuration
```typescript
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY! }
  }
})

const workflow = relay
  .workflow('openai-example')
  .step('analyze')
  .with('openai:gpt-5.2')
  .prompt('Analyze the provided text and extract key insights: {{input.text}}')

const result = await workflow.run({ text: 'Your content here...' })
```

Best Use Cases
- GPT-5.2: Latest flagship for complex reasoning, coding, and agentic tasks
- GPT-5.2 Pro: Most precise responses, research, complex analysis
- GPT-5 Nano: High-volume classification, simple transformations
- GPT-4.1: General purpose, coding without reasoning overhead
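The `{{input.text}}` placeholder in a prompt is filled from the object passed to `run()`. As a rough illustration of that templating convention only (this is not the SDK's actual implementation, just a minimal sketch):

```typescript
// Minimal sketch of {{path}} template substitution (illustrative only,
// not RelayPlane's real templating engine).
function fillTemplate(template: string, context: Record<string, unknown>): string {
  return template.replace(/\{\{([\w.]+)\}\}/g, (_match, path: string) => {
    // Walk dotted paths like "input.text" through the context object
    const value = path.split('.').reduce<unknown>(
      (obj, key) => (obj as Record<string, unknown> | undefined)?.[key],
      context
    )
    return value === undefined ? '' : String(value)
  })
}
```

For example, `fillTemplate('Analyze: {{input.text}}', { input: { text: 'hello' } })` yields `'Analyze: hello'`.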
Anthropic
Anthropic's Claude 4 series features hybrid thinking modes and industry-leading coding performance. Claude Opus 4.5 is the most intelligent, while Sonnet 4.5 offers the best balance of capability and speed.
Available Models
- `claude-opus-4-5-20251101`: Most intelligent, effort parameter for complex tasks
- `claude-sonnet-4-5-20250929`: Best coding model, strongest for agents
- `claude-haiku-4-5-20251001`: Fast and affordable for high-volume tasks
- `claude-opus-4-1-20250805`: Superior precision for agentic tasks
- `claude-sonnet-4-20250514`: Upgraded reasoning with hybrid thinking
- `claude-3-7-sonnet-20250219`: Extended thinking capabilities
- `claude-3-5-haiku-20241022`: Fast and affordable (legacy)
Configuration
```typescript
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY! }
  }
})

const analysisPrompt = `You are an expert analyst. Thoroughly analyze
the following document and provide detailed insights with
supporting evidence from the text.

Document: {{input.document}}`

const workflow = relay
  .workflow('anthropic-example')
  .step('deep-analysis')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .prompt(analysisPrompt)

// longDocument can be up to 200k tokens
const result = await workflow.run({ document: longDocument })
```

Best Use Cases
- Opus 4.5: Most complex reasoning, research, strategic analysis
- Sonnet 4.5: Coding, agentic workflows, computer use
- Claude 3.5 Haiku: Fast preprocessing, high-volume tasks
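A common cost pattern with these models is a two-stage cascade: a cheap model condenses the input, and only that condensed output reaches the stronger model. Sketched generically below, independent of the SDK; the `ModelCall` functions are stand-ins, not real provider calls:

```typescript
// Generic two-stage cascade: cheap first pass, strong model for final analysis.
// ModelCall is a stand-in signature, not a real RelayPlane type.
type ModelCall = (prompt: string) => Promise<string>

async function cascade(
  cheap: ModelCall,   // e.g. a Haiku-class model for extraction
  strong: ModelCall,  // e.g. a Sonnet-class model for analysis
  document: string
): Promise<string> {
  const keyPoints = await cheap(`Extract key points:\n${document}`)
  // Only the condensed extraction is billed at the stronger model's rate
  return strong(`Analyze in depth:\n${keyPoints}`)
}
```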
claude-haiku-4-5-20251001 for initial processing steps, then pass results to claude-sonnet-4-5-20250929 for final analysis. This optimizes cost while maintaining quality.Google (Gemini)
Google's Gemini 3 and 2.5 series offer massive 1M+ token context windows and multimodal capabilities including text, image, video, audio, and PDF processing.
Available Models
- `gemini-3-pro`: Most powerful Gemini with text, image, video, audio, PDF
- `gemini-2.5-pro`: Advanced multimodal with 1M context
- `gemini-2.5-flash`: Fast multimodal for general tasks
- `gemini-2.5-flash-lite`: Ultra-efficient for high-frequency tasks
- `gemini-2.0-flash`: Cost-effective multimodal
Configuration
```typescript
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    google: { apiKey: process.env.GOOGLE_API_KEY! }
  }
})

const workflow = relay
  .workflow('google-example')
  .step('process')
  .with('google:gemini-2.5-flash')
  .prompt('Process the input and provide a structured response: {{input.content}}')

const result = await workflow.run({ content: 'Your content here...' })
```

Best Use Cases
- Video/Audio Processing: Gemini 3 Pro supports native video and audio
- Long Documents: 1M+ context for entire codebases or books
- Cost Optimization: Gemini 2.5 Flash-Lite at $0.02/1M input tokens
xAI (Grok)
xAI's Grok 4 models feature 256K context windows and advanced reasoning with planning capabilities. Grok 4 excels at real-time knowledge tasks.
Available Models
- `grok-4`: Latest flagship with reasoning and planning (256K context)
- `grok-4-fast`: Faster variant of Grok 4
- `grok-3`: Previous generation flagship
- `grok-3-mini`: Smaller, faster Grok for quick tasks
Configuration
```typescript
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    xai: { apiKey: process.env.XAI_API_KEY! }
  }
})

const workflow = relay
  .workflow('xai-example')
  .step('research')
  .with('xai:grok-4')
  .prompt('Research the topic and provide current information: {{input.topic}}')

const result = await workflow.run({ topic: 'Latest developments in AI' })
```

Best Use Cases
- Real-time Information: Current events, trending topics, live data
- Reasoning Tasks: Complex planning and analysis
- Creative Content: Less restrictive content generation
Perplexity
Perplexity's Sonar models combine language understanding with real-time web search capabilities, making them ideal for research and information retrieval tasks.
Available Models
- `sonar-pro`: Most capable Sonar model with advanced search
- `sonar`: Balanced performance and cost
- `sonar-reasoning-pro`: Enhanced reasoning with web search
- `sonar-reasoning`: Reasoning capabilities with search
- `sonar-deep-research`: Comprehensive research mode
Configuration
```typescript
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    perplexity: { apiKey: process.env.PERPLEXITY_API_KEY! }
  }
})

const workflow = relay
  .workflow('perplexity-example')
  .step('research')
  .with('perplexity:sonar-pro')
  .prompt('Research the following topic with current information: {{input.topic}}')

const result = await workflow.run({ topic: 'Latest AI developments 2025' })
```

Best Use Cases
- Real-time Research: Topics requiring up-to-date web information
- Fact Checking: Verify claims against current sources
- Competitive Analysis: Market research with citations
Local (Ollama)
Run models locally using Ollama for privacy-sensitive data, offline operation, or cost optimization on high-volume tasks.
Note: Use the `local:` prefix for all Ollama models. The prefix refers to the Ollama server running locally on your machine. Examples: `local:llama3.3`, `local:mistral`.

Setup Instructions
First, install Ollama and pull your desired model:
```shell
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull models
ollama pull llama3.3
ollama pull qwen2.5
ollama pull deepseek-r1
ollama pull mistral

# Start Ollama server (runs on http://localhost:11434)
ollama serve
```

Available Models
- `llama3.3`: Meta's latest 70B open model with vision
- `llama3.2`: Multimodal open-source model
- `qwen2.5`: Alibaba's powerful open model
- `deepseek-r1`: Advanced reasoning model
- `mistral`: Fast and efficient for many tasks
Configuration
```typescript
import { relay } from '@relayplane/sdk'

// No API key needed for local models, but you can set a custom endpoint
relay.configure({
  providers: {
    local: { baseUrl: 'http://localhost:11434' } // Optional: customize Ollama endpoint
  }
})

const workflow = relay
  .workflow('local-example')
  .step('process')
  .with('local:llama3.3')
  .prompt('Process the sensitive data locally: {{input.data}}')

const result = await workflow.run({ data: sensitiveData })
```

Multi-Provider Configuration
RelayPlane makes it easy to use multiple providers in a single workflow, allowing you to leverage the strengths of each model.
Multi-Provider Workflow Example
```typescript
import { relay } from '@relayplane/sdk'

// Configure all providers once at app startup
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY! },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY! }
  }
})

const multiProviderWorkflow = relay
  .workflow('multi-provider-analysis')

  // Step 1: Fast initial processing with GPT-5-nano
  .step('extract')
  .with('openai:gpt-5-nano')
  .prompt('Extract key data points from the document: {{input.document}}')

  // Step 2: Deep analysis with Claude Sonnet 4.5
  .step('analyze')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .depends('extract')
  .prompt('Perform deep analysis on the extracted data: {{extract.output}}')

  // Step 3: Generate report with GPT-5.2
  .step('report')
  .with('openai:gpt-5.2')
  .depends('analyze')
  .prompt('Generate a professional report based on: {{analyze.output}}')

  // Step 4: Privacy-sensitive summary with local model
  .step('internal-summary')
  .with('local:llama3.3')
  .depends('analyze')
  .prompt('Create an internal summary with confidential notes based on: {{analyze.output}}')

const result = await multiProviderWorkflow.run({ document: documentContent })
```

Cost Comparison
Understanding pricing helps you optimize workflows for cost-effectiveness. Prices are per 1M tokens (as of December 2025).
Token Pricing by Provider
| Provider | Model | Input ($/1M) | Output ($/1M) |
|---|---|---|---|
| OpenAI | gpt-5.2 | $5.00 | $20.00 |
| OpenAI | gpt-5.2-pro | $10.00 | $40.00 |
| OpenAI | gpt-5-mini | $1.00 | $4.00 |
| OpenAI | gpt-5-nano | $0.25 | $1.00 |
| OpenAI | gpt-4.1 | $2.00 | $8.00 |
| Anthropic | claude-opus-4.5 | $15.00 | $75.00 |
| Anthropic | claude-sonnet-4.5 | $3.00 | $15.00 |
| Anthropic | claude-3.5-haiku | $0.80 | $4.00 |
| Google | gemini-3-pro | $1.25 | $5.00 |
| Google | gemini-2.5-flash | $0.075 | $0.30 |
| xAI | grok-4 | $3.00 | $15.00 |
| Local | llama3.3/qwen2.5 | $0.00 | $0.00 |
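Given the table above, per-request cost can be estimated from token counts. A small helper sketch follows; the prices are hard-coded from the table and should be verified against current provider pricing before use:

```typescript
// Cost estimate in USD, using per-1M-token prices from the table above
// (subset shown; prices are a snapshot, not live data).
const PRICING: Record<string, { input: number; output: number }> = {
  'gpt-5.2': { input: 5.0, output: 20.0 },
  'gpt-5-nano': { input: 0.25, output: 1.0 },
  'claude-sonnet-4.5': { input: 3.0, output: 15.0 },
  'gemini-2.5-flash': { input: 0.075, output: 0.3 },
}

function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model]
  if (!p) throw new Error(`No pricing data for ${model}`)
  return (inputTokens / 1_000_000) * p.input + (outputTokens / 1_000_000) * p.output
}
```

For example, a gpt-5-nano call with 1M input and 1M output tokens costs $0.25 + $1.00 = $1.25, roughly 16x cheaper than the same volume through gpt-5.2.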
Model Capability Matrix
| Model | Context Window | Vision | Speed |
|---|---|---|---|
| gpt-5.2 | 1M | Yes | Fast |
| gpt-5.2-pro | 1M | Yes | Medium |
| gpt-4.1 | 1M | Yes | Fast |
| claude-sonnet-4.5 | 200k | Yes | Fast |
| claude-opus-4.5 | 200k | Yes | Slow |
| gemini-3-pro | 1M | Yes | Medium |
| gemini-2.5-flash | 1M | Yes | Fast |
| grok-4 | 256k | Yes | Fast |
| llama3.3 | 128k | Yes | Varies |
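The context window column determines which models can take a given input in one pass. A sketch of that fit check (window sizes hard-coded from the matrix above; token counts are approximate and tokenizer-dependent):

```typescript
// Context windows in tokens, taken from the capability matrix above (subset).
const CONTEXT_WINDOW: Record<string, number> = {
  'gpt-5.2': 1_000_000,
  'claude-sonnet-4.5': 200_000,
  'grok-4': 256_000,
  'llama3.3': 128_000,
}

// Return models whose window fits the prompt plus a reserve for the reply.
function modelsThatFit(promptTokens: number, replyReserve = 8_000): string[] {
  return Object.entries(CONTEXT_WINDOW)
    .filter(([, window]) => promptTokens + replyReserve <= window)
    .map(([model]) => model)
}
```

For a ~150k-token prompt, `modelsThatFit(150_000)` rules out llama3.3 but keeps the 200k+ models.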
Next Steps
- Quickstart Guide - Build your first workflow
- API Reference - Complete SDK documentation
- Example Workflows - Production-ready templates
- Core Concepts - DAG validation, error handling, retries