Providers Guide

Configure and integrate multiple AI providers in your RelayPlane workflows. Mix and match models from OpenAI, Anthropic, Google, xAI, Perplexity, and local deployments.

RelayPlane supports multi-provider workflows out of the box. You can use different providers for different steps, optimizing for cost, speed, or capability.

OpenAI

OpenAI provides the GPT-5 series with configurable reasoning, GPT-4.1, and the o-series reasoning models. The GPT-5 family features 1M-token context windows and strong reasoning for coding and agentic tasks.

Available Models

  • gpt-5.2 - Latest flagship with enhanced reasoning and 1M context
  • gpt-5.2-pro - Smarter and more precise responses for complex tasks
  • gpt-5.1 - Best model for coding and agentic tasks with configurable reasoning
  • gpt-5-mini - Faster, cost-efficient version of GPT-5 for well-defined tasks
  • gpt-5-nano - Fastest, most cost-efficient version of GPT-5
  • gpt-5 - Previous-generation reasoning model
  • gpt-4.1 - Smartest non-reasoning model with 1M context
  • o3 - Advanced reasoning model
  • o4-mini - Fast reasoning model

Configuration

openai-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY! }
  }
})

const workflow = relay
  .workflow('openai-example')
  .step('analyze')
  .with('openai:gpt-5.2')
  .prompt('Analyze the provided text and extract key insights: {{input.text}}')

const result = await workflow.run({ text: 'Your content here...' })

Best Use Cases

  • GPT-5.2: Latest flagship for complex reasoning, coding, and agentic tasks
  • GPT-5.2 Pro: Most precise responses, research, complex analysis
  • GPT-5 Nano: High-volume classification, simple transformations
  • GPT-4.1: General purpose, coding without reasoning overhead

Anthropic

Anthropic's Claude 4 series features hybrid thinking modes and industry-leading coding performance. Claude Opus 4.5 is the most intelligent, while Sonnet 4.5 offers the best balance of capability and speed.

Available Models

  • claude-opus-4-5-20251101 - Most intelligent, effort parameter for complex tasks
  • claude-sonnet-4-5-20250929 - Best coding model, strongest for agents
  • claude-haiku-4-5-20251001 - Fast and affordable for high-volume tasks
  • claude-opus-4-1-20250805 - Superior precision for agentic tasks
  • claude-sonnet-4-20250514 - Upgraded reasoning with hybrid thinking
  • claude-3-7-sonnet-20250219 - Extended thinking capabilities
  • claude-3-5-haiku-20241022 - Fast and affordable (legacy)

Configuration

anthropic-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY! }
  }
})

const analysisPrompt = `You are an expert analyst. Thoroughly analyze
the following document and provide detailed insights with
supporting evidence from the text.

Document: {{input.document}}`

const workflow = relay
  .workflow('anthropic-example')
  .step('deep-analysis')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .prompt(analysisPrompt)

// longDocument can be up to 200k tokens
const result = await workflow.run({ document: longDocument })

Best Use Cases

  • Opus 4.5: Most complex reasoning, research, strategic analysis
  • Sonnet 4.5: Coding, agentic workflows, computer use
  • Haiku 4.5: Fast preprocessing, high-volume tasks

Use claude-haiku-4-5-20251001 for initial processing steps, then pass results to claude-sonnet-4-5-20250929 for final analysis, as sketched below. This optimizes cost while maintaining quality.
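
Here's a minimal sketch of that pattern using the same chaining API as the examples in this guide; the step names and prompts are illustrative:

haiku-then-sonnet.ts
import { relay } from '@relayplane/sdk'

// Assumes anthropic was already configured via relay.configure (see above)
const pipeline = relay
  .workflow('haiku-then-sonnet')

  // Cheap, fast first pass with Haiku 4.5
  .step('preprocess')
  .with('anthropic:claude-haiku-4-5-20251001')
  .prompt('Extract the key claims and figures from: {{input.document}}')

  // Higher-quality final pass with Sonnet 4.5
  .step('final-analysis')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .depends('preprocess')
  .prompt('Analyze these extracted points in depth: {{preprocess.output}}')

// longDocument is a placeholder for your input text
const result = await pipeline.run({ document: longDocument })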

Google (Gemini)

Google's Gemini 3 and 2.5 series offer massive 1M+ token context windows and multimodal capabilities including text, image, video, audio, and PDF processing.

Available Models

  • gemini-3-pro - Most powerful Gemini with text, image, video, audio, PDF
  • gemini-2.5-pro - Advanced multimodal with 1M context
  • gemini-2.5-flash - Fast multimodal for general tasks
  • gemini-2.5-flash-lite - Ultra-efficient for high-frequency tasks
  • gemini-2.0-flash - Cost-effective multimodal

Configuration

google-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    google: { apiKey: process.env.GOOGLE_API_KEY! }
  }
})

const workflow = relay
  .workflow('google-example')
  .step('process')
  .with('google:gemini-2.5-flash')
  .prompt('Process the input and provide a structured response: {{input.content}}')

const result = await workflow.run({ content: 'Your content here...' })

Best Use Cases

  • Video/Audio Processing: Gemini 3 Pro supports native video and audio
  • Long Documents: 1M+ context for entire codebases or books (see the sketch below)
  • Cost Optimization: Gemini 2.5 Flash-Lite at $0.02/1M input tokens
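
For the long-document use case above, a minimal sketch; the API is the same as the configuration example, only the model and input size change (entireCodebase is a placeholder variable):

gemini-long-context.ts
import { relay } from '@relayplane/sdk'

// Assumes google was already configured via relay.configure (see above)
const workflow = relay
  .workflow('long-document-summary')
  .step('summarize')
  .with('google:gemini-2.5-pro') // 1M-token context window
  .prompt('Summarize the architecture and key modules of this codebase: {{input.code}}')

// entireCodebase is a placeholder for a very large string (up to ~1M tokens)
const result = await workflow.run({ code: entireCodebase })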

xAI (Grok)

xAI's Grok 4 models feature 256K context windows and advanced reasoning with planning capabilities. Grok 4 excels at real-time knowledge tasks.

Available Models

  • grok-4 - Latest flagship with reasoning and planning (256K context)
  • grok-4-fast - Faster variant of Grok 4
  • grok-3 - Previous generation flagship
  • grok-3-mini - Smaller, faster Grok for quick tasks

Configuration

xai-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    xai: { apiKey: process.env.XAI_API_KEY! }
  }
})

const workflow = relay
  .workflow('xai-example')
  .step('research')
  .with('xai:grok-4')
  .prompt('Research the topic and provide current information: {{input.topic}}')

const result = await workflow.run({ topic: 'Latest developments in AI' })

Best Use Cases

  • Real-time Information: Current events, trending topics, live data
  • Reasoning Tasks: Complex planning and analysis
  • Creative Content: Less restrictive content generation

Perplexity

Perplexity's Sonar models combine language understanding with real-time web search capabilities, making them ideal for research and information retrieval tasks.

Available Models

  • sonar-pro - Most capable Sonar model with advanced search
  • sonar - Balanced performance and cost
  • sonar-reasoning-pro - Enhanced reasoning with web search
  • sonar-reasoning - Reasoning capabilities with search
  • sonar-deep-research - Comprehensive research mode

Configuration

perplexity-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure your provider (do this once at app startup)
relay.configure({
  providers: {
    perplexity: { apiKey: process.env.PERPLEXITY_API_KEY! }
  }
})

const workflow = relay
  .workflow('perplexity-example')
  .step('research')
  .with('perplexity:sonar-pro')
  .prompt('Research the following topic with current information: {{input.topic}}')

const result = await workflow.run({ topic: 'Latest AI developments 2025' })

Best Use Cases

  • Real-time Research: Topics requiring up-to-date web information
  • Fact Checking: Verify claims against current sources
  • Competitive Analysis: Market research with citations

Perplexity models include citations in their responses, making them excellent for research tasks where source verification matters.
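
A minimal fact-checking sketch: since the exact shape of citation metadata on the result object may vary by SDK version, this example simply asks the model to list its sources inline:

fact-check.ts
import { relay } from '@relayplane/sdk'

// Assumes perplexity was already configured via relay.configure (see above)
const factCheck = relay
  .workflow('fact-check')
  .step('verify')
  .with('perplexity:sonar-reasoning-pro')
  .prompt(
    'Verify the following claim against current web sources. ' +
    'State whether it is supported and list the source URLs you relied on: {{input.claim}}'
  )

const result = await factCheck.run({ claim: 'Company X shipped feature Y in Q3 2025' })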

Local (Ollama)

Run models locally using Ollama for privacy-sensitive data, offline operation, or cost optimization on high-volume tasks.

Use the local: prefix for all Ollama models; it routes requests to the Ollama server running on your machine. Examples: local:llama3.3, local:mistral

Setup Instructions

First, install Ollama and pull your desired model:

terminal
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull models
ollama pull llama3.3
ollama pull qwen2.5
ollama pull deepseek-r1
ollama pull mistral

# Start the Ollama server (runs on http://localhost:11434)
ollama serve
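
To confirm the models downloaded correctly, list what's installed:

terminal
ollama list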

Available Models

  • llama3.3 - Meta's 70B open model (text-only)
  • llama3.2 - Multimodal open-source model
  • qwen2.5 - Alibaba's powerful open model
  • deepseek-r1 - Advanced reasoning model
  • mistral - Fast and efficient for many tasks

Configuration

local-workflow.ts
import { relay } from '@relayplane/sdk'

// No API key needed for local models, but you can set a custom endpoint
relay.configure({
  providers: {
    local: { baseUrl: 'http://localhost:11434' } // Optional: customize the Ollama endpoint
  }
})

const workflow = relay
  .workflow('local-example')
  .step('process')
  .with('local:llama3.3')
  .prompt('Process the sensitive data locally: {{input.data}}')

const result = await workflow.run({ data: sensitiveData })

Local models are typically slower than cloud APIs. For production workloads requiring low latency, consider cloud providers or dedicated GPU infrastructure.
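
A common compromise is to pick the model per environment: local for development or sensitive data, a fast cloud model where latency matters. A minimal sketch (the NODE_ENV convention here is illustrative, not required by the SDK):

model-per-environment.ts
import { relay } from '@relayplane/sdk'

// Route to Ollama in development, to a fast cloud model in production
const model = process.env.NODE_ENV === 'production'
  ? 'anthropic:claude-haiku-4-5-20251001'
  : 'local:llama3.3'

const workflow = relay
  .workflow('env-aware-example')
  .step('process')
  .with(model)
  .prompt('Process the data: {{input.data}}')

const result = await workflow.run({ data: 'example input' })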

Multi-Provider Configuration

RelayPlane makes it easy to use multiple providers in a single workflow, allowing you to leverage the strengths of each model.

Multi-Provider Workflow Example

multi-provider-workflow.ts
import { relay } from '@relayplane/sdk'

// Configure all providers once at app startup
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY! },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY! }
  }
})

const multiProviderWorkflow = relay
  .workflow('multi-provider-analysis')

  // Step 1: Fast initial processing with GPT-5-nano
  .step('extract')
  .with('openai:gpt-5-nano')
  .prompt('Extract key data points from the document: {{input.document}}')

  // Step 2: Deep analysis with Claude Sonnet 4.5
  .step('analyze')
  .with('anthropic:claude-sonnet-4-5-20250929')
  .depends('extract')
  .prompt('Perform deep analysis on the extracted data: {{extract.output}}')

  // Step 3: Generate report with GPT-5.2
  .step('report')
  .with('openai:gpt-5.2')
  .depends('analyze')
  .prompt('Generate a professional report based on: {{analyze.output}}')

  // Step 4: Privacy-sensitive summary with local model
  .step('internal-summary')
  .with('local:llama3.3')
  .depends('analyze')
  .prompt('Create an internal summary with confidential notes based on: {{analyze.output}}')

const result = await multiProviderWorkflow.run({ document: documentContent })

Combine providers strategically: use fast/cheap models for preprocessing, powerful models for complex analysis, and local models for sensitive data.

Cost Comparison

Understanding pricing helps you optimize workflows for cost-effectiveness. Prices are per 1M tokens (as of December 2025).

Token Pricing by Provider

Provider    Model                 Input ($/1M)   Output ($/1M)
OpenAI      gpt-5.2               $5.00          $20.00
OpenAI      gpt-5.2-pro           $10.00         $40.00
OpenAI      gpt-5-mini            $1.00          $4.00
OpenAI      gpt-5-nano            $0.25          $1.00
OpenAI      gpt-4.1               $2.00          $8.00
Anthropic   claude-opus-4.5       $15.00         $75.00
Anthropic   claude-sonnet-4.5     $3.00          $15.00
Anthropic   claude-3.5-haiku      $0.80          $4.00
Google      gemini-3-pro          $1.25          $5.00
Google      gemini-2.5-flash      $0.075         $0.30
xAI         grok-4                $3.00          $15.00
Local       llama3.3 / qwen2.5    $0.00          $0.00
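
To see how these numbers play out, here is a rough back-of-the-envelope estimate for one run of a three-step workflow (nano extraction, Sonnet analysis, GPT-5.2 report); the ~10k input / ~2k output tokens per step are hypothetical counts:

cost-estimate.ts
// Prices per 1M tokens, taken from the table above
const pricePerMTok = {
  'openai:gpt-5-nano': { input: 0.25, output: 1.0 },
  'anthropic:claude-sonnet-4.5': { input: 3.0, output: 15.0 },
  'openai:gpt-5.2': { input: 5.0, output: 20.0 },
} as const

// Hypothetical token counts: ~10k in / ~2k out per step
const steps = [
  { model: 'openai:gpt-5-nano', inTok: 10_000, outTok: 2_000 },
  { model: 'anthropic:claude-sonnet-4.5', inTok: 10_000, outTok: 2_000 },
  { model: 'openai:gpt-5.2', inTok: 10_000, outTok: 2_000 },
] as const

const total = steps.reduce((sum, s) => {
  const p = pricePerMTok[s.model]
  return sum + (s.inTok / 1e6) * p.input + (s.outTok / 1e6) * p.output
}, 0)

console.log(`~$${total.toFixed(4)} per run`) // ≈ $0.1545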

Model Capability Matrix

Model               Context Window   Vision   Speed
gpt-5.2             1M               Yes      Fast
gpt-5.2-pro         1M               Yes      Medium
gpt-4.1             1M               Yes      Fast
claude-sonnet-4.5   200k             Yes      Fast
claude-opus-4.5     200k             Yes      Slow
gemini-3-pro        1M               Yes      Medium
gemini-2.5-flash    1M               Yes      Fast
grok-4              256k             Yes      Fast
llama3.3            128k             No       Varies

Prices are subject to change. Check provider websites for current pricing. Local models have no API costs but require compute resources.
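
The context-window column is often the deciding factor when routing. A quick sketch of picking a model by estimated input size (the chars/4 token estimate is a rough heuristic, not an exact tokenizer, and the thresholds come from the matrix above):

pick-by-context.ts
// Pick a model whose context window fits the input
function pickModel(text: string): string {
  const approxTokens = Math.ceil(text.length / 4) // rough heuristic
  if (approxTokens > 256_000) return 'google:gemini-2.5-flash' // 1M context
  if (approxTokens > 200_000) return 'xai:grok-4' // 256k context
  return 'anthropic:claude-sonnet-4-5-20250929' // 200k context
}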

Next Steps