# Model Fallback
Automatic failover to backup models when your primary model fails.
## Overview
Model fallback ensures your workflows keep running even when a provider is down or rate-limited. Chain multiple models together and RelayPlane will automatically try the next one if the previous fails.
## Basic Usage

Use the `.fallback()` method to chain backup models:
```typescript
import { relay } from "@relayplane/sdk";

const result = await relay
  .workflow("resilient-analysis")
  .step("analyze")
  .with("openai:gpt-4o")
  .fallback("anthropic:claude-sonnet-4-20250514")
  .fallback("openai:gpt-4o-mini")
  .prompt("Analyze this document: {{input.text}}")
  .run({ text: documentContent });
```

## How It Works
The fallback chain operates with retry-then-fallback semantics:
1. Try the primary model with its configured retries (exponential backoff).
2. If all retries fail, move to the first fallback model.
3. Try that model with its own retry attempts.
4. Continue through the chain until one model succeeds or all models are exhausted.
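The steps above can be sketched as a simple loop. This is an illustrative model of the retry-then-fallback semantics, not the SDK's internals; `callModel` is a hypothetical stand-in for a provider call.

```typescript
// Illustrative sketch of retry-then-fallback (not RelayPlane source code).
type Attempt = { model: string; attempt: number };

async function runWithFallback(
  models: string[],
  maxRetries: number,
  callModel: (model: string) => Promise<string>, // hypothetical provider call
  log: Attempt[] = [],                           // records every attempt made
): Promise<string> {
  for (const model of models) {
    // Each model gets one initial attempt plus `maxRetries` retries.
    for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
      log.push({ model, attempt });
      try {
        return await callModel(model);
      } catch {
        // A real implementation would apply exponential backoff here
        // before the next attempt.
      }
    }
    // Retries exhausted for this model: fall through to the next in the chain.
  }
  throw new Error("All models in the fallback chain failed");
}
```

With `maxRetries: 2`, a failing primary consumes 3 attempts before the first fallback is tried, matching the trace below.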
```
Primary: openai:gpt-4o
  ├── Attempt 1 → Rate limit (429)
  ├── Attempt 2 → Rate limit (429)
  └── Attempt 3 → Still failing, move to fallback

Fallback 1: anthropic:claude-sonnet-4-20250514
  └── Attempt 1 → Success! ✓

Result: Success (using fallback model)
```

## Combining with Retry
Retry settings apply per-model in the chain:
```typescript
relay
  .workflow("robust-extraction")
  .step("extract", {
    retry: { maxRetries: 2, backoffMs: 1000 } // each model gets 2 retries
  })
  .with("openai:gpt-4o")
  .fallback("anthropic:claude-sonnet-4-20250514")
  .fallback("openai:gpt-4o-mini")
  .prompt("Extract data from: {{input.document}}")
```

Each model in the fallback chain gets its own retry attempts. With `maxRetries: 2` and 3 models, you could have up to 9 total attempts (one initial attempt plus two retries per model) before final failure.

## What Triggers Fallback
Fallback is triggered when a model fails after exhausting its retries:
| Error Type | Behavior |
|---|---|
| Rate limit (429) | Retry with backoff, then fallback |
| Server error (5xx) | Retry with backoff, then fallback |
| Timeout | Immediate fallback (no retry) |
| Provider outage | Retry with backoff, then fallback |
| Auth error (401/403) | Immediate fallback (non-recoverable) |
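The policy in the table can be summarized as a small classifier. This is a sketch of the documented behavior, not SDK source; the function name and the `"timeout"` sentinel are assumptions for illustration.

```typescript
// Sketch of the failure-handling policy from the table above (illustrative only).
type FailureAction = "retry-then-fallback" | "immediate-fallback";

function classifyFailure(status: number | "timeout"): FailureAction {
  if (status === "timeout") return "immediate-fallback";             // timeouts skip retries
  if (status === 401 || status === 403) return "immediate-fallback"; // auth errors are non-recoverable
  if (status === 429 || status >= 500) return "retry-then-fallback"; // rate limits, server errors, outages
  return "retry-then-fallback";
}
```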
## Cross-Provider Fallback
Fallback models can be from any configured provider:
```typescript
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
    google: { apiKey: process.env.GOOGLE_API_KEY },
  }
});

// Mix providers in your fallback chain
relay
  .workflow("multi-provider")
  .step("generate")
  .with("openai:gpt-4o")                          // Primary: OpenAI
  .fallback("anthropic:claude-sonnet-4-20250514") // Fallback 1: Anthropic
  .fallback("google:gemini-1.5-pro")              // Fallback 2: Google
  .prompt("Generate a summary")
```

Cross-provider fallback provides maximum resilience: if one provider has an outage, your workflow automatically routes to another.
## Checking Which Model Was Used
The workflow result includes metadata about fallback usage:
```typescript
const result = await workflow.run(input);

// Check if a fallback was used
if (result.metadata.fallbackUsed) {
  console.log("Primary model failed, used:", result.metadata.model);
  console.log("Original model was:", result.metadata.originalModel);
  console.log("Fallback index:", result.metadata.fallbackIndex);
}
```

## Best Practices
- **Order by capability:** put your most capable model first, then progressively simpler models as fallbacks
- **Mix providers:** use different providers for true redundancy against outages
- **Consider cost:** fall back to cheaper models (e.g., gpt-4o → gpt-4o-mini) for cost efficiency
- **Test your chain:** verify all fallback models can handle your prompts and schemas
- **Monitor usage:** track which models are being used to detect provider issues early
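To make the last point concrete, here is one way to monitor fallback usage by aggregating the result metadata shown earlier. The metadata fields mirror those documented above; the aggregation helper itself is a hypothetical sketch, not part of the SDK.

```typescript
// Sketch: aggregate fallback usage across runs to spot provider issues early.
interface RunMetadata {
  fallbackUsed: boolean;
  model: string;
  originalModel?: string;
}

// Fraction of runs that needed a fallback model (0 when there are no runs).
function fallbackRate(runs: RunMetadata[]): number {
  if (runs.length === 0) return 0;
  const fallbacks = runs.filter((r) => r.fallbackUsed).length;
  return fallbacks / runs.length;
}
```

A sustained rise in this rate is a reasonable signal that your primary provider is degraded and worth alerting on.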
## Complete Example
```typescript
import { relay } from "@relayplane/sdk";
import { z } from "zod";

// Configure multiple providers
relay.configure({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
  }
});

// Define schema for structured extraction
const SentimentSchema = z.object({
  sentiment: z.enum(["positive", "negative", "neutral"]),
  confidence: z.number().min(0).max(1),
  keywords: z.array(z.string()),
});

// Build resilient workflow with fallbacks
const result = await relay
  .workflow("sentiment-analysis")
  .step("analyze", {
    schema: SentimentSchema,
    retry: { maxRetries: 2, backoffMs: 500 }
  })
  .with("openai:gpt-4o")
  .fallback("anthropic:claude-sonnet-4-20250514")
  .fallback("openai:gpt-4o-mini")
  .prompt(`Analyze the sentiment of this text:

Text: {{input.text}}

Return sentiment (positive/negative/neutral), confidence score, and key phrases.`)
  .run({ text: "I absolutely love this product! Best purchase ever." });

if (result.success) {
  console.log("Sentiment:", result.finalOutput.sentiment);
  console.log("Confidence:", result.finalOutput.confidence);
} else {
  console.error("All models failed:", result.error?.message);
}
```

See also: Retry Logic | Multi-Provider Support