# Deployment Guide

Learn how to deploy RelayPlane workflows to production across different environments and platforms.
## Local Development

Before deploying to production, you'll want to develop and test workflows locally. RelayPlane runs entirely on your machine with no external dependencies required.
### Environment Setup

Create a `.env` file in your project root to store API keys and configuration:

```bash
# AI Provider API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=...

# Optional: RelayPlane Pro for telemetry
RELAYPLANE_API_KEY=rp-...

# Environment
NODE_ENV=development

# Logging
LOG_LEVEL=debug
```

Never commit your `.env` file to version control. Add it to your `.gitignore` file immediately after creating it.
### Running Workflows Locally

Use `tsx` or `ts-node` to run your workflows during development:

```bash
# Install dependencies
npm install

# Run a workflow file directly
npx tsx src/workflows/my-workflow.ts

# Or use ts-node
npx ts-node --esm src/workflows/my-workflow.ts

# With environment variables
npx tsx --env-file=.env src/workflows/my-workflow.ts
```

### Debugging with Console and Metadata
Use `console.log` and workflow metadata to debug issues:

```typescript
import { relay } from "@relayplane/workflows";

const result = await relay
  .workflow("debug-example")
  .step("analyze", {
    systemPrompt: "Analyze the input data",
  })
  .with("openai:gpt-4o")
  .run({
    data: "test input",
    // Enable debug metadata
    __debug: true,
  });

// Log full result structure
console.log("Result:", JSON.stringify(result, null, 2));

// Access step-specific outputs
console.log("Analyze output:", result.steps.analyze);

// Check execution metadata
console.log("Duration:", result.metadata?.duration);
console.log("Token usage:", result.metadata?.tokenUsage);
```

Pro tip: Set `LOG_LEVEL=debug` in your environment to see detailed execution traces including prompts, model responses, and timing information.
## Production Deployment

When deploying to production, security and reliability become critical. Follow these guidelines to ensure safe operation.
### Environment Variables

Store all sensitive configuration in environment variables, never in code:

```typescript
// Validate required environment variables at startup
function getRequiredEnv(key: string): string {
  const value = process.env[key];
  if (!value) {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

export const config = {
  // AI Providers
  openaiApiKey: getRequiredEnv("OPENAI_API_KEY"),
  anthropicApiKey: process.env.ANTHROPIC_API_KEY, // Optional

  // RelayPlane
  relayplaneApiKey: process.env.RELAYPLANE_API_KEY,

  // App config
  nodeEnv: process.env.NODE_ENV || "production",
  logLevel: process.env.LOG_LEVEL || "info",
};
```

### Security Considerations
- **Never commit API keys** - Use `.gitignore` and secret management services
- **Rotate keys regularly** - Set calendar reminders to rotate API keys quarterly
- **Use least privilege** - Create API keys with minimal required permissions
- **Audit access logs** - Monitor who accesses your AI provider accounts
- **Set spending limits** - Configure billing alerts and hard caps with your AI providers
API key exposure is a critical security incident. If you accidentally commit API keys, rotate them immediately and check your provider's usage logs for unauthorized access.
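To catch keys before they ever land in history, a simple scanner can run as a pre-commit hook. A minimal sketch; the prefix patterns below are assumptions based on common key formats, not an exhaustive list:

```typescript
// Hypothetical pre-commit check: flag lines that look like provider API keys.
// The regexes are illustrative assumptions (sk-, sk-ant-, rp- prefixes).
const KEY_PATTERNS: RegExp[] = [
  /sk-ant-[A-Za-z0-9_-]{20,}/, // Anthropic-style keys
  /sk-[A-Za-z0-9]{20,}/,       // OpenAI-style keys
  /rp-[A-Za-z0-9]{20,}/,       // RelayPlane-style keys (assumed format)
];

export function findLikelyKeys(content: string): string[] {
  const hits: string[] = [];
  for (const line of content.split("\n")) {
    if (KEY_PATTERNS.some((p) => p.test(line))) {
      hits.push(line.trim());
    }
  }
  return hits;
}
```

Wire this into a pre-commit hook that reads staged files and fails the commit when `findLikelyKeys` returns any matches; dedicated tools like gitleaks cover far more patterns if you want something battle-tested.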
### Logging Setup

Configure structured logging for production monitoring:

```typescript
import pino from "pino";

export const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
  // Add request context
  mixin() {
    return {
      service: "relayplane-workflows",
      environment: process.env.NODE_ENV,
    };
  },
});

// Usage in workflows
logger.info({ workflowName: "invoice-processor", duration: 1234 }, "Workflow completed");
logger.error({ error: err.message, stack: err.stack }, "Workflow failed");
```

## Docker Deployment
Docker provides consistent environments across development and production. Here's a production-ready setup.

### Example Dockerfile

```dockerfile
# Build stage
FROM node:20-alpine AS builder

WORKDIR /app

# Copy package files
COPY package*.json ./
COPY tsconfig.json ./

# Install all dependencies (dev dependencies are needed to compile TypeScript)
RUN npm ci

# Copy source code
COPY src ./src

# Build TypeScript
RUN npm run build

# Drop dev dependencies before node_modules is copied into the runtime image
RUN npm prune --omit=dev

# Production stage
FROM node:20-alpine AS runner

WORKDIR /app

# Create non-root user for security
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 relayplane
USER relayplane

# Copy built application
COPY --from=builder --chown=relayplane:nodejs /app/dist ./dist
COPY --from=builder --chown=relayplane:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=relayplane:nodejs /app/package.json ./

# Set environment
ENV NODE_ENV=production

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "console.log('healthy')" || exit 1

CMD ["node", "dist/index.js"]
```

### Docker Compose for Multi-Service
Use Docker Compose to orchestrate workflows with databases and queues:

```yaml
version: '3.8'

services:
  workflows:
    build: .
    environment:
      - NODE_ENV=production
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:password@db:5432/relayplane
    depends_on:
      - redis
      - db
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: '1'

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: relayplane
      POSTGRES_PASSWORD: password
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

volumes:
  redis_data:
  postgres_data:
```

### Container Best Practices
- **Use multi-stage builds** - Reduce image size by separating build and runtime
- **Run as non-root** - Create dedicated users for better security
- **Pin versions** - Use specific image tags, not `latest`
- **Add health checks** - Enable orchestrators to detect unhealthy containers
- **Set resource limits** - Prevent runaway containers from affecting other services
## Serverless (Lambda/Vercel)

Serverless platforms are great for sporadic workflow execution but require special considerations for AI workloads.
### Cold Start Considerations

Cold starts can add 500ms-2s to workflow execution. Minimize impact with these strategies:

```typescript
import { relay } from "@relayplane/workflows";
import type { APIGatewayEvent } from "aws-lambda";

// Initialize outside the handler to persist across invocations
const workflowConfig = {
  // Pre-configure providers to reduce cold start time
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
  },
};

// Lambda handler
export async function handler(event: APIGatewayEvent) {
  const input = JSON.parse(event.body || "{}");

  const result = await relay
    .workflow("serverless-workflow")
    .step("process")
    .with("openai:gpt-4o-mini") // Use faster models for serverless
    .run(input);

  return {
    statusCode: 200,
    body: JSON.stringify(result),
  };
}
```

### Timeout Configuration
AI API calls are slow! LLM responses typically take 2-30 seconds depending on the model and prompt complexity. Configure generous timeouts to avoid premature termination.
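Beyond platform limits, you can also guard individual calls in application code so a hung provider request fails fast instead of burning the whole function budget. A minimal sketch; the `withTimeout` helper is illustrative, not part of the RelayPlane API:

```typescript
// Race a promise against a timer; reject with a descriptive error on timeout.
export async function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label = "operation"
): Promise<T> {
  let timer: NodeJS.Timeout | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    );
  });
  try {
    // Whichever settles first wins; the timer is cleared either way.
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

Usage might look like `await withTimeout(relay.workflow("w").step("s").with("openai:gpt-4o").run(input), 55_000, "workflow")`, leaving a few seconds of headroom below the platform's `maxDuration`.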
1{2 "functions": {3 "api/workflow.ts": {4 "maxDuration": 60,5 "memory": 10246 }7 }8}1functions:2 processWorkflow:3 handler: src/api/workflow.handler4 timeout: 60 # 60 seconds for AI calls5 memorySize: 10246 environment:7 OPENAI_API_KEY: ${env:OPENAI_API_KEY}8 ANTHROPIC_API_KEY: ${env:ANTHROPIC_API_KEY}Edge Functions Limitations
Edge functions (Vercel Edge, Cloudflare Workers) have restrictions that affect AI workflows:
- **CPU time limits** - Edge functions typically limit CPU time to 50ms-30s
- **No Node.js APIs** - The edge runtime lacks `fs`, `child_process`, and other Node modules
- **Memory constraints** - Some platforms limit memory to 128MB
- **Use serverless instead** - For AI workflows, prefer traditional serverless functions
Recommendation: Use edge functions only for lightweight tasks like request routing. Run actual AI workflows in serverless functions or containers.
## Monitoring & Observability

Production workflows need comprehensive monitoring to track performance, costs, and errors.
### Telemetry Integration with RelayPlane Pro

RelayPlane Pro provides built-in telemetry for all your workflows:

```typescript
import { relay } from "@relayplane/workflows";

// Enable telemetry with your API key
relay.configure({
  apiKey: process.env.RELAYPLANE_API_KEY,
  telemetry: {
    enabled: true,
    // Send traces for debugging
    traces: true,
    // Track token usage for cost monitoring
    tokenUsage: true,
    // Include custom metadata
    metadata: {
      environment: process.env.NODE_ENV,
      version: process.env.APP_VERSION,
    },
  },
});

// Telemetry is automatically collected for all workflows
const result = await relay
  .workflow("monitored-workflow")
  .step("process")
  .with("openai:gpt-4o")
  .run(input);

// View traces and metrics at dashboard.relayplane.com
```

### Key Metrics to Track
Monitor these metrics for healthy production workflows:
- **Execution duration** - Track p50, p95, p99 latencies per workflow
- **Token usage** - Monitor input/output tokens per model for cost control
- **Error rate** - Track failures by workflow, step, and error type
- **Throughput** - Workflows executed per minute/hour
- **Cost per workflow** - Calculate from token usage and provider pricing
- **Rate limit hits** - Track when you're hitting provider limits
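Cost per workflow can be derived directly from token counts. A sketch with placeholder prices; the per-million-token figures below are illustrative assumptions, not current provider pricing, so load real rates from your provider's price sheet:

```typescript
// Assumed USD prices per 1M tokens, for illustration only.
const PRICE_PER_1M_TOKENS: Record<string, { input: number; output: number }> = {
  "openai:gpt-4o": { input: 2.5, output: 10 },
  "openai:gpt-4o-mini": { input: 0.15, output: 0.6 },
};

export function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const price = PRICE_PER_1M_TOKENS[model];
  if (!price) throw new Error(`No pricing configured for ${model}`);
  return (
    (inputTokens / 1_000_000) * price.input +
    (outputTokens / 1_000_000) * price.output
  );
}
```

Record the result alongside each workflow's metrics so cost shows up in the same dashboards as latency and errors.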
The metrics above can be instrumented with OpenTelemetry:

```typescript
import { metrics } from "@opentelemetry/api";

const meter = metrics.getMeter("relayplane-workflows");

// Create metrics
const workflowDuration = meter.createHistogram("workflow.duration", {
  description: "Workflow execution duration in milliseconds",
  unit: "ms",
});

const tokenUsage = meter.createCounter("workflow.tokens", {
  description: "Total tokens used",
});

const workflowErrors = meter.createCounter("workflow.errors", {
  description: "Total workflow errors",
});

// Record metrics after workflow execution
export function recordWorkflowMetrics(result: WorkflowResult) {
  workflowDuration.record(result.metadata.duration, {
    workflow: result.workflowName,
  });

  tokenUsage.add(result.metadata.tokenUsage.total, {
    workflow: result.workflowName,
    model: result.metadata.model,
  });

  if (result.error) {
    workflowErrors.add(1, {
      workflow: result.workflowName,
      errorType: result.error.name,
    });
  }
}
```

### Alerting Recommendations
Set up alerts for these critical conditions:
- **Error rate > 5%** - Investigate immediately; may indicate model issues
- **p95 latency > 30s** - Workflows taking too long; consider optimization
- **Daily cost > budget** - Prevent unexpected billing surprises
- **Rate limits hit** - Need to implement queuing or request higher limits
- **Token usage spike** - May indicate prompt injection or infinite loops
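These checks can start as a simple in-process evaluator before you adopt a full alerting stack. A sketch covering three of the conditions above; the thresholds and the `WindowStats` shape are assumptions, not a RelayPlane API:

```typescript
// Aggregated stats for some time window (shape assumed for illustration).
interface WindowStats {
  total: number;       // workflows executed
  errors: number;      // failed workflows
  p95LatencyMs: number;
  dailyCostUsd: number;
}

// Returns the names of alerts that fired; in production this would page someone.
export function evaluateAlerts(stats: WindowStats, budgetUsd: number): string[] {
  const fired: string[] = [];
  if (stats.total > 0 && stats.errors / stats.total > 0.05) {
    fired.push("error-rate>5%");
  }
  if (stats.p95LatencyMs > 30_000) {
    fired.push("p95-latency>30s");
  }
  if (stats.dailyCostUsd > budgetUsd) {
    fired.push("daily-cost>budget");
  }
  return fired;
}
```

Run it on a schedule against your metrics store and forward any fired alerts to your paging or chat integration.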
## Scaling Considerations

As workflow volume grows, you'll need strategies to handle load while respecting AI provider limits.
### Rate Limiting Strategies

AI providers enforce rate limits on requests per minute (RPM) and tokens per minute (TPM). Implement client-side limiting:

```typescript
import Bottleneck from "bottleneck";

// Create limiters for each provider
const openaiLimiter = new Bottleneck({
  maxConcurrent: 10, // Max parallel requests
  minTime: 100, // Min 100ms between requests (600 RPM)
});

const anthropicLimiter = new Bottleneck({
  maxConcurrent: 5,
  minTime: 200, // 300 RPM
});

// Wrap workflow execution with rate limiting
export async function executeWithRateLimit<T>(
  provider: string,
  fn: () => Promise<T>
): Promise<T> {
  const limiter = provider === "openai" ? openaiLimiter : anthropicLimiter;
  return limiter.schedule(fn);
}

// Usage
const result = await executeWithRateLimit("openai", () =>
  relay.workflow("my-workflow").step("process").with("openai:gpt-4o").run(input)
);
```
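Bottleneck handles request rates, but TPM limits also require token accounting. A rough sliding-window budget, as a sketch; you supply a per-request token estimate yourself, and the class name and polling interval are illustrative choices:

```typescript
// Tracks tokens spent in a rolling 60s window and blocks until a new
// request's estimated tokens fit under the TPM limit.
export class TokenBudget {
  private spent: { at: number; tokens: number }[] = [];

  constructor(private tpmLimit: number) {}

  private usedInWindow(now: number): number {
    // Drop entries older than 60 seconds, then sum the rest.
    this.spent = this.spent.filter((e) => now - e.at < 60_000);
    return this.spent.reduce((sum, e) => sum + e.tokens, 0);
  }

  // Wait until `estimatedTokens` fits in the window, then record the spend.
  async acquire(estimatedTokens: number): Promise<void> {
    while (this.usedInWindow(Date.now()) + estimatedTokens > this.tpmLimit) {
      await new Promise((r) => setTimeout(r, 250));
    }
    this.spent.push({ at: Date.now(), tokens: estimatedTokens });
  }
}
```

Call `await budget.acquire(estimate)` before each workflow run, combining it with the request-rate limiter above so both RPM and TPM limits are respected.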
### Queue-Based Processing

For high-volume workloads, use a queue to decouple ingestion from processing:

```typescript
import { Queue, Worker } from "bullmq";
import { relay } from "@relayplane/workflows";

// Create queue
const workflowQueue = new Queue("workflows", {
  connection: { host: "localhost", port: 6379 },
  defaultJobOptions: {
    attempts: 3,
    backoff: {
      type: "exponential",
      delay: 5000,
    },
  },
});

// Add jobs to queue
export async function enqueueWorkflow(
  workflowName: string,
  input: Record<string, unknown>
) {
  await workflowQueue.add(workflowName, { workflowName, input });
}

// Process jobs with concurrency control
const worker = new Worker(
  "workflows",
  async (job) => {
    const { workflowName, input } = job.data;

    const result = await relay
      .workflow(workflowName)
      .step("process")
      .with("openai:gpt-4o")
      .run(input);

    return result;
  },
  {
    connection: { host: "localhost", port: 6379 },
    concurrency: 5, // Process 5 workflows in parallel
  }
);

worker.on("completed", (job) => {
  console.log(`Workflow ${job.id} completed`);
});

worker.on("failed", (job, err) => {
  console.error(`Workflow ${job?.id} failed:`, err);
});
```

### Batch Workflows
Process multiple items efficiently with batch workflows:

```typescript
import { relay } from "@relayplane/workflows";

// Batch process items with controlled concurrency
async function processBatch(items: string[], batchSize = 10) {
  const results = [];

  // Process in batches
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);

    // Process batch items in parallel
    const batchResults = await Promise.all(
      batch.map((item) =>
        relay
          .workflow("batch-processor")
          .step("process")
          .with("openai:gpt-4o-mini") // Use faster model for batches
          .run({ item })
      )
    );

    results.push(...batchResults);

    // Optional: Add delay between batches to stay under rate limits
    if (i + batchSize < items.length) {
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }

  return results;
}

// Usage
const items = ["item1", "item2", "item3", /* ... hundreds more */];
const results = await processBatch(items, 10);
```

Cost optimization: For batch processing, use `gpt-4o-mini` or `claude-3-haiku`, which are 10-20x cheaper than larger models. Reserve powerful models for complex tasks.
## Next Steps

- **Quickstart Guide** - Get your first workflow running in minutes
- **AI Providers** - Configure OpenAI, Anthropic, and other providers
- **Invoice Processor Example** - See a complete production workflow
- **API Reference** - Full documentation for the workflow API