Explainability
Human-readable explanations for every decision in every run.
Overview
Every request through RelayPlane generates a decision chain that can be reconstructed into a human-readable explanation. This is essential for:
- Debugging — Why did a request fail or behave unexpectedly?
- Compliance — Audit trail of all decisions
- Learning — Understanding cost and performance patterns
- Trust — AI agents explain their decisions to humans
Getting an Explanation
```
curl http://localhost:3001/v1/runs/run_xyz789/explain

{
  "run_id": "run_xyz789",
  "narrative": "Request allowed. Used claude-3-5-sonnet via anthropic (primary choice). Cost: $0.015. Latency: 2.5s.",
  "timeline": [
    {
      "stage": "auth",
      "outcome": "passed",
      "timestamp": "2026-02-06T12:00:00.005Z",
      "detail": "API auth verified. Agent: support-bot, Workspace: ws_production"
    },
    {
      "stage": "policy",
      "outcome": "passed",
      "timestamp": "2026-02-06T12:00:00.010Z",
      "detail": "3 policies evaluated. Daily budget: $15.00/$50.00 used (30%). All passed."
    },
    {
      "stage": "routing",
      "outcome": "selected",
      "timestamp": "2026-02-06T12:00:00.015Z",
      "detail": "claude-3-5-sonnet selected. Matched capabilities: [chat, tool_use]. Score: 0.92. Fallback available: gpt-4o"
    },
    {
      "stage": "provider",
      "outcome": "success",
      "timestamp": "2026-02-06T12:00:02.500Z",
      "detail": "anthropic responded in 2.5s. 150 tokens. $0.015."
    }
  ],
  "insights": [
    "This run used 30% of daily budget",
    "Primary model succeeded on first try",
    "Latency within normal range for claude-3-5-sonnet"
  ]
}
```

Decision Timeline
Each run has a timeline of decision stages:
| Stage | Outcomes | Description |
|---|---|---|
| auth | passed, failed | Authentication and authorization check |
| policy | passed, failed, warned | Policy evaluation (budgets, allowlists, etc.) |
| routing | selected, fallback | Model and provider selection |
| provider | success, error, timeout | Provider request and response |
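
The outcomes in the table above can be used to locate the stage that stopped a run. The following is a minimal sketch, not part of RelayPlane: the helper name `first_blocking_stage` and the constant `BLOCKING_OUTCOMES` are hypothetical, and treating `warned` and `fallback` as non-blocking is an assumption.

```python
# Outcomes from the stage table that are assumed to stop a run.
# "warned" and "fallback" are treated as non-blocking (an assumption).
BLOCKING_OUTCOMES = {"failed", "error", "timeout"}


def first_blocking_stage(timeline):
    """Return the first timeline entry with a blocking outcome, or None."""
    for entry in timeline:
        if entry["outcome"] in BLOCKING_OUTCOMES:
            return entry
    return None


# Timeline entries shaped like the "timeline" array of an explain response.
sample_timeline = [
    {"stage": "auth", "outcome": "passed", "detail": "API auth verified"},
    {
        "stage": "policy",
        "outcome": "failed",
        "detail": "Blocked by 'Daily Budget Cap'. Current spend: $52.00, Limit: $50.00",
    },
]

blocked = first_blocking_stage(sample_timeline)
if blocked is not None:
    print(f"Run stopped at stage '{blocked['stage']}': {blocked['detail']}")
```

Because stages appear in evaluation order, the first blocking entry is also the last entry of a failed run's timeline in the examples shown in this document.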
Explaining Failed Runs
```
curl http://localhost:3001/v1/runs/run_failed123/explain

{
  "run_id": "run_failed123",
  "narrative": "Request blocked by policy. Daily budget exceeded ($52.00/$50.00).",
  "timeline": [
    {
      "stage": "auth",
      "outcome": "passed",
      "detail": "API auth verified"
    },
    {
      "stage": "policy",
      "outcome": "failed",
      "detail": "Blocked by 'Daily Budget Cap'. Current spend: $52.00, Limit: $50.00"
    }
  ],
  "blocking_decision": {
    "type": "policy",
    "policy_name": "Daily Budget Cap",
    "policy_type": "budget.per_day",
    "reason": "Budget exhausted"
  },
  "suggestions": [
    "Increase daily budget limit in workspace settings",
    "Wait until tomorrow for budget reset",
    "Use a lower-cost model for this request"
  ]
}
```

Comparing Runs
Compare two runs to understand differences:
```
curl -X POST http://localhost:3001/v1/runs/compare \
  -H "Content-Type: application/json" \
  -d '{"run_ids": ["run_123", "run_456"]}'

{
  "runs": ["run_123", "run_456"],
  "differences": [
    {
      "field": "model",
      "run_123": "claude-3-5-sonnet",
      "run_456": "gpt-4o",
      "type": "value_changed"
    },
    {
      "field": "latency_ms",
      "run_123": 2500,
      "run_456": 4200,
      "type": "value_changed",
      "delta": "+68%"
    },
    {
      "field": "cost_usd",
      "run_123": 0.015,
      "run_456": 0.025,
      "type": "value_changed",
      "delta": "+67%"
    }
  ],
  "summary": "run_456 used gpt-4o instead of claude-3-5-sonnet, resulting in 68% higher latency and 67% higher cost"
}
```

Simulation
Test what would happen without making a real request:
```
curl -X POST http://localhost:3001/v1/simulate \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "ws_123",
    "agent_id": "agent_456",
    "model": "claude-3-opus",
    "estimated_tokens": 50000
  }'

{
  "would_succeed": false,
  "simulated_decisions": [
    {
      "stage": "policy",
      "outcome": "would_fail",
      "detail": "model.allowlist: claude-3-opus not in approved models"
    }
  ],
  "estimated_cost": 2.50,
  "recommendations": [
    "Use claude-3-5-sonnet instead (approved and 85% cheaper)",
    "Request model approval from workspace admin"
  ]
}
```

Simulation is perfect for pre-flight checks before expensive operations.
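
A pre-flight check could be built on the simulation response roughly as follows. This is a sketch under stated assumptions: the `preflight` helper is hypothetical (not a RelayPlane API), it works on an already-parsed response dictionary shaped like the example above, and it assumes that `would_fail` is the only outcome that should block the real request.

```python
def preflight(simulation):
    """Summarize a parsed simulate response as (ok, reasons, recommendations).

    Hypothetical helper: field names follow the sample response in this
    document; "would_fail" is assumed to be the only blocking outcome.
    """
    reasons = [
        f"{decision['stage']}: {decision['detail']}"
        for decision in simulation.get("simulated_decisions", [])
        if decision.get("outcome") == "would_fail"
    ]
    ok = bool(simulation.get("would_succeed")) and not reasons
    return ok, reasons, simulation.get("recommendations", [])


# Parsed copy of the sample simulate response.
sample_simulation = {
    "would_succeed": False,
    "simulated_decisions": [
        {
            "stage": "policy",
            "outcome": "would_fail",
            "detail": "model.allowlist: claude-3-opus not in approved models",
        }
    ],
    "estimated_cost": 2.50,
    "recommendations": [
        "Use claude-3-5-sonnet instead (approved and 85% cheaper)",
        "Request model approval from workspace admin",
    ],
}

ok, reasons, recommendations = preflight(sample_simulation)
if not ok:
    print("Skipping real request:")
    for reason in reasons:
        print(f"  blocked by {reason}")
    for tip in recommendations:
        print(f"  suggestion: {tip}")
```

Returning the reasons and recommendations alongside the boolean lets the caller log why a request was skipped instead of silently dropping it.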
Debugging Tips
- Check the timeline — Most issues are visible in the decision timeline
- Compare with working runs — Use run comparison to spot differences
- Check policy order — Policies evaluate in priority order
- Verify provider health — Check if provider was degraded when run failed
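
When comparing a failing or slow run against a working one, it can help to compute relative differences locally. The sketch below is illustrative only: `percent_delta` and `numeric_deltas` are hypothetical helpers, and the formula (change relative to the first run, rounded to a whole percent) is an assumption that happens to match the `delta` values in the comparison example.

```python
def percent_delta(baseline, other):
    """Relative change from baseline to other, formatted like '+68%'."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    change = (other - baseline) / baseline * 100
    return f"{change:+.0f}%"


def numeric_deltas(run_a, run_b):
    """Map each numeric field that differs between two runs to its delta."""
    deltas = {}
    for field, value_a in run_a.items():
        value_b = run_b.get(field)
        both_numeric = isinstance(value_a, (int, float)) and isinstance(
            value_b, (int, float)
        )
        if both_numeric and value_a != value_b:
            deltas[field] = percent_delta(value_a, value_b)
    return deltas


# Values taken from the comparison example (run_123 vs. run_456).
working_run = {"model": "claude-3-5-sonnet", "latency_ms": 2500, "cost_usd": 0.015}
slow_run = {"model": "gpt-4o", "latency_ms": 4200, "cost_usd": 0.025}

print(numeric_deltas(working_run, slow_run))
```

Non-numeric fields such as `model` are skipped here; they would be reported as plain value changes rather than percentages.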