Explainability

Human-readable explanations for every decision in every run.

Overview

Every request through RelayPlane generates a decision chain that can be reconstructed into a human-readable explanation. This is essential for:

Debugging — Why did a request fail or behave unexpectedly?
Compliance — Audit trail of all decisions
Learning — Understanding cost and performance patterns
Trust — AI agents explain their decisions to humans

Getting an Explanation

curl http://localhost:3001/v1/runs/run_xyz789/explain

{
  "run_id": "run_xyz789",
  "narrative": "Request allowed. Used claude-3-5-sonnet via anthropic (primary choice). Cost: $0.015. Latency: 2.5s.",
  "timeline": [
    {
      "stage": "auth",
      "outcome": "passed",
      "timestamp": "2026-02-06T12:00:00.005Z",
      "detail": "API auth verified. Agent: support-bot, Workspace: ws_production"
    },
    {
      "stage": "policy",
      "outcome": "passed",
      "timestamp": "2026-02-06T12:00:00.010Z",
      "detail": "3 policies evaluated. Daily budget: $15.00/$50.00 used (30%). All passed."
    },
    {
      "stage": "routing",
      "outcome": "selected",
      "timestamp": "2026-02-06T12:00:00.015Z",
      "detail": "claude-3-5-sonnet selected. Matched capabilities: [chat, tool_use]. Score: 0.92. Fallback available: gpt-4o"
    },
    {
      "stage": "provider",
      "outcome": "success",
      "timestamp": "2026-02-06T12:00:02.500Z",
      "detail": "anthropic responded in 2.5s. 150 tokens. $0.015."
    }
  ],
  "insights": [
    "This run used 30% of daily budget",
    "Primary model succeeded on first try",
    "Latency within normal range for claude-3-5-sonnet"
  ]
}
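
As a minimal sketch of consuming this response from Python, the helper below fetches an explanation with the standard library and renders the timeline as one line per stage. The names `fetch_explanation` and `format_timeline` are illustrative and not part of any RelayPlane SDK. The base URL is taken from the curl example, and any authentication headers your deployment may require are omitted.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:3001"  # assumption: same local address as the curl example


def fetch_explanation(run_id: str, base_url: str = BASE_URL) -> dict:
    """GET /v1/runs/{run_id}/explain and decode the JSON body."""
    with urlopen(f"{base_url}/v1/runs/{run_id}/explain") as resp:
        return json.load(resp)


def format_timeline(explanation: dict) -> list:
    """Render each timeline entry as 'stage: outcome (detail)'."""
    lines = []
    for entry in explanation.get("timeline", []):
        # 'detail' is treated as optional in this sketch.
        detail = entry.get("detail", "")
        suffix = f" ({detail})" if detail else ""
        lines.append(f"{entry['stage']}: {entry['outcome']}{suffix}")
    return lines
```

Against a running instance, `print("\n".join(format_timeline(fetch_explanation("run_xyz789"))))` would print one line per decision stage.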

Decision Timeline

Each run has a timeline of decision stages:

Stage	Outcomes	Description
`auth`	passed, failed	Authentication and authorization check
`policy`	passed, failed, warned	Policy evaluation (budgets, allowlists, etc.)
`routing`	selected, fallback	Model and provider selection
`provider`	success, error, timeout	Provider request and response
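
The table can also be encoded as a lookup, which is handy when a client wants to flag timeline entries it does not recognize. This is only a sketch: the mapping is copied from the table above, and the helper name `unexpected_entries` is hypothetical.

```python
# Allowed outcomes per stage, copied from the table above.
STAGE_OUTCOMES = {
    "auth": {"passed", "failed"},
    "policy": {"passed", "failed", "warned"},
    "routing": {"selected", "fallback"},
    "provider": {"success", "error", "timeout"},
}


def unexpected_entries(timeline: list) -> list:
    """Return timeline entries whose stage or outcome is not in the table."""
    return [
        entry
        for entry in timeline
        # An unknown stage maps to an empty set, so any outcome is flagged.
        if entry.get("outcome") not in STAGE_OUTCOMES.get(entry.get("stage"), set())
    ]
```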

Explaining Failed Runs

curl http://localhost:3001/v1/runs/run_failed123/explain

{
  "run_id": "run_failed123",
  "narrative": "Request blocked by policy. Daily budget exceeded ($52.00/$50.00).",
  "timeline": [
    {
      "stage": "auth",
      "outcome": "passed",
      "detail": "API auth verified"
    },
    {
      "stage": "policy",
      "outcome": "failed",
      "detail": "Blocked by 'Daily Budget Cap'. Current spend: $52.00, Limit: $50.00"
    }
  ],
  "blocking_decision": {
    "type": "policy",
    "policy_name": "Daily Budget Cap",
    "policy_type": "budget.per_day",
    "reason": "Budget exhausted"
  },
  "suggestions": [
    "Increase daily budget limit in workspace settings",
    "Wait until tomorrow for budget reset",
    "Use a lower-cost model for this request"
  ]
}
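
A client might turn the `blocking_decision` and `suggestions` fields into a short report for logs or alerts. The sketch below assumes the field names shown in the example response and assumes `blocking_decision` is absent when nothing blocked the run, which the successful example suggests but does not confirm. The function name `summarize_failure` is illustrative.

```python
from typing import Optional


def summarize_failure(explanation: dict) -> Optional[str]:
    """Return a short report for a blocked run, or None if nothing blocked it."""
    blocking = explanation.get("blocking_decision")
    if not blocking:
        return None
    # Fields other than 'type' and 'reason' may differ for non-policy blocks,
    # so they are read defensively.
    name = blocking.get("policy_name", "unknown")
    kind = blocking.get("policy_type", "unknown")
    lines = [
        f"Blocked at {blocking.get('type', 'unknown')} stage by '{name}' "
        f"({kind}): {blocking.get('reason', 'no reason given')}"
    ]
    for suggestion in explanation.get("suggestions", []):
        lines.append(f"  - {suggestion}")
    return "\n".join(lines)
```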

Comparing Runs

Compare two runs to understand differences:

curl -X POST http://localhost:3001/v1/runs/compare \
  -H "Content-Type: application/json" \
  -d '{"run_ids": ["run_123", "run_456"]}'

{
  "runs": ["run_123", "run_456"],
  "differences": [
    {
      "field": "model",
      "run_123": "claude-3-5-sonnet",
      "run_456": "gpt-4o",
      "type": "value_changed"
    },
    {
      "field": "latency_ms",
      "run_123": 2500,
      "run_456": 4200,
      "type": "value_changed",
      "delta": "+68%"
    },
    {
      "field": "cost_usd",
      "run_123": 0.015,
      "run_456": 0.025,
      "type": "value_changed",
      "delta": "+67%"
    }
  ],
  "summary": "run_456 used gpt-4o instead of claude-3-5-sonnet, resulting in 68% higher latency and 67% higher cost"
}
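
The `delta` values in this example are consistent with a relative change measured against the first run: (4200 - 2500) / 2500 = 68% and (0.025 - 0.015) / 0.015 ≈ 67%. That formula is inferred from the numbers rather than documented, so treat the following as a sketch for reproducing the figures client-side. The function name `percent_delta` is hypothetical.

```python
def percent_delta(baseline: float, other: float) -> str:
    """Relative change of `other` versus `baseline`, formatted like '+68%'."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    change = (other - baseline) / baseline * 100
    # '+' forces an explicit sign; '.0f' rounds to a whole percent.
    return f"{change:+.0f}%"
```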

Simulation

Test what would happen without making a real request:

curl -X POST http://localhost:3001/v1/simulate \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "ws_123",
    "agent_id": "agent_456",
    "model": "claude-3-opus",
    "estimated_tokens": 50000
  }'

{
  "would_succeed": false,
  "simulated_decisions": [
    {
      "stage": "policy",
      "outcome": "would_fail",
      "detail": "model.allowlist: claude-3-opus not in approved models"
    }
  ],
  "estimated_cost": 2.50,
  "recommendations": [
    "Use claude-3-5-sonnet instead (approved and 85% cheaper)",
    "Request model approval from workspace admin"
  ]
}

Simulation is well suited to pre-flight checks before expensive operations: it reports whether a request would pass and what it would cost, without calling a provider.
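
One way to wire such a pre-flight check into a script is sketched below. It posts the same payload as the curl example and collects the reasons a request would be blocked. The helper names `simulate` and `blocking_reasons` are illustrative, the base URL is the local address from the examples, and the response fields are assumed to match the sample above.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:3001"  # assumption: same local address as the curl example


def simulate(payload: dict, base_url: str = BASE_URL) -> dict:
    """POST the payload to /v1/simulate and decode the JSON response."""
    req = Request(
        f"{base_url}/v1/simulate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)


def blocking_reasons(simulation: dict) -> list:
    """Details of simulated decisions that would fail; empty if the run would succeed."""
    if simulation.get("would_succeed"):
        return []
    return [
        decision.get("detail", "")
        for decision in simulation.get("simulated_decisions", [])
        if decision.get("outcome") == "would_fail"
    ]
```

A caller could skip the real request whenever `blocking_reasons(simulate(payload))` is non-empty and surface the returned details instead.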

Debugging Tips

Check the timeline — Most issues are visible in the decision timeline
Compare with working runs — Use run comparison to spot differences
Check policy order — Policies evaluate in priority order
Verify provider health — Check if provider was degraded when run failed
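
As a starting point for the first tip, a small helper can scan a run's timeline and return the first stage that did not complete cleanly. Which outcomes count as clean is an assumption of this sketch: `passed`, `selected`, and `success` are treated as clean, while `warned` and `fallback` are flagged because they are usually worth a look. The name `first_problem` is hypothetical.

```python
from typing import Optional

# Assumption: outcomes that indicate a stage completed without incident.
CLEAN_OUTCOMES = {"passed", "selected", "success"}


def first_problem(timeline: list) -> Optional[dict]:
    """Return the first timeline entry with a non-clean outcome, or None."""
    for entry in timeline:
        if entry.get("outcome") not in CLEAN_OUTCOMES:
            return entry
    return None
```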