Pillar 1 · Observe · Shipped
See everything, per agent.
Every LLM request flows through the proxy with full attribution: cost, model, task type, tokens, and latency, all live and all namespaced. The per-agent breakdown is keyed on a fingerprint of the system prompt, so no annotation work is required.
- Per-tenant and per-agent cost tracking
- Cache-aware accounting (Anthropic prompt caching)
- Tamper-proof, exportable audit trail
- Retention: 7 days on Free, 30 days on Starter, 90 days on Pro, unlimited on Max
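
As a rough illustration of fingerprint-based attribution with cache-aware accounting, here is a minimal Python sketch. It is not RelayPlane's implementation: the hashing scheme, the `fingerprint`, `request_cost`, and `cost_by_agent` helpers, the request record shape, and the prices are all assumptions made for the example. The usage field names follow the usage object returned by the Anthropic Messages API.

```python
import hashlib
from collections import defaultdict

# Illustrative prices in USD per million tokens (assumed, not real rates).
PRICE_PER_MTOK = {
    "input": 3.00,
    "cache_write": 3.75,
    "cache_read": 0.30,
    "output": 15.00,
}


def fingerprint(system_prompt: str) -> str:
    """Derive a short, stable agent ID from the system prompt (assumed scheme)."""
    normalized = " ".join(system_prompt.split())  # collapse whitespace differences
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def request_cost(usage: dict) -> float:
    """Price one request, charging cache writes and cache reads at their own rates."""
    micro_usd = (
        usage.get("input_tokens", 0) * PRICE_PER_MTOK["input"]
        + usage.get("cache_creation_input_tokens", 0) * PRICE_PER_MTOK["cache_write"]
        + usage.get("cache_read_input_tokens", 0) * PRICE_PER_MTOK["cache_read"]
        + usage.get("output_tokens", 0) * PRICE_PER_MTOK["output"]
    )
    return micro_usd / 1_000_000


def cost_by_agent(requests: list[dict]) -> dict[tuple[str, str], float]:
    """Sum cost per (tenant, agent fingerprint) pair."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for req in requests:
        key = (req["tenant"], fingerprint(req["system_prompt"]))
        totals[key] += request_cost(req["usage"])
    return dict(totals)
```

Because the key comes from the prompt text itself, two requests with the same system prompt roll up under the same agent without any tagging by the caller.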
Pillar 2 · Govern · Shipping in pieces
Hard budget caps. Anomaly detection.
Daily, hourly, and per-request budget caps, each with a block, downgrade, warn, or alert action. Velocity spikes, repetition loops, and token explosions are detected over a sliding window of the last 100 requests. A multi-tenant kill-switch endpoint is next on the roadmap.
- Budget caps, configurable action per breach (live)
- Anomaly detection across the 100-req window (live)
- Credential pool, round-robin across keys (live)
- Quota-aware fail-over before 429s (live, Pro tier)
- Tenant pause via HTTP endpoint (spec, next 2 weeks)
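
To make the cap-and-detect idea concrete, here is a minimal Python sketch of a budget cap with a configurable breach action and a sliding-window detector. The class names, thresholds, and return values are hypothetical, chosen for the example, and not RelayPlane's API. Velocity-spike detection is left out because it would need request timestamps.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BudgetCap:
    limit_usd: float
    action: str  # one of "block", "downgrade", "warn", "alert"
    spent_usd: float = 0.0

    def check(self, estimated_cost_usd: float) -> str:
        """Return "allow", or the configured action if the request would breach the cap."""
        if self.spent_usd + estimated_cost_usd > self.limit_usd:
            return self.action
        return "allow"

    def record(self, actual_cost_usd: float) -> None:
        """Add a completed request's cost to the running total."""
        self.spent_usd += actual_cost_usd


@dataclass
class SlidingWindowDetector:
    window_size: int = 100            # matches the 100-request window
    repetition_threshold: int = 5     # assumed: identical prompts before flagging
    token_spike_factor: float = 10.0  # assumed: multiple of the window mean
    _window: deque = field(default_factory=deque)

    def observe(self, prompt_hash: str, tokens: int) -> list[str]:
        """Compare a request against the window, then add it to the window."""
        flags = []
        if self._window:
            repeats = sum(1 for h, _ in self._window if h == prompt_hash)
            if repeats + 1 >= self.repetition_threshold:
                flags.append("repetition_loop")
            mean_tokens = sum(t for _, t in self._window) / len(self._window)
            if mean_tokens > 0 and tokens > self.token_spike_factor * mean_tokens:
                flags.append("token_explosion")
        self._window.append((prompt_hash, tokens))
        if len(self._window) > self.window_size:
            self._window.popleft()  # drop the oldest request
        return flags
```

In this sketch a proxy would call `check` before forwarding a request, act on anything other than "allow", and call `record` once the real cost is known.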
Pillar 3 · Verify · On the roadmap
Spec-match before it ships.
Before an agent marks a task done, RelayPlane will score the diff and acceptance criteria with a cheap judge model. Failing tasks retry. Today this lives as a separate evaluator in the clawd pipeline. Moving it into RelayPlane is on deck.
- Per-criterion pass / fail with evidence
- Blocker, major, minor severity weighting
- Judge model configurable, Haiku by default
- Currently runs in clawd, RelayPlane integration on the roadmap
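
One way severity weighting could combine per-criterion results into a single verdict is sketched below in Python. The weights, the pass threshold, the rule that any failed blocker fails the task, and the `CriterionResult` and `verdict` names are all assumptions for illustration. The judge-model call that would produce the results is not shown.

```python
from dataclasses import dataclass

# Assumed weights; the real weighting is not specified.
SEVERITY_WEIGHT = {"blocker": 5.0, "major": 3.0, "minor": 1.0}


@dataclass
class CriterionResult:
    criterion: str
    severity: str  # "blocker", "major", or "minor"
    passed: bool
    evidence: str  # judge-supplied justification for the pass or fail


def verdict(results: list[CriterionResult], pass_threshold: float = 0.8) -> dict:
    """Weighted pass score, with any failed blocker forcing an overall fail."""
    total = sum(SEVERITY_WEIGHT[r.severity] for r in results)
    earned = sum(SEVERITY_WEIGHT[r.severity] for r in results if r.passed)
    score = earned / total if total else 1.0
    blocker_failed = any(r.severity == "blocker" and not r.passed for r in results)
    return {
        "score": score,
        "passed": (not blocker_failed) and score >= pass_threshold,
        "failed_criteria": [r.criterion for r in results if not r.passed],
    }
```

A task whose verdict has `passed` set to false would be the one sent back for a retry, with `failed_criteria` and the recorded evidence available as feedback.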