Savings
How RelayPlane measures the money you save through intelligent model routing.
How Savings Are Calculated
Savings are computed by comparing what you actually paid with what you would have paid if every request had gone to the most expensive model. The baseline model is Claude Opus 4 ($15 per 1M input tokens, $75 per 1M output tokens).
    // For each request:
    const baselineCost = estimateCost("claude-opus-4-6", tokensIn, tokensOut);
    const actualCost = estimateCost(actualModel, tokensIn, tokensOut);
    const saved = Math.max(0, baselineCost - actualCost);

    // Overall savings percentage:
    const savingsPercent = (totalSaved / totalBaselineCost) * 100;

Example
A typical coding session with 100 requests might break down like:
- 70 simple tasks → routed to Sonnet ($3/$15 per 1M) instead of Opus ($15/$75)
- 20 moderate tasks → routed to Sonnet
- 10 complex tasks → routed to Opus (no savings, but quality preserved)
Result: ~60-80% cost reduction on the 90 non-complex tasks while maintaining full quality on the 10 complex ones.
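The arithmetic behind this breakdown can be sketched as follows. This is a minimal sketch, not RelayPlane's implementation: the `PRICES` table, this version of `estimateCost`, the `summarizeSavings` helper, and the assumption that every request uses 2,000 input and 500 output tokens are all illustrative. The prices are the per-1M-token figures quoted above.

```javascript
// Illustrative per-1M-token prices (USD), taken from the figures quoted above.
// The table and both helpers are assumptions for this sketch, not RelayPlane's code.
const PRICES = {
  "claude-opus-4-6": { input: 15, output: 75 },
  "claude-sonnet-4-6": { input: 3, output: 15 },
};

// Cost in USD of one request on the given model.
function estimateCost(model, tokensIn, tokensOut) {
  const price = PRICES[model];
  if (!price) throw new Error(`No price for model: ${model}`);
  return (tokensIn * price.input + tokensOut * price.output) / 1_000_000;
}

// Apply the per-request formula to every request and total the results.
function summarizeSavings(requests, baselineModel = "claude-opus-4-6") {
  let totalSaved = 0;
  let totalBaselineCost = 0;
  for (const { model, tokensIn, tokensOut } of requests) {
    const baselineCost = estimateCost(baselineModel, tokensIn, tokensOut);
    const actualCost = estimateCost(model, tokensIn, tokensOut);
    totalSaved += Math.max(0, baselineCost - actualCost);
    totalBaselineCost += baselineCost;
  }
  const savingsPercent =
    totalBaselineCost === 0 ? 0 : (totalSaved / totalBaselineCost) * 100;
  return { totalSaved, totalBaselineCost, savingsPercent };
}

// Hypothetical session: 90 requests routed to Sonnet, 10 to Opus, all the same size.
const makeRequests = (count, model) =>
  Array.from({ length: count }, () => ({ model, tokensIn: 2000, tokensOut: 500 }));

const session = [
  ...makeRequests(90, "claude-sonnet-4-6"),
  ...makeRequests(10, "claude-opus-4-6"),
];

const summary = summarizeSavings(session);
console.log(`Session savings: ${summary.savingsPercent.toFixed(1)}%`);
```

Under these assumptions each Sonnet request costs 80% less than the baseline, and the session as a whole saves about 72%. If the complex requests use more tokens than the simple ones, the overall percentage is lower.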
Viewing Savings
Check your savings via the API:
    curl http://localhost:4100/v1/telemetry/savings

Response:

    {
      "total": 12.5400,
      "actualCost": 3.2100,
      "savings": 9.3300,
      "savedAmount": 9.3300,
      "percentage": 74,
      "byDay": [
        {
          "date": "2025-02-24",
          "savedAmount": 4.1200,
          "originalCost": 5.8900,
          "actualCost": 1.7700
        }
      ]
    }

Per-Request Savings
Each run in the /v1/telemetry/runs endpoint includes per-request savings:
    {
      "model": "claude-sonnet-4-6",
      "original_model": "claude-opus-4-6",
      "costUsd": 0.0042,
      "savings": 0.0168,
      "complexity": "simple"
    }

Local Stats Summary
The telemetry module also provides a local stats summary with per-model and per-task-type breakdowns:
- totalCost: What you actually spent
- baselineCost: What you would have spent on Opus
- savings: The difference
- savingsPercent: Percentage saved
- byModel: Breakdown per model with count, cost, and baseline
- byTaskType: Breakdown per inferred task type
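As an illustration of how these fields relate to one another, the sketch below aggregates run records shaped like the per-request example above. It is not the telemetry module's code: the `summarizeRuns` name and the `taskType` field are hypothetical, and each run's baseline is assumed to equal `costUsd + savings`.

```javascript
// Hypothetical aggregation over run records. The function name, the `taskType`
// field, and the baseline = costUsd + savings rule are assumptions for illustration.
function summarizeRuns(runs) {
  const stats = {
    totalCost: 0,
    baselineCost: 0,
    savings: 0,
    savingsPercent: 0,
    byModel: {},
    byTaskType: {},
  };

  // Add one run to a breakdown bucket, creating the entry on first use.
  const addTo = (bucket, key, cost, baseline) => {
    if (!bucket[key]) bucket[key] = { count: 0, cost: 0, baseline: 0 };
    bucket[key].count += 1;
    bucket[key].cost += cost;
    bucket[key].baseline += baseline;
  };

  for (const run of runs) {
    const baseline = run.costUsd + run.savings;
    stats.totalCost += run.costUsd;
    stats.baselineCost += baseline;
    addTo(stats.byModel, run.model, run.costUsd, baseline);
    addTo(stats.byTaskType, run.taskType || "unknown", run.costUsd, baseline);
  }

  stats.savings = stats.baselineCost - stats.totalCost;
  stats.savingsPercent =
    stats.baselineCost > 0 ? (stats.savings / stats.baselineCost) * 100 : 0;
  return stats;
}

// Two made-up runs: one routed to Sonnet, one kept on Opus.
const stats = summarizeRuns([
  { model: "claude-sonnet-4-6", costUsd: 0.0042, savings: 0.0168, taskType: "code_generation" },
  { model: "claude-opus-4-6", costUsd: 0.021, savings: 0, taskType: "analysis" },
]);
console.log(`Saved ${stats.savingsPercent.toFixed(1)}% across ${Object.keys(stats.byModel).length} models`);
```

Keeping `cost` and `baseline` side by side in every bucket means a per-model or per-task-type savings percentage can be derived later without revisiting the individual runs.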
Visit http://localhost:4100/dashboard for a real-time visual savings tracker that updates every 5 seconds.
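For scripted access instead of the dashboard, the /v1/telemetry/savings response shown earlier can be read and cross-checked along these lines. This is a sketch under stated assumptions: it expects Node 18 or later (for the global `fetch`), the helper names are hypothetical, and treating `total` as the baseline cost and `percentage` as a rounded value is inferred from the sample response rather than documented behavior.

```javascript
// Sketch only: assumes Node 18+ (global fetch) and the response shape shown earlier.
const SAVINGS_URL = "http://localhost:4100/v1/telemetry/savings";

// Fetch the savings report from a locally running instance.
async function fetchSavings(url = SAVINGS_URL) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Savings request failed: HTTP ${res.status}`);
  return res.json();
}

// Recompute the headline numbers from `total` and `actualCost`.
// Assumption: `total` is the baseline cost, as the sample response suggests.
function recomputeSavings(report) {
  const savings = report.total - report.actualCost;
  const percentage =
    report.total > 0 ? Math.round((savings / report.total) * 100) : 0;
  return { savings, percentage };
}

// Values copied from the sample response, so no server is needed to try this.
const sample = { total: 12.54, actualCost: 3.21, savings: 9.33, percentage: 74 };
const recomputed = recomputeSavings(sample);
console.log(recomputed.savings.toFixed(2), recomputed.percentage);
```

Against a running instance, `fetchSavings().then(recomputeSavings)` would apply the same check to live data.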