Savings

How RelayPlane measures the money you save through intelligent model routing.

How Savings Are Calculated

Savings are computed by comparing what you actually paid with what you would have paid had every request gone to the most expensive model. The baseline model is Claude Opus 4 ($15 per 1M input tokens, $75 per 1M output tokens).

// For each request:
const baselineCost = estimateCost("claude-opus-4-6", tokensIn, tokensOut);
const actualCost = estimateCost(actualModel, tokensIn, tokensOut);
const saved = Math.max(0, baselineCost - actualCost); // never negative

// Overall savings percentage (totals are summed over all requests):
const savingsPercent = (totalSaved / totalBaselineCost) * 100;
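
The snippet above leaves estimateCost and the running totals undefined. A minimal self-contained sketch follows, assuming a hypothetical price table built from the per-1M-token prices quoted in this document and a cost that scales linearly with token counts. PRICES, RequestRecord, and summarizeSavings are illustrative names, not RelayPlane's actual implementation.

```typescript
// Hypothetical price table in USD per 1M tokens, using the prices quoted in this document.
type Price = { input: number; output: number };

const PRICES: Record<string, Price> = {
  "claude-opus-4-6": { input: 15, output: 75 },
  "claude-sonnet-4-6": { input: 3, output: 15 },
};

const BASELINE_MODEL = "claude-opus-4-6";

// Assumed implementation: cost is linear in input and output token counts.
function estimateCost(model: string, tokensIn: number, tokensOut: number): number {
  const price = PRICES[model];
  if (price === undefined) throw new Error(`No price configured for ${model}`);
  return (tokensIn * price.input + tokensOut * price.output) / 1e6;
}

interface RequestRecord {
  model: string;
  tokensIn: number;
  tokensOut: number;
}

// Sums baseline cost and per-request savings, then derives the overall percentage.
function summarizeSavings(requests: RequestRecord[]) {
  let totalBaselineCost = 0;
  let totalSaved = 0;
  for (const r of requests) {
    const baselineCost = estimateCost(BASELINE_MODEL, r.tokensIn, r.tokensOut);
    const actualCost = estimateCost(r.model, r.tokensIn, r.tokensOut);
    totalBaselineCost += baselineCost;
    totalSaved += Math.max(0, baselineCost - actualCost);
  }
  // Guard against division by zero when no requests have been recorded.
  const savingsPercent =
    totalBaselineCost > 0 ? (totalSaved / totalBaselineCost) * 100 : 0;
  return { totalBaselineCost, totalSaved, savingsPercent };
}
```

Because the sketch clamps each request's savings at zero, a request routed to the baseline model contributes to totalBaselineCost but adds nothing to totalSaved.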

Example

A typical coding session with 100 requests might break down like:

  • 70 simple tasks → routed to Sonnet ($3/$15 per 1M) instead of Opus ($15/$75)
  • 20 moderate tasks → routed to Sonnet
  • 10 complex tasks → routed to Opus (no savings, but quality preserved)

Result: ~60-80% cost reduction on the 90 non-complex tasks while maintaining full quality on the 10 complex ones.
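
To make the arithmetic concrete, here is a rough sketch of the session above under one simplifying assumption: every request uses the same token counts (the 2,000 input and 500 output tokens below are hypothetical). At the listed prices Sonnet costs one fifth of Opus, so each routed request costs 80% less, and the session as a whole comes out at about 72%. Real sessions have uneven request sizes, so actual figures will vary.

```typescript
// USD per 1M tokens, as quoted in the example above.
const OPUS = { input: 15, output: 75 };
const SONNET = { input: 3, output: 15 };

// Assumption: every request uses the same hypothetical token counts.
const TOKENS_IN = 2000;
const TOKENS_OUT = 500;

const costPerRequest = (p: { input: number; output: number }): number =>
  (TOKENS_IN * p.input + TOKENS_OUT * p.output) / 1e6;

const routedToSonnet = 70 + 20; // simple + moderate tasks
const keptOnOpus = 10; // complex tasks

// Baseline: all 100 requests priced as Opus.
const sessionBaseline = (routedToSonnet + keptOnOpus) * costPerRequest(OPUS);
// Actual: 90 requests on Sonnet, 10 on Opus.
const sessionActual =
  routedToSonnet * costPerRequest(SONNET) + keptOnOpus * costPerRequest(OPUS);

const perRoutedRequestPercent =
  (1 - costPerRequest(SONNET) / costPerRequest(OPUS)) * 100;
const sessionSavingsPercent = (1 - sessionActual / sessionBaseline) * 100;

console.log(perRoutedRequestPercent.toFixed(1), sessionSavingsPercent.toFixed(1));
```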

Viewing Savings

Check your savings via the API:

curl http://localhost:4100/v1/telemetry/savings

Response:

{
  "total": 12.5400,
  "actualCost": 3.2100,
  "savings": 9.3300,
  "savedAmount": 9.3300,
  "percentage": 74,
  "byDay": [
    {
      "date": "2025-02-24",
      "savedAmount": 4.1200,
      "originalCost": 5.8900,
      "actualCost": 1.7700
    }
  ]
}
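
The same endpoint can be read from code. The sketch below is one possible approach, not an official client: the interfaces are inferred from the sample response above, the reading of "total" as the baseline cost is an assumption (it equals actualCost plus savings in the sample), and fetchSavings and formatSavings are hypothetical helper names. It relies on the global fetch available in Node 18+ and browsers.

```typescript
// Shapes inferred from the sample response; field meanings are assumptions.
interface DailySavings {
  date: string;
  savedAmount: number;
  originalCost: number;
  actualCost: number;
}

interface SavingsResponse {
  total: number; // assumed to be the baseline (all-Opus) cost
  actualCost: number;
  savings: number;
  savedAmount: number;
  percentage: number;
  byDay: DailySavings[];
}

// Turns a savings response into human-readable summary lines.
function formatSavings(res: SavingsResponse): string[] {
  const lines = [
    `Saved $${res.savedAmount.toFixed(2)} of $${res.total.toFixed(2)} (${res.percentage}%)`,
  ];
  for (const day of res.byDay) {
    lines.push(
      `${day.date}: $${day.originalCost.toFixed(2)} -> $${day.actualCost.toFixed(2)}`
    );
  }
  return lines;
}

// Fetches the savings summary from a locally running instance.
async function fetchSavings(
  baseUrl: string = "http://localhost:4100"
): Promise<SavingsResponse> {
  const response = await fetch(`${baseUrl}/v1/telemetry/savings`);
  if (!response.ok) {
    throw new Error(`Savings request failed with status ${response.status}`);
  }
  return (await response.json()) as SavingsResponse;
}
```

With the service running locally, calling fetchSavings() and passing the result to formatSavings yields one overall line followed by one line per day.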

Per-Request Savings

Each run returned by the /v1/telemetry/runs endpoint includes the savings for that request:

{
  "model": "claude-sonnet-4-6",
  "original_model": "claude-opus-4-6",
  "costUsd": 0.0042,
  "savings": 0.0168,
  "complexity": "simple"
}
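
As a rough illustration of working with these records, the sketch below derives a per-run savings fraction and totals savings by complexity. It assumes that costUsd plus savings equals what the request would have cost on original_model, which holds for the sample run but is not stated explicitly. Run, savingsFraction, and savingsByComplexity are hypothetical names.

```typescript
// Fields taken from the sample run above; a real run may carry more.
interface Run {
  model: string;
  original_model: string;
  costUsd: number;
  savings: number;
  complexity: string;
}

// Assumption: costUsd + savings is the cost the request would have had on original_model.
function savingsFraction(run: Run): number {
  const baseline = run.costUsd + run.savings;
  return baseline > 0 ? run.savings / baseline : 0;
}

// Totals the saved amount for each complexity label.
function savingsByComplexity(runs: Run[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const run of runs) {
    totals[run.complexity] = (totals[run.complexity] ?? 0) + run.savings;
  }
  return totals;
}
```

For the sample run, savingsFraction comes out at roughly 0.8, matching the one-to-five price ratio between Sonnet and Opus.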

Local Stats Summary

The telemetry module also provides a local stats summary with per-model and per-task-type breakdowns:

  • totalCost — What you actually spent
  • baselineCost — What you would have spent on Opus
  • savings — The difference
  • savingsPercent — Percentage saved
  • byModel — Breakdown per model with count, cost, and baseline
  • byTaskType — Breakdown per inferred task type
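
The source does not show how the summary is exposed, so the following is only a sketch of what such a structure might look like and how it could be derived from already-costed requests. LocalStats, ModelBreakdown, CostedRequest, and buildLocalStats are hypothetical names that mirror the field list above; byTaskType is omitted for brevity, and the real module's types may differ.

```typescript
// Hypothetical shapes mirroring the field list above.
interface ModelBreakdown {
  count: number;
  cost: number;
  baseline: number;
}

interface LocalStats {
  totalCost: number;
  baselineCost: number;
  savings: number;
  savingsPercent: number;
  byModel: Record<string, ModelBreakdown>;
}

// One request with its actual cost and its baseline (Opus) cost already computed.
interface CostedRequest {
  model: string;
  cost: number;
  baseline: number;
}

function buildLocalStats(requests: CostedRequest[]): LocalStats {
  const byModel: Record<string, ModelBreakdown> = {};
  let totalCost = 0;
  let baselineCost = 0;
  for (const r of requests) {
    // Create the per-model entry on first sight, then accumulate into it.
    const entry = byModel[r.model] ?? { count: 0, cost: 0, baseline: 0 };
    entry.count += 1;
    entry.cost += r.cost;
    entry.baseline += r.baseline;
    byModel[r.model] = entry;
    totalCost += r.cost;
    baselineCost += r.baseline;
  }
  const savings = baselineCost - totalCost;
  const savingsPercent = baselineCost > 0 ? (savings / baselineCost) * 100 : 0;
  return { totalCost, baselineCost, savings, savingsPercent, byModel };
}
```
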

Visit http://localhost:4100/dashboard for a real-time visual savings tracker that updates every 5 seconds.