Anthropic cap June 15, are you covered?
ANTHROPIC CAP, JUNE 15

Your Claude calls will start failing June 15
unless you have a fallback.

Anthropic capped programmatic Claude usage on Max subscriptions starting June 15. When you hit the cap, requests return 429. RelayPlane routes the overflow to Gemini 2.5 Pro, GPT-4o, or local Ollama, silently, so your agents keep running.

Three steps. Thirty seconds.

No SDK swap, no rewrite. One env var pointed at the proxy, one config block with a fallback key.

1. Install the proxy

npm install -g @relayplane/proxy, then point your agent at http://localhost:4100 with one env var. Your Anthropic SDK calls keep working unchanged.

2. Add a fallback key

Drop an OpenAI, Google, or local Ollama endpoint into the credential pool in relayplane.config.yml. Pick any combo. Most teams pair Claude with Gemini 2.5 Pro for cost or GPT-4o for parity.

3. Cap hits, traffic auto-routes

When Anthropic returns a 429 or the proxy sees your usage approach the cap, RelayPlane fails over to the next credential. Your agent loop never sees the error. No alerts to wire, no retry logic to write.

The config, in full

This is the entire credential-pool block. Primary Claude, with Gemini 2.5 Pro, GPT-4o, and local Ollama as fallbacks in order.

# relayplane.config.yml
providers:
  anthropic:
    credentials:
      - key: ${ANTHROPIC_KEY}
        model: claude-sonnet-4-6
        priority: 1
    quota:
      soft_limit: 0.90   # fail over at 90% of the June 15 cap

  fallbacks:
    - provider: google
      key: ${GOOGLE_API_KEY}
      model: gemini-2.5-pro
      priority: 2

    - provider: openai
      key: ${OPENAI_API_KEY}
      model: gpt-4o
      priority: 3

    - provider: ollama
      endpoint: http://localhost:11434
      model: llama3.1:70b
      priority: 4         # last resort, runs on your hardware

Priority order is the routing order on cap or 429. Drop the providers you do not have. Free tier covers the credential pool and quota-aware routing.

What the fallback actually does for your agents

  • Gemini 2.5 Pro takes over on long-context or reasoning tasks. Lower cost per token than Claude, comparable quality on most agent workloads.
  • GPT-4o handles the parity case when your prompts depend on tool-use or vision and you need a drop-in for Claude Sonnet behavior.
  • Local Ollama is the air-gap last resort. Llama 3.1 70B or Qwen 2.5 on your own hardware. Free per request, keeps agents alive through total API outages.
  • Zero code changes. Your agent keeps calling the Anthropic SDK. The proxy rewrites the request to the fallback provider's API shape and returns an Anthropic-shaped response.
MIT licensed|npm: @relayplane/proxy|local by default|keys stay on your infra

Cover the June 15 cap before it hits.

Install RelayPlane, add a fallback key, and the cap becomes a non-event. No code changes required.

npm install -g @relayplane/proxy

Free tier covers credential pool plus quota routing. No credit card.