The Prompt Router that doubles your token runway.

Weave Router routes each prompt to the model with the best quality per token, right inside Codex, Claude Code, and Cursor.

$ weave route --file hero-cta.md
[weave-router-model: kimi-k2-0905]

Model              COST   Q    $
Claude Opus 4.7    $$$$   94   18   64
Claude Sonnet 4.7  $$     86   55   74
GPT-5 Codex 5.5    $$$    88   42   70
GPT-5 mini         $      78   86   74
Gemini 2.5 Pro     $$$    84   48   70
Kimi K2            $      82   96   94
DeepSeek V3        $      78   88   74
Llama 4 405B       $$     80   70   74
Trusted by leading engineering teams

Scale-Ups

Startups

Enterprise

How to get started

The three steps below cover the full setup. They assume Node.js ≥ 18 on a Unix-like shell and at least one supported client (Claude Code, Cursor, or Codex) available on your PATH.

  1. Get your router key

    Generate a bearer token. The token never leaves your device unless you choose to export it.

  2. Run one command

    The installer detects your installed clients and points them at the router. It automatically writes one environment variable per provider: Anthropic, OpenAI, and Google.

    $ npx -y @workweave/router

  3. Use Claude Code or Cursor as usual

    Each request is now routed to the model with the best quality-per-token. The client is unaware of the proxy.


See how far your tokens really go.

Drag the slider to your current monthly token usage. On the router, the same budget delivers roughly twice the work.

Example: 10B tokens / month, 1.95×

Configuration          Token mileage / work delivered
Without Weave Router   10B tokens
With Weave Router      20B tokens · 1.95×

Hard requests stay on frontier models. Easy requests run on open-source models at parity quality.
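
To see how a split between frontier and cheaper models can turn into a token multiplier, here is a minimal arithmetic sketch. It is not the formula behind the slider: the function name `token_multiplier`, the parameters `easy_share` and `relative_cost`, and the example inputs are all hypothetical, and the model assumes that rerouted requests deliver the same work at a fixed fraction of the frontier cost.

```python
def token_multiplier(easy_share: float, relative_cost: float) -> float:
    """Work delivered per unit of budget, relative to sending everything to a frontier model.

    easy_share    -- assumed fraction of spend that can be rerouted to cheaper models (0..1)
    relative_cost -- assumed cost of the cheaper model as a fraction of the frontier cost (> 0)
    """
    if not 0.0 <= easy_share <= 1.0:
        raise ValueError("easy_share must be between 0 and 1")
    if relative_cost <= 0.0:
        raise ValueError("relative_cost must be positive")
    # Hard requests keep their full cost; easy requests cost only relative_cost as much.
    blended_cost = (1.0 - easy_share) + easy_share * relative_cost
    return 1.0 / blended_cost


if __name__ == "__main__":
    # Illustrative inputs only, not measured data.
    multiplier = token_multiplier(easy_share=0.6, relative_cost=0.2)
    print(f"multiplier: {multiplier:.2f}x")
```

With these illustrative inputs the blended cost is 0.52 of the original, which gives a multiplier a little above 1.9. The real figure depends on your own mix of hard and easy requests.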

Frequently asked questions

The routing layer adds only low single-digit milliseconds of overhead on top of the upstream call, since prompts are embedded with a small in-process ONNX model and scored against frozen cluster centroids.
Weave Router works across Anthropic, OpenAI, and Google Gemini, and uses OpenRouter for all open-source models.
On the managed service, you get a single usage-based credit balance on your Weave account, so there are no separate Anthropic, OpenAI, OpenRouter, and Gemini invoices to juggle. Self-hosted deployments use your own provider keys and are billed directly by each provider.
Weave is a zero-retention proxy by default. Prompts and completions are not stored on our infrastructure; only routing metadata (classification label, chosen model, latency, cost) is kept. A SOC 2 Type II report is available on request under NDA.
Model selection is cache-aware: the router switches models only when the expected gain outweighs the cost of losing the prompt cache.
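
The answers above describe scoring a prompt embedding against frozen cluster centroids and switching models only when it pays for the lost cache. The following is a rough sketch of that idea, not Weave's implementation. The embedding step (the in-process ONNX model) is left out, cosine similarity is an assumed scoring function, and every name here (`Cluster`, `nearest_cluster`, `choose_model`, `est_saving`, `cache_loss`, the model labels) is hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    label: str                   # classification label, e.g. "easy" or "hard"
    centroid: tuple[float, ...]  # frozen centroid in embedding space
    model: str                   # model assumed to serve this cluster


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_cluster(embedding: Sequence[float], clusters: Sequence[Cluster]) -> Cluster:
    """Pick the cluster whose centroid is most similar to the prompt embedding."""
    return max(clusters, key=lambda c: cosine(embedding, c.centroid))


def choose_model(
    embedding: Sequence[float],
    clusters: Sequence[Cluster],
    current_model: Optional[str],
    est_saving: float,
    cache_loss: float,
) -> str:
    """Return the model for this request.

    est_saving -- assumed estimate of what switching saves on this request
    cache_loss -- assumed estimate of what the warm cache on current_model is worth
    """
    target = nearest_cluster(embedding, clusters).model
    if current_model is None or target == current_model:
        return target
    # Cache-aware rule: leave the warm model only if the saving beats the cache loss.
    return target if est_saving > cache_loss else current_model


if __name__ == "__main__":
    clusters = [
        Cluster("easy", (1.0, 0.0), "open-model"),
        Cluster("hard", (0.0, 1.0), "frontier-model"),
    ]
    print(choose_model((0.9, 0.1), clusters, "frontier-model", est_saving=1.0, cache_loss=5.0))
```

In this toy setup the prompt lands in the "easy" cluster, but the session stays on its current model because the assumed saving is smaller than the assumed cache loss. Both estimates would need to be expressed in the same unit, such as cost per request.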

Source code available. Start taking your token budget twice as far.

Get started in 5 minutes or book a demo with our team.

npx @workweave/router
Works with Cursor and Claude Code
Open source

The engineering intelligence platform for the AI era.

Trusted by engineering teams from seed stage to Fortune 500