local-first AI dispatch — a serverless router that sends each request to the cheapest capable model, your infra + your keys.
prompt
│
▼ embed @ edge · Workers AI (bge-base-en)
classify ─▶ {route · confidence · margin}
│
├─ local_code ┐
├─ local_search │ $0 · runs on
├─ local_summarize ┤ YOUR infra + keys
├─ direct │
└─ reason ┘
│
└─ low confidence ─▶ escalate to your
frontier model (Claude · GPT · Gemini)
BYOK — your API keys + data + inference stay on your infra. We only ever return a routing decision.