Plan limits
Full matrix of what each plan ships. Cross-reference with the pricing page for current prices.
| Feature | Free | Pro | Scale |
|---|---|---|---|
| Monthly requests | 10,000 | 50,000 | 500,000 |
| Per-minute burst | 10 | 100 | 1,000 |
| Per-day burst | 500 | 5,000 | 50,000 |
| Allowed models (managed) | gpt-4o-mini | gpt-4o-mini | gpt-4o-mini |
| Allowed models (BYOK) | — | any model on your provider key | any model on your provider key |
| API keys | 1 (primary) | 5 | unlimited |
| Log retention | 7 days | 30 days | 90 days |
| Cache TTL | 24h | 72h | 168h (7d) |
| Semantic vector pool | 1,000 | 10,000 | unlimited |
| Smart routing | basic | priority | priority + custom rules |
| Advanced analytics | — | yes | yes |
| Webhook alerts | — | — | yes |
About the per-tier managed-model whitelist
Each plan's managed mode limits the models we forward to a cost-bounded whitelist (see table above). Any other model returns 403 plan_limit_error before the upstream call is made. This is a hard cap to bound our upstream cost exposure per plan.
If you set workspace.default_model to a non-whitelisted model and then omit model from your request payload, you'll still get the 403, with a message pointing you to the dashboard so you can fix the default.
Need a different model? On Pro and Scale, enable BYOK and the per-tier whitelist no longer applies — your provider key, your bill, any model your provider supports.
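The whitelist check described in this section can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's actual implementation: the function name resolve_model, the PlanLimitError class, and the data layout are hypothetical, and gpt-4o appears in the usage note only as an example of a non-whitelisted model. Only the whitelist contents, the 403 status, and the plan_limit_error code come from this page.

```python
from typing import Optional

# Managed-mode whitelist per plan, copied from the table above.
MANAGED_WHITELIST = {
    "free": {"gpt-4o-mini"},
    "pro": {"gpt-4o-mini"},
    "scale": {"gpt-4o-mini"},
}

# BYOK is available on Pro and Scale only.
BYOK_PLANS = {"pro", "scale"}


class PlanLimitError(Exception):
    """Hypothetical stand-in for a 403 plan_limit_error response."""

    status = 403
    code = "plan_limit_error"


def resolve_model(
    plan: str,
    requested: Optional[str],
    default_model: str,
    byok: bool = False,
) -> str:
    """Return the model to forward, or raise if the plan does not allow it."""
    # An omitted model falls back to workspace.default_model.
    model = requested or default_model
    # With BYOK enabled on Pro/Scale, the managed whitelist does not apply.
    if byok and plan in BYOK_PLANS:
        return model
    if model not in MANAGED_WHITELIST[plan]:
        # Rejected before any upstream call would be made.
        raise PlanLimitError(
            f"{model!r} is not allowed in managed mode on the {plan} plan"
        )
    return model
```

In this sketch, resolve_model("pro", None, "gpt-4o") raises PlanLimitError because the fallback default is not whitelisted, while the same call with byok=True returns "gpt-4o".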
Managed monthly usage allowance
In managed mode we pay the upstream provider on your behalf, so each plan's monthly allowance is bounded by both the request count above and a fair-use upstream-cost budget, whichever you reach first. Most workloads hit the request count; very large prompts/outputs can reach the cost budget sooner. When you hit either limit, requests return 429 with a message to upgrade or switch to BYOK.
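The "whichever you reach first" rule amounts to a logical OR over two counters. The sketch below is a minimal illustration of that rule. The request counts come from the table above, but the cost budget is not published on this page, so cost_budget_usd is a hypothetical parameter, as are the function name and the assumption that cost is tracked in US dollars.

```python
# Monthly request counts per plan, copied from the table above.
MONTHLY_REQUESTS = {"free": 10_000, "pro": 50_000, "scale": 500_000}


def managed_allowance_exhausted(
    plan: str,
    requests_used: int,
    cost_used_usd: float,
    cost_budget_usd: float,  # hypothetical: the real budget is not documented here
) -> bool:
    """True once either limit is reached; such requests would get a 429."""
    over_requests = requests_used >= MONTHLY_REQUESTS[plan]
    over_cost = cost_used_usd >= cost_budget_usd
    return over_requests or over_cost
```

A workload with few but very large requests trips the cost check first, while a workload of many small requests trips the request count first.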
BYOK has no such budget — you pay your provider directly, so on Pro/Scale you can run unlimited managed-feature traffic (cache, dashboard, routing) on your own key. If predictable high volume matters, BYOK is the path.
About basic vs priority routing
Basic routing (Free): short non-code prompts go to gpt-4o-mini, everything else uses the workspace default model. Priority routing (Pro/Scale) adds more granular signals (complexity heuristics, length thresholds). Scale also unlocks custom routing rules via POST /v1/workspace/routing-rules — full reference in the dashboard.
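As a rough illustration of the basic routing rule, the sketch below sends short non-code prompts to gpt-4o-mini and everything else to the workspace default. The length threshold and the code-detection heuristic are assumptions made for this example, since this page does not state how "short" or "code" are determined. The function names are hypothetical as well.

```python
# Assumed cutoff for a "short" prompt; the real threshold is not documented here.
SHORT_PROMPT_MAX_CHARS = 200

# Crude, hypothetical markers used to guess whether a prompt contains code.
CODE_MARKERS = ("```", "def ", "class ", "function ", "#include", "import ", "{", "};")


def looks_like_code(prompt: str) -> bool:
    """Return True if any assumed code marker appears in the prompt."""
    return any(marker in prompt for marker in CODE_MARKERS)


def basic_route(prompt: str, default_model: str) -> str:
    """Pick a model following the basic routing rule described above."""
    if len(prompt) <= SHORT_PROMPT_MAX_CHARS and not looks_like_code(prompt):
        return "gpt-4o-mini"
    return default_model
```

With these assumptions, a one-line question routes to gpt-4o-mini, while a prompt containing a function definition or a very long prompt routes to whatever default_model is passed in.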