
Plan limits

Full matrix of what each plan ships. Cross-reference with the pricing page for current prices.

Feature                  | Free        | Pro                            | Scale
Monthly requests         | 10,000      | 50,000                         | 500,000
Per-minute burst         | 10          | 100                            | 1,000
Per-day burst            | 500         | 5,000                          | 50,000
Allowed models (managed) | gpt-4o-mini | gpt-4o-mini                    | gpt-4o-mini
Allowed models (BYOK)    | -           | any model on your provider key | any model on your provider key
API keys                 | 1 (primary) | 5                              | unlimited
Log retention            | 7 days      | 30 days                        | 90 days
Cache TTL                | 24h         | 72h                            | 168h (7d)
Semantic vector pool     | 1,000       | 10,000                         | unlimited
Smart routing            | basic       | priority                       | priority + custom rules
Advanced analytics       | -           | yes                            | yes
Webhook alerts           | -           | -                              | yes
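
To stay under the burst limits from the client side, you could track your own request timestamps before sending. The sketch below is only an illustration: BurstTracker and BURST_LIMITS are hypothetical names, not part of ProxyLLM, and it assumes sliding one-minute and one-day windows, which this page does not confirm.

```python
from collections import deque
from dataclasses import dataclass, field

# Burst limits copied from the table: plan -> (per-minute, per-day).
BURST_LIMITS = {
    "free": (10, 500),
    "pro": (100, 5_000),
    "scale": (1_000, 50_000),
}


@dataclass
class BurstTracker:
    """Client-side sliding-window counter (hypothetical helper)."""

    plan: str
    sent: deque = field(default_factory=deque)  # timestamps in seconds

    def try_send(self, now: float) -> bool:
        """Record a request at time `now` if both windows have room."""
        per_minute, per_day = BURST_LIMITS[self.plan]
        # Timestamps older than one day count toward neither window.
        while self.sent and now - self.sent[0] >= 86_400:
            self.sent.popleft()
        in_last_minute = sum(1 for t in self.sent if now - t < 60)
        if in_last_minute >= per_minute or len(self.sent) >= per_day:
            return False  # sending now would likely exceed a burst limit
        self.sent.append(now)
        return True
```

Passing the clock in as an argument keeps the helper easy to test; in real use you would call tracker.try_send(time.monotonic()) and back off when it returns False.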

About the per-tier managed-model whitelist

Each plan's managed mode limits the models we forward to a cost-bounded whitelist (see table above). Any other model returns 403 plan_limit_error before the upstream call is made. This is a hard cap to bound our upstream cost exposure per plan.

If you set workspace.default_model to a non-whitelisted model and then omit model from your request payload, you'll still get the 403, with a message pointing you to the dashboard so you can fix the default.
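
The whitelist check and the default-model fallback described above can be sketched as follows. This is a minimal model of the documented behavior, not ProxyLLM's implementation: check_managed_model and MANAGED_WHITELIST are hypothetical names, and "gpt-4o" in the usage lines is just an example of a model outside the managed whitelist.

```python
from typing import Optional, Tuple

# Managed-mode whitelist per plan, as listed in the plan table.
MANAGED_WHITELIST = {
    "free": {"gpt-4o-mini"},
    "pro": {"gpt-4o-mini"},
    "scale": {"gpt-4o-mini"},
}


def check_managed_model(
    plan: str, request_model: Optional[str], default_model: str
) -> Tuple[int, str]:
    """Return (status, detail) for the check made before any upstream call."""
    # An omitted "model" field falls back to workspace.default_model.
    model = request_model or default_model
    if model not in MANAGED_WHITELIST[plan]:
        return 403, "plan_limit_error"
    return 200, model


# The payload omits the model and the workspace default is not whitelisted.
print(check_managed_model("free", None, "gpt-4o"))
# An explicit whitelisted model overrides the non-whitelisted default.
print(check_managed_model("free", "gpt-4o-mini", "gpt-4o"))
```

The point of the sketch is that the fallback happens before the whitelist check, so a bad default fails exactly like a bad explicit model.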

Need a different model? On Pro and Scale, enable BYOK and the per-tier whitelist no longer applies — your provider key, your bill, any model your provider supports.

Managed monthly usage allowance

In managed mode we pay the upstream provider on your behalf, so each plan's monthly allowance is bounded by both the request count above and a fair-use upstream-cost budget, whichever you reach first. Most workloads hit the request count; very large prompts or outputs can reach the cost budget sooner. When you hit either limit, requests return 429 with a message to upgrade or switch to BYOK.
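
The "whichever you reach first" rule amounts to a logical OR over two counters. The sketch below assumes a hypothetical ManagedUsage record and a made-up cost budget of 5.00 USD; the actual fair-use budget is not published on this page.

```python
from dataclasses import dataclass


@dataclass
class ManagedUsage:
    """Month-to-date managed usage for one workspace (hypothetical shape)."""

    requests_used: int
    cost_used_usd: float


def allowance_status(
    usage: ManagedUsage, request_limit: int, cost_budget_usd: float
) -> int:
    """Return 429 once either limit is reached, otherwise 200."""
    if usage.requests_used >= request_limit:
        return 429  # request count exhausted
    if usage.cost_used_usd >= cost_budget_usd:
        return 429  # fair-use cost budget exhausted first
    return 200


# Free plan: 10,000 monthly requests; the 5.00 USD budget is an assumption.
light = ManagedUsage(requests_used=9_999, cost_used_usd=1.20)
heavy = ManagedUsage(requests_used=200, cost_used_usd=5.00)
print(allowance_status(light, 10_000, 5.00))
print(allowance_status(heavy, 10_000, 5.00))
```

The second case models a workload with very large prompts: only 200 requests were made, but the cost budget is already spent.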

BYOK has no such budget — you pay your provider directly, so on Pro/Scale you can run unlimited managed-feature traffic (cache, dashboard, routing) on your own key. If predictable high volume matters, BYOK is the path.

About basic vs priority routing

Basic routing (Free): short non-code prompts go to gpt-4o-mini, everything else uses the workspace default model. Priority routing (Pro/Scale) adds more granular signals (complexity heuristics, length thresholds). Scale also unlocks custom routing rules via POST /v1/workspace/routing-rules — full reference in the dashboard.
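
As a rough illustration of the basic routing rule, the sketch below sends short non-code prompts to gpt-4o-mini and everything else to the workspace default. The 200-character threshold and the code-detection markers are assumptions made for this example; the page does not say how ProxyLLM decides what counts as "short" or "code".

```python
# Substrings treated as signs of code (assumed heuristic, not ProxyLLM's).
CODE_MARKERS = ("def ", "import ", "#include", "{", "};")


def basic_route(prompt: str, default_model: str, short_limit: int = 200) -> str:
    """Pick a model for a prompt under the basic (Free) routing rule."""
    looks_like_code = any(marker in prompt for marker in CODE_MARKERS)
    is_short = len(prompt) <= short_limit
    if is_short and not looks_like_code:
        return "gpt-4o-mini"
    # Long prompts and code prompts use the workspace default model.
    return default_model


print(basic_route("What is the capital of France?", "workspace-default"))
print(basic_route("def f(x): return x", "workspace-default"))
```

Priority routing and custom rules are not modeled here, since their signals are described only in general terms.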