Rate limits
ProxyLLM enforces three independent rate-limit windows on request count: per minute, per day, and per month. A request that exceeds any one of them is rejected with HTTP 429.
Windows
| Window | Free | Pro | Scale |
|---|---|---|---|
| Per minute | 10 | 100 | 1,000 |
| Per day | 500 | 5,000 | 50,000 |
| Per month | 10,000 | 50,000 | 500,000 |
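
For client-side budgeting, the table can be mirrored as a lookup. The following is only a sketch: `LIMITS` and `firstExceededWindow` are hypothetical names, the lowercase plan keys are an assumption, and the locally tracked counts are an estimate. The server's own counters remain authoritative.

```javascript
// Limits copied from the table above (requests per window).
// The plan keys ("free", "pro", "scale") are illustrative, not an official API.
const LIMITS = {
  free:  { minute: 10,   daily: 500,   monthly: 10000 },
  pro:   { minute: 100,  daily: 5000,  monthly: 50000 },
  scale: { minute: 1000, daily: 50000, monthly: 500000 },
};

// Returns the name of the first window that has no room for another request,
// or null if the next request fits in all three windows.
// `used` holds locally tracked request counts for the current windows.
function firstExceededWindow(plan, used) {
  const limits = LIMITS[plan];
  if (!limits) throw new RangeError(`unknown plan: ${plan}`);
  for (const window of ["minute", "daily", "monthly"]) {
    if (used[window] >= limits[window]) return window;
  }
  return null;
}
```

For example, `firstExceededWindow("pro", { minute: 5, daily: 5000, monthly: 5000 })` reports `"daily"`, because the per-day allowance is used up even though the other two windows still have room.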
Response headers
- `x-proxyllm-rate-limit`: the limit value of the window that blocked you (e.g. `10` for per-minute on Free).
- `x-proxyllm-rate-remaining`: remaining calls in the relevant window. `0` means the next request will be blocked.
- `x-proxyllm-rate-window`: only set on 429 responses. One of `minute`, `daily`, or `monthly`.
- `retry-after`: standard HTTP header on per-minute and per-day 429s. Integer seconds until the window resets. The monthly window doesn't set this header because the wait can be weeks.
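
One way to read these headers in a client is to collect them into a single object. This is a minimal sketch, not an official helper: `readRateLimit` is a hypothetical name, and it assumes only that `headers` exposes a `get(name)` method, as the `headers` of a `fetch` response does. Because some headers are documented as absent in certain cases, missing values are returned as `null`.

```javascript
// Collects the rate-limit headers into one object.
// `headers` is anything with a get(name) method, e.g. `res.headers` from fetch.
// Missing or non-numeric values become null so callers can test for absence.
function readRateLimit(headers) {
  const toInt = (name) => {
    const raw = headers.get(name);
    if (raw == null) return null;
    const n = Number.parseInt(raw, 10);
    return Number.isNaN(n) ? null : n;
  };
  return {
    limit: toInt("x-proxyllm-rate-limit"),
    remaining: toInt("x-proxyllm-rate-remaining"),
    // Only set on 429 responses: "minute", "daily", or "monthly".
    window: headers.get("x-proxyllm-rate-window") ?? null,
    // Not set on monthly 429s.
    retryAfterSeconds: toInt("retry-after"),
  };
}
```

A caller might then write `const { remaining } = readRateLimit(res.headers);` and pause a batch when `remaining` reaches `0`.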
429 response shape

    HTTP/2 429
    retry-after: 42
    x-proxyllm-rate-limit: 10
    x-proxyllm-rate-remaining: 0
    x-proxyllm-rate-window: minute
    Content-Type: application/json

    {
      "error": {
        "message": "Per-minute request limit reached (10 requests on free plan). Upgrade your plan or wait for the window to reset.",
        "type": "rate_limit_error"
      }
    }

Recommended retry pattern
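Read `retry-after` and back off. Do not retry monthly 429s, which carry no `retry-after`:

    if (res.status === 429) {
      // Monthly 429s have no retry-after; waiting the 60 s fallback would only loop.
      if (res.headers.get("x-proxyllm-rate-window") === "monthly") {
        throw new Error("Monthly request limit reached");
      }
      // Per-minute and per-day 429s: wait the advertised number of seconds.
      const retryAfter = parseInt(res.headers.get("retry-after") ?? "60", 10);
      await new Promise(r => setTimeout(r, retryAfter * 1000));
      return retry();
    }

For the monthly window (where `retry-after` isn't set), either upgrade the plan or wait for the calendar-month rollover. Background jobs should check `x-proxyllm-rate-remaining` proactively to avoid hitting the cap mid-batch.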
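
The snippet above retries without an upper bound. A possible refinement is to cap the number of attempts and hand monthly 429s straight back to the caller. This is a sketch under assumptions: `sendWithRetry`, `send`, `maxAttempts`, and `sleep` are hypothetical names introduced here, and `send` is assumed to return a promise of a `fetch`-style response.

```javascript
// Retries per-minute and per-day 429s, giving up after `maxAttempts` tries.
// `send` is any function returning a Promise of a fetch-style Response.
// `sleep` is injectable so the wait can be stubbed out in tests.
async function sendWithRetry(send, {
  maxAttempts = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  for (let attempt = 1; ; attempt++) {
    const res = await send();
    if (res.status !== 429) return res;

    const window = res.headers.get("x-proxyllm-rate-window");
    const retryAfter = Number.parseInt(res.headers.get("retry-after") ?? "", 10);

    // Return the 429 to the caller when waiting cannot help (monthly window,
    // missing retry-after) or when the attempt budget is spent.
    if (window === "monthly" || Number.isNaN(retryAfter) || attempt >= maxAttempts) {
      return res;
    }
    await sleep(retryAfter * 1000);
  }
}
```

Usage might look like `const res = await sendWithRetry(() => fetch(url, options));`, where `url` and `options` are whatever the request needs. Returning the final 429 rather than throwing leaves the decision to upgrade, queue, or abort with the caller.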
Burst behavior
Per-minute and per-day buckets are fixed windows aligned to wall-clock UTC, not sliding windows. At a boundary you can therefore briefly burst up to 2× the limit: on Free, 10 requests at second 59.5 of one minute followed by 10 more at second 00.0 of the next. The monthly window has the same property at the start of each calendar month.
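
Because the windows are fixed, a client can estimate the next reset from its own clock, which is useful on monthly 429s, where `retry-after` is absent. The sketch below is hypothetical: `secondsUntilReset` is not part of ProxyLLM, it assumes the daily window rolls over at 00:00 UTC and the monthly window at 00:00 UTC on the first of the month, and it ignores clock skew between client and server.

```javascript
// Estimates seconds until the next fixed-window boundary, using UTC.
// `window` takes the values of x-proxyllm-rate-window: "minute", "daily", "monthly".
// Assumption: daily and monthly windows roll over at 00:00 UTC.
function secondsUntilReset(window, now = new Date()) {
  const y = now.getUTCFullYear();
  const m = now.getUTCMonth();
  const d = now.getUTCDate();
  let next;
  switch (window) {
    case "minute":
      // Date.UTC normalizes overflow, so minute 60 rolls into the next hour.
      next = Date.UTC(y, m, d, now.getUTCHours(), now.getUTCMinutes() + 1);
      break;
    case "daily":
      next = Date.UTC(y, m, d + 1);
      break;
    case "monthly":
      next = Date.UTC(y, m + 1, 1);
      break;
    default:
      throw new RangeError(`unknown window: ${window}`);
  }
  return Math.ceil((next - now.getTime()) / 1000);
}
```

At 12:30:18 UTC on 15 January 2024, this gives 42 seconds for `"minute"`, 41,382 seconds for `"daily"`, and 1,423,782 seconds (about 16.5 days) for `"monthly"`. When the server does send `retry-after`, prefer that value over a local estimate.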