Rate limits

ProxyLLM enforces three independent rate-limit windows: per minute, per day, and per month. A request that exceeds any one of them is rejected with HTTP 429.

Windows

Window        Free      Pro       Scale
Per minute    10        100       1,000
Per day       500       5,000     50,000
Per month     10,000    50,000    500,000

Response headers

  • x-proxyllm-rate-limit: the limit value of the window that blocked you (e.g. 10 for per-minute on Free).
  • x-proxyllm-rate-remaining: remaining calls in the relevant window. 0 means the next request will be blocked.
  • x-proxyllm-rate-window: only set on 429 responses. One of minute, daily, or monthly.
  • retry-after: standard HTTP header on per-minute and per-day 429s. Integer seconds until the window resets. The monthly window doesn't set this header because the wait can be weeks.
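
As a rough illustration of reading these headers together, the sketch below collects them into one object. The function name readRateLimit and the shape of the returned object are hypothetical, not part of ProxyLLM; the only assumption about the input is that it exposes a get(name) method, as the headers of a fetch Response do.

```javascript
// Hypothetical helper: collects ProxyLLM's rate-limit headers into one object.
// `headers` is anything with a get(name) method, such as a fetch Response's headers.
function readRateLimit(headers) {
  // Returns an integer, or null when the header is absent or not numeric.
  const toInt = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined) return null;
    const n = Number.parseInt(raw, 10);
    return Number.isNaN(n) ? null : n;
  };

  return {
    limit: toInt("x-proxyllm-rate-limit"),
    remaining: toInt("x-proxyllm-rate-remaining"),
    // Only present on 429 responses: "minute", "daily", or "monthly".
    window: headers.get("x-proxyllm-rate-window") ?? null,
    // Absent on monthly 429s, so callers must handle null.
    retryAfterSeconds: toInt("retry-after"),
  };
}
```

Calling readRateLimit(res.headers) on a 429 tells you which window blocked the request and whether a retry-after value is available at all.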

429 response shape

HTTP/2 429
retry-after: 42
x-proxyllm-rate-limit: 10
x-proxyllm-rate-remaining: 0
x-proxyllm-rate-window: minute
Content-Type: application/json

{
  "error": {
    "message": "Per-minute request limit reached (10 requests on free plan). Upgrade your plan or wait for the window to reset.",
    "type": "rate_limit_error"
  }
}

Recommended retry pattern

Read retry-after and back off:

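if (res.status === 429) {
  // Monthly 429s carry no retry-after; a short sleep will not clear them.
  if (res.headers.get("x-proxyllm-rate-window") === "monthly") throw new Error("monthly limit reached");
  // Integer seconds until the window resets; fall back to 60 if the header is missing.
  const retryAfter = parseInt(res.headers.get("retry-after") ?? "60", 10);
  await new Promise(r => setTimeout(r, retryAfter * 1000));
  return retry();
}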
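
The fragment above can be wrapped into a reusable helper. What follows is a minimal sketch under a few assumptions: withRateLimitRetry, maxAttempts, and the injectable sleep option are hypothetical names chosen for illustration, and doRequest is assumed to be any function that returns a fetch-style Response. The attempt cap is a design choice of this sketch, not something ProxyLLM prescribes.

```javascript
// Hypothetical wrapper around any function that returns a fetch-style Response.
// `sleep` is injectable so the wait can be replaced in tests.
async function withRateLimitRetry(doRequest, {
  maxAttempts = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  for (let attempt = 1; ; attempt++) {
    const res = await doRequest();
    if (res.status !== 429) return res;

    // Monthly 429s carry no retry-after and the wait can be weeks,
    // so the 429 is handed back to the caller instead of being retried.
    const window = res.headers.get("x-proxyllm-rate-window");
    const retryAfter = Number.parseInt(res.headers.get("retry-after") ?? "", 10);
    if (window === "monthly" || Number.isNaN(retryAfter) || attempt >= maxAttempts) {
      return res;
    }

    // Wait out the per-minute or per-day window, then try again.
    await sleep(retryAfter * 1000);
  }
}
```

A caller would pass something like () => fetch(url, options) as doRequest and check the status of the returned response, since a 429 can still come back when the monthly cap is hit or the attempts run out.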

For the monthly window (where retry-after isn't set), either upgrade the plan or wait for the calendar-month rollover. Background jobs should check x-proxyllm-rate-remaining proactively to avoid hitting the cap mid-batch.
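
One way a background job might do that proactive check is sketched below. The names runBatch, sendOne, and minRemaining are hypothetical, and sendOne is assumed to return a fetch-style Response. This page does not say which window x-proxyllm-rate-remaining reports on successful responses, so the sketch simply treats the value as the number of calls left before the next 429.

```javascript
// Hypothetical batch runner. `sendOne(item)` is assumed to return a fetch-style Response.
// `minRemaining` is an illustrative safety margin, not a ProxyLLM setting.
async function runBatch(items, sendOne, minRemaining = 1) {
  const done = [];
  for (let i = 0; i < items.length; i++) {
    const res = await sendOne(items[i]);
    if (res.status === 429) {
      // Blocked: this item and everything after it are left for later.
      return { done, pending: items.slice(i) };
    }
    done.push(items[i]);

    const remaining = Number.parseInt(
      res.headers.get("x-proxyllm-rate-remaining") ?? "", 10);
    // Stop before the next call would be blocked; a missing header is ignored.
    if (!Number.isNaN(remaining) && remaining < minRemaining) {
      return { done, pending: items.slice(i + 1) };
    }
  }
  return { done, pending: [] };
}
```

The job can persist pending and resume from it once the window has reset, rather than discovering the cap through a failed request halfway through.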

Burst behavior

Per-minute and per-day buckets are fixed windows aligned to wall-clock UTC. At a boundary you can briefly burst up to 2× the limit (for example, on Free, 10 requests at 59.5s and another 10 at 00.0s of the next minute). The monthly window has the same property at the start of each calendar month.
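
Because the windows are fixed rather than sliding, a client can estimate when the current one ends, which is mainly useful for the monthly window where retry-after is absent. The sketch below is an approximation under stated assumptions: msUntilReset is a hypothetical name, minute windows are assumed to start at second 00, daily windows at 00:00 UTC, and monthly windows at 00:00 UTC on the 1st (this page only says "calendar month", so the UTC alignment for monthly is a guess). The client's clock may also differ from the server's.

```javascript
// Hypothetical helper: milliseconds until the next fixed-window boundary in UTC.
// Assumes minute windows start at :00 seconds, daily windows at 00:00 UTC,
// and monthly windows at 00:00 UTC on the 1st of the month.
function msUntilReset(window, now = new Date()) {
  const y = now.getUTCFullYear();
  const mo = now.getUTCMonth();
  const d = now.getUTCDate();
  let next;
  switch (window) {
    case "minute":
      // Date.UTC normalizes overflow, so minute 60 rolls into the next hour.
      next = Date.UTC(y, mo, d, now.getUTCHours(), now.getUTCMinutes() + 1);
      break;
    case "daily":
      next = Date.UTC(y, mo, d + 1);
      break;
    case "monthly":
      next = Date.UTC(y, mo + 1, 1);
      break;
    default:
      throw new RangeError(`unknown window: ${window}`);
  }
  return next - now.getTime();
}
```

The window argument uses the same values as x-proxyllm-rate-window, so the header from a 429 can be passed in directly. When retry-after is present it should take precedence, since it comes from the server.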