When a server starts to become overloaded, Algolia first delays indexing operations and then, if necessary, rejects them. These proactive measures aim to avoid downtime. We call this mechanism the rate limit.
The introduced delay (or throttling) exists to avoid reaching the rejection limit. We keep throttling or rejecting operations until the server returns to a more manageable state.
It’s important to note that:
- The rate limit doesn’t slow down or impact search operations in any way. If a server gets overloaded with search requests, we rely on degraded queries to handle them.
- Applications on our current pricing model have a limit of 10,000 indexing operations per unit.
- When rejecting an indexing operation, the API returns an HTTP 429 error with a message specifying the exact reason (too many jobs, job queue too large, old jobs on the queue, disk almost full).
- This limit impacts the following indexing methods:
When do we trigger the rate limit?
We designed the Algolia servers to contain large amounts of data and to perform fast search and indexing operations. Therefore, it’s unlikely (but not impossible) to reach the natural limits of a server.
To avoid downtime or delay, every Algolia server has an internal “rate limit” mechanism designed to keep too many costly indexing operations from overloading the server.
Overloading happens when the server can no longer handle indexing operations within a reasonable time frame. We monitor our servers to avoid the following scenarios:
- A client’s overall application size (the total of all index sizes) becomes too large.
- Old requests remain unprocessed, indicating a backlog of indexing requests.
- The indexing queue has too many unprocessed requests, or the total size of queued requests is too big.
Regarding these indexing operations:
- When you have more than 100 pending requests, we throttle your requests.
- When you have more than 5,000 pending requests, we reject any new request and return an HTTP 429 error.
The specific limits for application size, request age, and indexing queue size depend on your plan. If your plan includes dedicated infrastructure, you can contact your success team to discuss the rate limits on your cluster.
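The pending-request thresholds above can be sketched as a small decision function. This is only an illustration of the rule as stated, not Algolia's server code: the names `QueueAction` and `classify_pending` are hypothetical, and the two constants are the figures quoted above, which may differ depending on your plan.

```python
from enum import Enum


class QueueAction(Enum):
    ACCEPT = "accept"
    THROTTLE = "throttle"
    REJECT = "reject"  # surfaced to the client as an HTTP 429 error


# Figures quoted in the text; the actual limits depend on the plan.
THROTTLE_THRESHOLD = 100
REJECT_THRESHOLD = 5_000


def classify_pending(pending_requests: int) -> QueueAction:
    """Map a pending-request count to the action described in the text."""
    if pending_requests > REJECT_THRESHOLD:
        return QueueAction.REJECT
    if pending_requests > THROTTLE_THRESHOLD:
        return QueueAction.THROTTLE
    return QueueAction.ACCEPT
```

Both comparisons are strict because the text says "more than": in this sketch, exactly 100 pending requests are still accepted, and exactly 5,000 are throttled rather than rejected.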
What happens when the rate limit is reached?
If any of these scenarios occur, we start slowing down incoming indexing operations, and the client needs to wait before sending new requests. This delay is mostly transparent.
Throttling of operations varies depending on their type. Index operations (such as deleteBy) are subject to stricter rules than record operations.
If the server continues to be overloaded despite throttling, it starts to reject indexing requests as they come in, returning an HTTP 429 error. The error message gives more details about the cause of rejection.
The “Rate limit reached for file size” error message indicates that your overall application size is too large. In this case, you need to remove some records or indices in order to continue sending updates. When this rate limit occurs, delete operations are still permitted. For all other limits, it’s best to wait for the servers to catch up before sending any further indexing requests.
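On the client side, the advice above amounts to backing off on an HTTP 429 response, except in the file-size case, where waiting doesn't help. Here is a minimal sketch, assuming a hypothetical `send` callable that performs one indexing request and returns the HTTP status code and the error message. It isn't part of any Algolia API client. Detecting the file-size case with a substring match is also an assumption, as is the exponential delay schedule.

```python
import time
from typing import Callable, Tuple

# Hypothetical transport: performs one request, returns (status code, message).
SendFn = Callable[[], Tuple[int, str]]


def send_with_backoff(
    send: SendFn,
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry an indexing request while the server answers HTTP 429."""
    status, message = send()
    for attempt in range(max_retries):
        if status != 429:
            break
        # Assumption: the file-size case is recognized by a substring match.
        # Waiting does not help here; records or indices must be deleted first.
        if "file size" in message.lower():
            break
        # Exponential backoff: base_delay, then 2x, 4x, ... between retries.
        sleep(base_delay * 2 ** attempt)
        status, message = send()
    return status, message
```

The function returns the last response either way, so the caller can still inspect a final 429 and decide whether to delete records or try again later. The `sleep` parameter is injectable so that the wait can be replaced in tests.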