Rate Limits – Algolia

Rate limits safeguard the performance and stability of customer-facing production services.

The rate limits cover indexing (such as records, synonyms, and rules), index-level operations (such as setSettings and moveIndex), neural-specific actions, and delete by query. Because NeuralSearch also processes and stores additional neural embeddings and model data, the system enforces stricter protective thresholds where NeuralSearch is used, to ensure consistent performance.

Where rate limits apply

Rate limits are enforced in four key areas:

Records, Synonyms, Rules, and Recommend Rules
Index-Level Operations (e.g., setSettings, moveIndex, …)
Neural Index Requests
Delete By Query

When a limit is exceeded, requests fail with HTTP status code 429 (Too Many Requests). The error message includes the following information: maximum <Rate> API requests per <XXX> per application.

Expected behavior: success vs. failure

Successful requests return a taskID in the response body. You can use this taskID to wait for task completion if required.
Unsuccessful requests throw an error. For rate-limited requests, the HTTP status code is 429.

Implementation note: treat any 429 error as a signal to slow down the rate at which requests are sent. The message includes the maximum request count and time window; use this information to guide delay and retry logic when implementing operational requests.
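
As an illustration of the note above, here is a minimal retry sketch in Python. It is not tied to any particular SDK: `RateLimitedError` is a hypothetical exception standing in for whatever your client raises on a 429, `send` is any callable that performs the request, and the delay values are assumptions to tune against the limit reported in the error message.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical error type standing in for whatever your client raises on HTTP 429."""

    status_code = 429


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff while it is rate limited."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Double the wait on each retry, cap it at max_delay, and add up to 10% jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The jitter keeps several workers that were throttled at the same moment from retrying in lockstep.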

Best practices to avoid rate limits

Batch creating and updating records

Prefer batching over sending single-object updates.
Use the SDK’s batch method over saveObjects:
- Save objects: https://www.algolia.com/doc/libraries/sdk/methods/search/save-objects sends 1 object per request
- Batch operations: https://www.algolia.com/doc/libraries/sdk/methods/search/batch allows sending 1,000 objects per request
Aim for a batch size of about 10 MB, sending between 1,000 and 10,000 records per batch request.
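
One way to follow the sizing guidance above is to split records into chunks before handing them to the batch method. The sketch below is plain Python with no SDK calls; the function name `chunk_records` and its defaults are assumptions for illustration, and the size is only an approximation based on each record's JSON encoding.

```python
import json


def chunk_records(records, max_records=1000, max_bytes=10_000_000):
    """Yield lists of records bounded by a record count and an approximate JSON size."""
    batch, batch_bytes = [], 0
    for record in records:
        size = len(json.dumps(record).encode("utf-8"))
        # Start a new batch if adding this record would exceed either bound
        if batch and (len(batch) >= max_records or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded list can then be sent as one batch request. A single record larger than `max_bytes` still ends up in a batch of its own rather than being dropped.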

Minimize index-level churn

Index-level operations (e.g., setSettings, moveIndex) should be made infrequently. The allowed limits are high enough for normal operation. If you encounter these limits, it usually indicates an implementation issue (e.g., repeatedly applying the same settings, or moving or creating indexes unnecessarily). Refactor to reduce the frequency of these operations.
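
A common cause of churn is re-applying identical settings on every deploy or job run. The following sketch shows one possible guard, assuming you can store a fingerprint of the last applied settings somewhere between runs. The names `settings_fingerprint` and `apply_if_changed` are hypothetical, and `apply` is any callable that wraps your client's settings update.

```python
import hashlib
import json


def settings_fingerprint(settings):
    """Return a stable hash of a settings dict, independent of key order."""
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def apply_if_changed(desired, last_fingerprint, apply):
    """Call apply(desired) only when desired differs from what was last applied.

    Returns the fingerprint to store for the next run.
    """
    fingerprint = settings_fingerprint(desired)
    if fingerprint != last_fingerprint:
        apply(desired)  # e.g. a wrapper around your client's setSettings call
    return fingerprint
```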

Use idempotency where available

When possible, structure your pipelines so that retries won’t create duplicate effects. Avoid repeatedly toggling the same index configuration, sending the same data, or reindexing when unnecessary.

Schedule and pace heavy jobs

Stagger large reindexing jobs or deleteByQuery runs to avoid bursts.
For Neural Reindex, plan windows to avoid overlapping intensive tasks, and only reindex when needed.
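
As a rough illustration of pacing, the sketch below runs a list of jobs one at a time and keeps a minimum gap between job starts. The function name `run_paced` and the one-second default are assumptions; choose the interval from the limit reported for your application.

```python
import time


def run_paced(jobs, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Run callables in order, leaving at least min_interval seconds between starts."""
    results = []
    last_start = None
    for job in jobs:
        if last_start is not None:
            # Wait out whatever remains of the interval since the previous start
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(job())
    return results
```

Each job could be, for example, one batch request or one deleteByQuery call wrapped in a function. Running them sequentially also avoids overlapping intensive tasks.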