Bot traffic is a common threat on the internet, and there is no single "right way" to fix it. In Algolia, you can generate a new Search API Key with a rate limit.
This limits the number of requests per IP address per hour. The limit is up to you; we generally recommend starting with a higher number (to avoid limiting real users) and reducing it gradually based on the real usage and bot traffic you observe.
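As a rough illustration, the sketch below builds a request to Algolia's REST endpoint for creating API keys (`POST /1/keys`) with the `maxQueriesPerIPPerHour` parameter. The function names, the placeholder credentials, and the starting limit of 500 are assumptions made for this example, not recommended values; creating a key requires your Admin API key, so run this server-side only.

```python
import json
import urllib.request


def build_rate_limited_key_request(app_id, admin_api_key,
                                   max_queries_per_ip_per_hour,
                                   description=""):
    """Build (without sending) a request that creates a search-only
    API key limited to a number of queries per IP per hour."""
    body = {
        "acl": ["search"],  # search-only key
        "maxQueriesPerIPPerHour": max_queries_per_ip_per_hour,
        "description": description,
    }
    return urllib.request.Request(
        url=f"https://{app_id}.algolia.net/1/keys",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": admin_api_key,
            "Content-Type": "application/json",
        },
    )


def create_rate_limited_key(app_id, admin_api_key,
                            max_queries_per_ip_per_hour, description=""):
    """Send the request and return the decoded JSON response."""
    request = build_rate_limited_key_request(
        app_id, admin_api_key, max_queries_per_ip_per_hour, description
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical usage (placeholder credentials, illustrative limit):
# create_rate_limited_key("YOUR_APP_ID", "YOUR_ADMIN_API_KEY", 500,
#                         description="Rate-limited search key")
```

Once the new key is in place on your front end, you can lower the limit step by step while watching for complaints from real users.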
As Algolia’s architecture is distributed over three nodes in a cluster, we recommend dividing any limit you are considering by three: each node applies its own rate limiting based on the number of requests you choose. This does not apply if you are on Dynamically Scaling Infrastructure.
If you are unsure of your infrastructure, follow the instructions in "How can I monitor my Algolia servers, clusters, and DSNs?". If your cluster name starts with M, you are on Dynamically Scaling Infrastructure.
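The divide-by-three rule can be sketched as a small helper. The function name and the example cluster names are hypothetical, and treating the leading "M" as case-insensitive is an assumption of this sketch.

```python
def per_node_limit(desired_total_per_ip_per_hour, cluster_name,
                   nodes_per_cluster=3):
    """Return the value to set on the API key so that the combined
    limit across all nodes is roughly the desired total."""
    # Clusters whose name starts with "M" are on Dynamically Scaling
    # Infrastructure, where the division does not apply.
    # (Case-insensitive match is an assumption of this sketch.)
    if cluster_name.upper().startswith("M"):
        return desired_total_per_ip_per_hour
    # Each node enforces the limit independently, so divide by the
    # number of nodes; never return less than 1.
    return max(1, desired_total_per_ip_per_hour // nodes_per_cluster)


# Hypothetical cluster names, for illustration only.
print(per_node_limit(300, "c1-example"))  # 100
print(per_node_limit(300, "M1-example"))  # 300
```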
To fully resolve the issue, the searches themselves need to be prevented. While you cannot block IP addresses with Algolia, your web host/infrastructure provider is a good point of contact for advice on mitigating bot searches on your website. You can also investigate blocking IP addresses from your own site search.
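If you do block IP addresses in front of your own site search, the check itself can be very small. The sketch below uses Python's standard `ipaddress` module; the blocked ranges are reserved documentation ranges used as placeholders, and `is_blocked` is a hypothetical helper you would call from your own backend or proxy before forwarding a search.

```python
import ipaddress

# Placeholder ranges (reserved for documentation); replace them with
# the addresses or networks you have identified as abusive.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip, blocked_networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address.version == network.version and address in network
        for network in blocked_networks
    )
```

A request handler would call `is_blocked(request_ip)` and return an error response instead of running the search when it is true.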
As bots evolve with the advent of AI and machine learning, blocking via IP may no longer be sufficient. You may need to employ heuristic detection as there are many ways bot providers can bypass traditional checks:
- Pools of rotating IP addresses to avoid IP blocking
- Region-specific pools to avoid geo-restriction
- Spoofing browser status to avoid headless browser detection (headless browsers are common in bots)
The best way to address these sophisticated approaches is to use a specialised bot protection service. These include fingerprint.com and many Web Application Firewall (WAF) or Content Delivery Network (CDN) providers.
Here are some other resources that may be helpful:
- Cloudflare has good measures against this.
- You could check your robots.txt to ensure it contains the appropriate configuration to allow/deny search engine crawling
- The Algolia Academy features a lesson on bot prevention - https://academy.algolia.com/training/01973179-606b-7464-a7ae-574450f58280/overview
- One of our team gave a presentation on bots at Algolia Devcon. The recording is available here: https://www.youtube.com/watch?v=hYldeR2mPTs. It covers the causes of the issue in more detail, along with some of the techniques you can use to mitigate the impact of bots.
- Another video from our team is available here: https://youtu.be/yLBRJIzr8eE?si=R6pKGfC2V4OAaZNK&t=161
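For the robots.txt check mentioned in the list above, one way to verify your configuration is Python's standard `urllib.robotparser`. The robots.txt content, the `/search` path, and the example.com URLs below are assumptions for illustration; substitute your own file and search results URL. Note that robots.txt only affects crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt that asks all crawlers to stay away from
# a (hypothetical) /search results page.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed(EXAMPLE_ROBOTS_TXT, "Googlebot",
                       "https://example.com/search?q=shoes"))  # False
print(is_crawl_allowed(EXAMPLE_ROBOTS_TXT, "Googlebot",
                       "https://example.com/products/1"))      # True
```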