In addition to any plan limits, the Crawler is subject to the following technical limits:
Data volume limitations
| Limit | Value |
| --- | --- |
| Size per document | 10 MB |
| Crawling refreshes (re-crawls) per day | Manual: 100; automatic: once |
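To see how the 10 MB per-document cap plays out in practice, here's a minimal pre-flight sketch (not part of the Crawler itself) that checks a page's reported size before you add it to a crawl. The `MAX_DOCUMENT_BYTES` constant and `checkDocumentSize` helper are hypothetical names for illustration, and the check assumes the server returns a `Content-Length` header.

```ts
// Hypothetical pre-flight check: flags pages larger than the Crawler's
// 10 MB per-document limit. Assumes Node 18+ (built-in fetch) and that
// the server reports a Content-Length header on HEAD requests.
const MAX_DOCUMENT_BYTES = 10 * 1024 * 1024; // 10 MB, per the limit above

async function checkDocumentSize(url: string): Promise<boolean> {
  const response = await fetch(url, { method: "HEAD" });
  const length = Number(response.headers.get("content-length") ?? "0");
  if (length > MAX_DOCUMENT_BYTES) {
    console.warn(`${url} is ${length} bytes and exceeds the 10 MB limit`);
    return false;
  }
  return true;
}

checkDocumentSize("https://example.com/large-report.pdf").catch(console.error);
```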
Crawler console limitations
| Limit | Value |
| --- | --- |
| Number of statistics retrieved from the analytics tool | Top 100K pages only |
| Number of CSV files/lines imported as external sources (per crawling operation) | 5 million |
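When a CSV external source is registered in the Crawler console, its rows are exposed to the record extractor so you can merge them into your records. Here's a minimal sketch, assuming a source named `pageviews` was configured in the console; the source name and CSV column names are illustrative.

```ts
// Sketch of a Crawler action that merges rows from a CSV external data
// source into each record. Assumes a source named "pageviews" was
// registered in the Crawler console; field names are illustrative.
const action = {
  indexName: "pages",
  pathsToMatch: ["https://example.com/**"],
  recordExtractor: ({ url, $, dataSources }: any) => [
    {
      objectID: url.href,
      title: $("title").text(),
      // Attach the matching CSV row value (if any) for this URL
      pageviews: dataSources?.pageviews?.pageviews ?? 0,
    },
  ],
};
```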
Data retrieval frequency
The minimum time between data updates (crawls) is 24 hours. Real-time indexing isn’t guaranteed.
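In practice this means the crawl schedule in your configuration can't run more often than once per day. A minimal configuration sketch, with placeholder credentials and the actions omitted for brevity (see the action sketch above for a full example):

```ts
// Sketch of a Crawler configuration whose schedule respects the
// 24-hour minimum between crawls. Credentials and URLs are placeholders.
const config = {
  appId: "YOUR_APP_ID",
  apiKey: "YOUR_CRAWLER_API_KEY",
  startUrls: ["https://example.com"],
  schedule: "every 1 day", // the minimum allowed interval between crawls
  actions: [], // actions omitted for brevity
};
```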
Application restrictions
The Crawler needs to access your website to extract content and index it in Algolia. Ensure it's granted the appropriate access rights (such as allow lists and authorizations).
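For example, if your site sits behind access controls, you might exempt the Crawler by user agent. The middleware below is an illustrative sketch using Express, and it assumes the Crawler's user agent string contains "Algolia Crawler"; verify the exact value for your setup.

```ts
// Illustrative Express middleware that lets the Crawler through a
// site-level access control. The user-agent match is an assumption;
// confirm the Crawler's actual user agent for your environment.
import express from "express";

const app = express();

app.use((req, res, next) => {
  const userAgent = req.get("user-agent") ?? "";
  if (userAgent.includes("Algolia Crawler")) {
    return next(); // allow the Crawler to reach your pages
  }
  // ... your regular access-control logic for other clients
  next();
});

app.get("/", (_req, res) => res.send("ok"));
app.listen(3000);
```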
Metadata limitations
Because the Crawler can only extract data that's accessible on your website, you may need to inject additional metadata, beyond what your pages currently expose, to tailor the search experience to your business needs (see the sketch below).
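One common approach is to embed a custom `<meta>` tag in your pages and read it in the record extractor. A minimal sketch, assuming a page contains `<meta name="product:margin" content="0.35">`; the tag name and record fields are illustrative, not an Algolia convention.

```ts
// Sketch of a recordExtractor reading a custom meta tag injected into
// your pages so a business metric becomes searchable/rankable.
// The "product:margin" tag name is illustrative.
const metaAction = {
  indexName: "products",
  pathsToMatch: ["https://example.com/products/**"],
  recordExtractor: ({ url, $ }: any) => [
    {
      objectID: url.href,
      title: $("title").text(),
      // Business metric exposed in the page for the Crawler to pick up
      margin: parseFloat($('meta[name="product:margin"]').attr("content") ?? "0"),
    },
  ],
};
```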
Google Analytics limitations
The Crawler is limited to 10,000 Google Analytics API requests per day, in compliance with Google Analytics API quotas.