There are several possible reasons why one page within a domain wasn’t crawled when others were. Check if:
- The Crawler has reached the `maxUrls` limit and stopped before reaching the specific URL.
- The crawling process has finished. Crawling a big site can take time: check the progress from the Crawler page.
- The page is linked from the rest of your site. Ensure you can trace a path from the `startUrls` to the missing page. It should either be reachable from these starting points or listed in your sitemap. If not, add the missing page as a start URL.
- You’ve given the crawler the correct path. Ensure the page matches one of the `pathsToMatch` you’ve told the crawler to look for.
- You have instructed the crawler to ignore the page. If the page matches one of the `exclusionPatterns`, the crawler ignores it.
- The page requires a login. If so, add the `login` parameter to your configuration.
- The page is rendered using JavaScript. If so, you may need to set `renderJavaScript` to `true` in your configuration (note: this makes the crawling process slower).
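The configuration-related checks above can be sketched as a small offline helper. This is an illustration only, not part of the Crawler: the `CrawlSettings` shape, the `globToRegExp` matcher, and the `diagnoseSkippedUrl` function are hypothetical names, the `example.com` URLs are placeholders, and the glob handling is a deliberate simplification of whatever matching the Crawler really performs. It does not cover reachability from `startUrls`, login, or JavaScript rendering, which need a real crawl to verify.

```typescript
// Hypothetical, simplified view of three settings named in the checklist.
// A real Crawler configuration has more fields and may place them differently.
interface CrawlSettings {
  maxUrls?: number;
  pathsToMatch: string[];
  exclusionPatterns?: string[];
}

// Simplified glob matcher (an assumption, not the Crawler's real matcher):
// "**" matches any characters, "*" matches any characters except "/".
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters, leaving "*" for the glob handling below.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const source = escaped
    .split("**")
    .map((part) => part.replace(/\*/g, "[^/]*"))
    .join(".*");
  return new RegExp(`^${source}$`);
}

// Returns the configuration-related reasons why `url` might have been skipped.
// `crawledCount` is the number of URLs crawled so far (supplied by the caller).
function diagnoseSkippedUrl(
  url: string,
  settings: CrawlSettings,
  crawledCount: number
): string[] {
  const reasons: string[] = [];
  const matchesAny = (patterns: string[]): boolean =>
    patterns.some((p) => globToRegExp(p).test(url));

  if (settings.maxUrls !== undefined && crawledCount >= settings.maxUrls) {
    reasons.push("maxUrls reached before this URL");
  }
  if (!matchesAny(settings.pathsToMatch)) {
    reasons.push("URL matches no pathsToMatch pattern");
  }
  if (matchesAny(settings.exclusionPatterns ?? [])) {
    reasons.push("URL matches an exclusionPatterns entry");
  }
  return reasons;
}

// Example with placeholder values.
const settings: CrawlSettings = {
  maxUrls: 500,
  pathsToMatch: ["https://example.com/docs/**"],
  exclusionPatterns: ["https://example.com/docs/archive/**"],
};

const reasons = diagnoseSkippedUrl(
  "https://example.com/docs/archive/2019",
  settings,
  120
);
console.log(reasons);
```

An empty result means none of these three settings explains the missing page, so the remaining checks in the list (crawl progress, links, login, JavaScript rendering) are the next things to look at.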
If none of these solve your problem, an error may have happened while crawling the page. Please check your logs using the Monitoring and URL Inspector tabs.
You can also use the URL tester in the Editor tab of the Admin to get details on why a URL was skipped or ignored.
You can find a complete troubleshooting guide in our official documentation: https://www.algolia.com/doc/tools/crawler/troubleshooting/crawl-status