Algolia returns the total number of hits and facet values with every set of results. Sometimes, to limit the impact of this heavy operation on search performance, these values aren’t exhaustive.
This means that the number of hits returned for a query, or the count shown for a facet value, may sometimes be an approximation rather than an exact total. In some cases, different parts of the search operation can time out independently. For example, a query that only matches results through typo tolerance may time out before returning all expected hits, or may return no hits if the engine can’t complete that part of the search within the required time.
Non-exhaustive results are not an issue in most cases. By the time hit and facet counts are large enough to be approximated, users are usually looking for a general idea rather than exact numbers (e.g., 1,000 rather than 1,042).
There are ways to limit the effects of non-exhaustive facet counts.
The problem
Non-exhaustivity is a performance optimization issue, shared by all search engines.
First, let’s look at what’s going on when you get approximate hit and facet counts. When you type a query into Algolia, we compute a list of results that you can paginate. Then, based on this list of all possible results, we start computing hit and facet value counts one by one. We try to compute everything in advance, but this is not always possible, for example when the count depends entirely on the user’s query. At a certain point, the Algolia engine stops counting and approximates the counts for the rest of the dataset to keep your searches fast.
In most cases, this isn’t an issue. You might show rounded counts, like ~200, or hide the count altogether. However, we understand that this isn’t ideal for every use case.
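As an illustration of the rounded-count approach, here is a minimal sketch of a display helper. The function names roundCount and formatApproximateCount are hypothetical and not part of any Algolia library; the helper simply rounds a count down to a chosen number of leading digits and prefixes it with "~".

```typescript
// Hypothetical helper (not an Algolia API): rounds a count down so that only
// the first `significantDigits` digits are kept, e.g. 1042 -> 1000.
function roundCount(count: number, significantDigits: number = 1): number {
  if (!Number.isFinite(count) || count < 0) {
    throw new RangeError("count must be a non-negative finite number");
  }
  const keep = Math.max(1, Math.floor(significantDigits));
  const whole = Math.floor(count);
  const digits = String(whole).length;
  if (digits <= keep) {
    return whole; // nothing to round away
  }
  const magnitude = Math.pow(10, digits - keep);
  return Math.floor(whole / magnitude) * magnitude;
}

// Formats a count as an approximation, e.g. "~1,000".
function formatApproximateCount(count: number, significantDigits: number = 1): string {
  const rounded = roundCount(count, significantDigits);
  return "~" + rounded.toLocaleString("en-US");
}
```

Rounding down rather than to the nearest value is a design choice made here so that the displayed number never overstates the count; rounding to nearest would work equally well for a purely indicative label.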
Mitigation steps
There’s currently no way to bypass this behavior. Ultimately, improving the performance of your index is the best option, because a high-performing index leaves the engine more time to compute correct facet counts. Note, however, that server resources are also a factor, and the engine favors speed over exhaustivity.
Here are a few suggestions to improve the performance of your queries, and therefore limit the cases where approximations are used:
- Reduce the index size as much as possible.
- Remove unnecessary records, and attributes that aren’t useful for the search experience.
- Keep the list of searchable attributes as small as possible. Instead of having all attributes searchable (the default behavior), manually define a list of searchableAttributes to only search in relevant attributes. Also remember that the longer the attribute, the costlier it is to search into it.
- Make sure that facet attributes that you only use for filtering are set as filterOnly when declaring your attributesForFaceting.
- Make sure hitsPerPage is set to a reasonable number.
- Reduce the number of attributes for faceting. We recommend keeping this list as small as possible, with only the strict minimum required for your search UI.
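The suggestions above could translate into index settings along these lines. This is only a sketch: the attribute names (name, brand, categories, seller_id) are hypothetical, and the object still has to be passed to the settings-update method of whichever API client you use.

```typescript
// Sketch of index settings following the suggestions above.
// All attribute names are hypothetical; replace them with your own schema.
const indexSettings = {
  // Search only in the attributes that matter, instead of all of them.
  searchableAttributes: ["name", "brand", "categories"],
  attributesForFaceting: [
    // Shown as a facet in the UI, so counts are needed.
    "categories",
    // Used only for filtering, so it is declared with the filterOnly modifier.
    "filterOnly(seller_id)",
  ],
  // Keep the page size modest.
  hitsPerPage: 20,
};
```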
Also consider:
- The response when performing a search contains the exhaustive object, which contains the facetsCount and nbHits fields. You can use them to manage non-exhaustivity on the front end, indicating to the user that the values are approximations.
- Reduce the number of filters or precompute some filters in your records when possible.
- Reduce the usage of CPU-intensive features (facets, many filters, geo, grouping, or optionalFilters) as much as possible.
- Explicitly set attributesToRetrieve, attributesToHighlight, and attributesToSnippet (if applicable). You can even set some of them at query time.
- Consider entirely removing attributes with massive content (such as descriptions) from attributesToHighlight and attributesToSnippet.
- When a query times out and optional filters are used, the filters become mandatory and the timeout is increased by 10%, which allows records that were buried deeper in the index to surface.
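To illustrate the first point in this list, the following sketch reads the exhaustive flags from a search response and marks counts as approximate in the UI. The SearchResponseLike interface is a simplified assumption that models only the fields discussed here, and hitCountLabel and facetValueLabel are hypothetical helper names. Treating a missing flag as "exact" is also an assumption; you may prefer the opposite default.

```typescript
// Simplified, assumed shape of a search response: only the fields used below.
interface SearchResponseLike {
  nbHits: number;
  exhaustive?: {
    nbHits?: boolean;
    facetsCount?: boolean;
  };
}

// Hypothetical helper: prefixes the hit count with "~" when the engine
// reports that the count is not exhaustive.
function hitCountLabel(response: SearchResponseLike): string {
  const approximate = response.exhaustive?.nbHits === false;
  const formatted = response.nbHits.toLocaleString("en-US");
  return approximate ? "~" + formatted : formatted;
}

// Hypothetical helper: labels a facet value, marking its count as
// approximate when facet counts are not exhaustive.
function facetValueLabel(
  value: string,
  count: number,
  response: SearchResponseLike
): string {
  const approximate = response.exhaustive?.facetsCount === false;
  const formatted = count.toLocaleString("en-US");
  return approximate ? `${value} (~${formatted})` : `${value} (${formatted})`;
}
```

For example, a response with nbHits set to 1042 and exhaustive.nbHits set to false would be displayed as "~1,042", while the same count with the flag set to true would be displayed as "1,042".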
There are several scenarios that could lead to incorrect or inconsistent facet counts. If this article about exhaustivity doesn't account for the behavior you're observing, please check our other guides on this topic: