Why are my facet and hit counts/number of results not accurate? – Algolia

Algolia returns the total number of hits and facet values with every set of results. Sometimes, to limit the impact of this heavy operation on search performance, these values aren’t exhaustive.

This means that the number of hits returned for a query, or the count shown for a facet value, may sometimes be an approximation rather than an exact total. In some cases, different parts of the search operation can time out independently. For example, a query that only matches results through typo tolerance may time out before returning all expected hits, or may return no hits if the engine can’t complete that part of the search within the required time.

You may also see differences between a facet count and the number of results returned after applying that facet. For example, a search may show that the facet value color:red has 20 matching records. However, after filtering on color:red, the filtered search may return fewer results if the result set is large or the index configuration is complex. This can happen because the initial facet count and the filtered search are computed as separate operations, and each may be affected by performance optimizations or timeouts.

Non-exhaustive results are not an issue in most cases. By the time hit and facet counts are large enough to be approximated, users are usually looking for a general idea rather than exact numbers (for example, 1,000 rather than 1,042).

However, there are ways to limit the effects of non-exhaustive facet counts.

Validating the issue

To validate that the search for hits was not exhaustive, check the network response for your query: if exhaustive.nbHits is false, the hit count is an approximation. The exhaustive object also reports on other aspects of the response (see the full list of fields here).
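As a minimal sketch of that check, the helper below lists which flags in the exhaustive object are false. The function name and the sample response are illustrative assumptions, not part of any Algolia client. Only the nbHits and facetsCount fields mentioned in this article are used in the sample.

```python
from typing import Any, Dict, List


def non_exhaustive_fields(response: Dict[str, Any]) -> List[str]:
    """Return the names of the `exhaustive` flags that are explicitly False.

    `response` is assumed to be a search response already parsed into a dict.
    A missing `exhaustive` object is treated as "nothing to report".
    """
    exhaustive = response.get("exhaustive") or {}
    return sorted(name for name, value in exhaustive.items() if value is False)


# Hypothetical, heavily trimmed response used only for illustration.
sample = {"nbHits": 1000, "exhaustive": {"nbHits": False, "facetsCount": True}}
print(non_exhaustive_fields(sample))  # ['nbHits']
```

An empty list means every reported flag was true, so the counts in that response can be treated as exact.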

The problem

Non-exhaustivity is a performance optimization issue, shared by all search engines.

First, let’s look at what’s going on when you get approximate hit and facet counts. When you type a query into Algolia, we compute a list of results that you can paginate. Then, based on this list of all possible results, we start computing hit and facet value counts one by one. We try to compute everything in advance, but this is not always possible, for example when the count depends entirely on the user’s query. At a certain point, the Algolia engine stops counting and approximates the counts for the rest of the dataset to keep your searches fast.

In most cases, this isn’t an issue. You might show rounded counts, like ~200, or hide the count altogether. However, we understand that this isn’t ideal for every use case.
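One possible way to show rounded counts is sketched below. The format_count helper is hypothetical, and the choice of two significant digits is an assumption made for illustration. The exhaustive argument would come from the relevant flag in the search response, such as exhaustive.nbHits or exhaustive.facetsCount.

```python
def format_count(count: int, exhaustive: bool) -> str:
    """Format a hit or facet count for display.

    Exact counts are shown as-is. Approximate counts are rounded to two
    significant digits and prefixed with "~" so users can tell the difference.
    """
    if exhaustive:
        return f"{count:,}"
    # Keep two significant digits: 1042 -> magnitude 100 -> 1000.
    magnitude = 10 ** max(len(str(count)) - 2, 0)
    rounded = (count + magnitude // 2) // magnitude * magnitude
    return f"~{rounded:,}"


print(format_count(1042, exhaustive=True))   # 1,042
print(format_count(1042, exhaustive=False))  # ~1,000
print(format_count(204, exhaustive=False))   # ~200
```

Hiding the count altogether is the simpler alternative when even a rounded number would be misleading.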

Mitigation steps

There’s currently no way to bypass this behavior. Ultimately, improving the performance of your index is the best option, because a high-performing index leaves the engine more time to compute the correct facet counts. However, note that server resources are also a factor here, and the engine will favor speed over exhaustivity.

Here are a few suggestions to improve the performance of your queries, and therefore limit the cases where approximations are used:

Reduce the index size as much as possible.
Remove unnecessary records, and attributes that aren’t useful for the search experience.
Keep the list of searchable attributes as small as possible. Instead of having all attributes searchable (the default behavior), manually define a list of searchableAttributes so that only relevant attributes are searched. Also remember that the longer an attribute’s content, the costlier it is to search.
Make sure that facet attributes that you only use for filtering are set as filterOnly when declaring your attributesForFaceting.
Make sure hitsPerPage is set to a reasonable number.
Reduce the number of attributes for faceting. We recommend keeping this list as small as possible, with only the strict minimum required for your search UI.
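The settings-related suggestions above can be sketched as a plain settings payload. The attribute names (name, brand, color, in_stock, seller_id) and the helper function are hypothetical. The setting names and the filterOnly(...) modifier are the ones referred to in this article. How the payload is sent depends on the API client you use, so no client call is shown.

```python
from typing import Any, Dict, List


def build_attributes_for_faceting(displayed: List[str], filter_only: List[str]) -> List[str]:
    """Wrap attributes that are only used for filtering in filterOnly(...)."""
    return list(displayed) + [f"filterOnly({attr})" for attr in filter_only]


# Hypothetical index settings; adjust the attribute names to your own records.
index_settings: Dict[str, Any] = {
    # Search only in the attributes that matter, instead of all of them.
    "searchableAttributes": ["name", "brand"],
    # Facets shown in the UI stay as-is; filter-only ones get the modifier.
    "attributesForFaceting": build_attributes_for_faceting(
        displayed=["brand", "color"],
        filter_only=["in_stock", "seller_id"],
    ),
    # A modest page size keeps each response cheap to build.
    "hitsPerPage": 20,
}

print(index_settings["attributesForFaceting"])
```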

Also consider:

The search response contains the exhaustive object, which includes the facetsCount and nbHits fields. You can use them to manage non-exhaustivity on the front end, for example by indicating to the user that the values are approximations.
Reduce the number of filters or precompute some filters in your records when possible.
Reduce the usage of CPU-intensive features (facets, many filters, geo, grouping, or optionalFilters) as much as possible.
Explicitly set attributesToRetrieve, attributesToHighlight and attributesToSnippet (if applicable). You can even set some of them at query time.
Consider entirely removing attributes with massive content (such as descriptions) from attributesToHighlight and attributesToSnippet.
When a query times out and optional filters are used, the filters become mandatory and the timeout is increased by 10%, which allows the engine to surface records that were buried deeper in the index.
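As an illustration of setting the retrieval and highlighting parameters explicitly at query time, the sketch below builds a lean set of search parameters. The helper name and the attribute names are hypothetical. The parameter names are the ones listed above, and the large description attribute is deliberately left out of highlighting and snippeting.

```python
from typing import Any, Dict, List


def lean_search_params(retrieve: List[str], highlight: List[str]) -> Dict[str, Any]:
    """Build query-time parameters that limit the work done per hit."""
    return {
        "hitsPerPage": 20,
        # Return only the attributes the UI actually displays.
        "attributesToRetrieve": list(retrieve),
        # Highlight short attributes only; long text is left out on purpose.
        "attributesToHighlight": list(highlight),
        # No snippets: the massive "description" attribute is excluded.
        "attributesToSnippet": [],
    }


# Hypothetical attribute names, for illustration only.
params = lean_search_params(
    retrieve=["name", "price", "image_url"],
    highlight=["name"],
)
print(params)
```

These parameters would be sent along with the query through whichever search call your client exposes.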

There are several scenarios that could lead to incorrect or inconsistent facet counts. If this article about exhaustivity doesn't account for the behavior you're observing, please refer to this guide:

Facet and hit count issues checklist
