If you've already read our documentation on Indexing Long Documents, you might be wondering how these separate, de-duplicated records are linked back to the original document.
The simplest implementation is to have all the records that represent a single document contain a shared ID attribute (e.g. article_id) and the same non-hashed URL path, which directs users to the top of the page. For example, the following documentation page could be broken into five different records, all with the same base URL pointing to the top of the page:
https://www.algolia.com/doc/guides/getting-started/what-is-algolia/
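As a rough illustration of the non-hashed approach, the sketch below splits one article into per-section records that all carry the same article_id and the same base URL. The function name `build_section_records` and the `title`/`body` keys of the input are hypothetical, chosen only for this example; the attribute names in the output mirror the example records in this guide.

```python
from typing import Dict, List

BASE_URL = "https://www.algolia.com/doc/guides/getting-started/what-is-algolia/"


def build_section_records(
    article_id: int,
    article_title: str,
    base_url: str,
    sections: List[Dict[str, str]],
) -> List[Dict[str, object]]:
    """Create one record per section; every record shares the ID, title, and URL."""
    records = []
    for section in sections:
        records.append(
            {
                "article_id": article_id,        # shared: used for de-duplication
                "article_title": article_title,  # shared: same title for every hit
                "section_title": section["title"],
                "url": base_url,                 # shared: always the top of the page
                "article_body": section["body"],
            }
        )
    return records


# Hypothetical input: section bodies shortened for readability.
sections = [
    {
        "title": "What does Algolia do?",
        "body": "Algolia consists of two parts: search implementation and search analytics.",
    },
    {
        "title": "Search as a feedback loop",
        "body": "Search has the potential to not only help your business, but also shape it.",
    },
]

records = build_section_records(158, "What Is Algolia?", BASE_URL, sections)
```

Each record would still need a unique objectID before indexing; that step is left out here.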
Alternatively, you could give each record a separate hashed URL path. This is only an option if your site already has hashed URL paths for your article/document pages. More on hashed vs non-hashed URL paths can be found in this article.
When using the hashed approach, each record links your users to the specific spot in the article where their search term appears. The example records might look something like this:
{
"objectID": "5477500",
"article_id": 158,
"article_title": "What Is Algolia?",
"section_title": "What does Algolia do?",
"url": "https://www.algolia.com/doc/guides/getting-started/what-is-algolia/#what-does-algolia-do",
"article_body": "Algolia consists of two parts: search implementation and search analytics. The implementation tools make it easier for your developers to create and maintain great search experiences for your users. The analytics tools enable your business teams to analyze the impact of those experiences and refine them, so they can directly address your evolving business objectives."
},
{
"objectID": "3385602",
"article_id": 158,
"article_title": "What Is Algolia?",
"section_title": "Search as a feedback loop",
"url": "https://www.algolia.com/doc/guides/getting-started/what-is-algolia/#search-as-a-feedback-loop",
"article_body": "Search has the potential to not only help your business, but also shape it. To be clear, search doesn’t know the direction that your business should take. It can help you gather information on what your customers want, so you can better align your business with your users. Imagine having a way of asking every single customer who walked into a physical store with all your products, “what are you looking for?” and recording their responses. This would give you a sense of what they’re actually looking for, what they’re not, and how both of these overlap with what you are currently providing. Algolia lets you immediately start collecting this information on your users."
}
The records share the same article_id and article_title, so they can be de-duplicated by ID, and no matter which record is most relevant to a user's search, the title will be the same. However, notice that the section_title, url, and article_body are all different.
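To make the de-duplication step concrete, here is a minimal client-side sketch that keeps only the best-ranked record per article_id. It assumes the hits arrive already sorted by relevance. In Algolia itself this grouping is normally configured on the index (the `attributeForDistinct` and `distinct` settings) rather than done by hand, so the code only illustrates the effect; the function name and the third sample hit are hypothetical.

```python
from typing import Dict, Iterable, List


def dedupe_by_article(
    hits: Iterable[Dict[str, object]], key: str = "article_id"
) -> List[Dict[str, object]]:
    """Keep the first (most relevant) hit for each distinct value of `key`."""
    seen = set()
    unique = []
    for hit in hits:
        group = hit[key]
        if group in seen:
            continue  # a better-ranked record from this article was already kept
        seen.add(group)
        unique.append(hit)
    return unique


# Hits assumed to be ordered from most to least relevant.
ranked_hits = [
    {"objectID": "3385602", "article_id": 158, "section_title": "Search as a feedback loop"},
    {"objectID": "5477500", "article_id": 158, "section_title": "What does Algolia do?"},
    # Hypothetical record from another article, for illustration only.
    {"objectID": "9000001", "article_id": 204, "section_title": "Hypothetical section"},
]

top_hits = dedupe_by_article(ranked_hits)
```

Only one result per article survives, and it is the section that matched the query best.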
This means you can show your users the article_title, the section the URL will take them to within that article, and highlight the text in the article_body that matches their query. This approach guides your users to the exact spot in the article they were looking for and informs them exactly where they will end up if they click the result.
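The display described above could be sketched roughly as follows: a result shows the article title, the section the link leads to, the URL, and the body text with matching terms wrapped in a tag. The helpers `highlight` and `format_result` are hypothetical names for this example. Algolia's search response typically carries its own highlighted values, so the regex-based highlighter here is only a stand-in, and it does not HTML-escape the text.

```python
import re
from typing import Dict


def highlight(text: str, query: str, tag: str = "em") -> str:
    """Wrap each case-insensitive occurrence of a query term in <tag>...</tag>."""
    terms = [re.escape(term) for term in query.split() if term]
    if not terms:
        return text
    pattern = re.compile("(" + "|".join(terms) + ")", re.IGNORECASE)
    return pattern.sub(lambda m: f"<{tag}>{m.group(0)}</{tag}>", text)


def format_result(record: Dict[str, str], query: str) -> str:
    """Render a hit as three lines: title and section, target URL, highlighted body."""
    return "\n".join(
        [
            f'{record["article_title"]} > {record["section_title"]}',
            record["url"],
            highlight(record["article_body"], query),
        ]
    )


record = {
    "article_title": "What Is Algolia?",
    "section_title": "What does Algolia do?",
    "url": "https://www.algolia.com/doc/guides/getting-started/what-is-algolia/#what-does-algolia-do",
    "article_body": "Algolia consists of two parts: search implementation and search analytics.",
}

rendered = format_result(record, "analytics")
```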
The example URLs are all real hashed URL paths that will bring you to the exact spot in our "What Is Algolia?" documentation page. If your site already has hashed URL paths, this can be a powerful way to help users navigate your content.
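If section anchors on your site are derived from heading text, the hashed URL for each record could be generated along these lines. The slug rule below (lowercase, drop punctuation, join words with hyphens) is an assumption that happens to reproduce the two example URLs in this guide; your site may generate anchors differently, and the hash only works if a matching anchor actually exists on the page. `slugify` and `hashed_url` are hypothetical helper names.

```python
import re

BASE_URL = "https://www.algolia.com/doc/guides/getting-started/what-is-algolia/"


def slugify(section_title: str) -> str:
    """Turn a heading into an anchor slug: lowercase, no punctuation, hyphen-separated."""
    slug = section_title.strip().lower()
    slug = re.sub(r"[^a-z0-9\s-]", "", slug)  # drop punctuation such as "?"
    slug = re.sub(r"[\s-]+", "-", slug)       # collapse whitespace and hyphens
    return slug.strip("-")


def hashed_url(base_url: str, section_title: str) -> str:
    """Append the section's anchor to the page URL."""
    return f"{base_url}#{slugify(section_title)}"


url_one = hashed_url(BASE_URL, "What does Algolia do?")
url_two = hashed_url(BASE_URL, "Search as a feedback loop")
```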