You should aim to have no more than 1000 words per attribute.
Position information is not returned after the 1000th word. For languages that rely on sequence expressions, such as Japanese, this also means that matches are not returned at all after the 1000th word.
What counts as a word depends on the language:
- In general, we split on separators (such as -)
- In languages that support decompounding, a single compound word may be split into several words
- For Japanese, we use Kuromoji to split a string into "words" of grouped characters
- For some other languages (e.g. Chinese and Korean), we index each character as a word
- In arrays, each array entry gets a position of +8 compared to the last position of the previous entry.
- Stop words do count towards the limit
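To get a rough feel for how close an attribute is to the limit, the rules above can be sketched in a few lines of Python. This is only an approximation and not the engine's actual tokenizer: it splits on whitespace and hyphens, ignores decompounding, Kuromoji, and per-character indexing, and assumes that positions start at 0 and that the limit applies to the position rather than the raw word count. The function names are hypothetical.

```python
import re

WORD_LIMIT = 1000  # word limit per attribute, as stated above
ARRAY_GAP = 8      # position jump between array entries, as stated above

# Assumption: whitespace and "-" are the only separators in this sketch.
_SEPARATORS = re.compile(r"[\s\-]+")


def split_words(text):
    """Split text on the assumed separators, dropping empty pieces."""
    return [word for word in _SEPARATORS.split(text) if word]


def estimate_positions(entries):
    """Return (word, position) pairs for a list of array entries.

    Assumption: the first word gets position 0. The first word of each
    later entry is placed ARRAY_GAP after the last word of the previous one.
    """
    positions = []
    last = None
    for entry in entries:
        words = split_words(entry)
        if not words:
            continue
        start = 0 if last is None else last + ARRAY_GAP
        for offset, word in enumerate(words):
            positions.append((word, start + offset))
        last = start + len(words) - 1
    return positions


def exceeds_limit(entries):
    """True if the estimated last position falls beyond the word limit."""
    positions = estimate_positions(entries)
    return bool(positions) and positions[-1][1] >= WORD_LIMIT
```

For a plain string attribute, pass a one-element list. Because each array entry adds a gap of 8, an array of many short strings can reach the limit with far fewer than 1000 actual words.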
Algolia manages records best when they are broken down into smaller chunks of data. You can read our guide on how to index long documents.
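As an illustration of breaking a long document into smaller records, here is a minimal sketch that groups paragraphs into chunks under a word budget. It is not the method from the guide: the `parent_id` and `content` field names, the `max_words` default, and the paragraph-based splitting are all assumptions made for the example, and words are counted by whitespace only.

```python
def chunk_document(doc_id, text, max_words=200):
    """Group paragraphs into records of at most max_words words each.

    A single paragraph longer than max_words is kept whole in its own
    record; splitting inside paragraphs is left out of this sketch.
    """
    chunks = []
    current = []
    count = 0
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        if not words:
            continue
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and count + len(words) > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(" ".join(words))
        count += len(words)
    if current:
        chunks.append(" ".join(current))

    # Hypothetical record shape: one record per chunk, linked to its source.
    return [
        {"objectID": f"{doc_id}-{index}", "parent_id": doc_id, "content": chunk}
        for index, chunk in enumerate(chunks)
    ]
```

Keeping a shared `parent_id` on every chunk is one way to group the records belonging to the same document at query time.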
You can read more about the limitations of sequence expressions in our docs.