The operators section in the Query Builder provides specialised controls and filters for use in queries.
Word Count Operator
The word count operator filters unstructured text by the number of words Kapiche has detected in a particular verbatim or sentence. It behaves like a numerical field, so the standard set of filters available for numerical data can be used with it.
You can choose one of the preset values or type any positive whole number into the value field.
This operator also respects the verbatim/sentence level setting. For example, if we use Word Count < 3 at the sentence level, we can expect the following results:
The same query at the verbatim level applies the filter to the entire text of each verbatim.
Note that the word count operator acts on words detected by Kapiche, and not every sequence of characters is detected as a word. In the first verbatim above, even though there are 3 distinct character sequences, they are not detected as words, so the filter matches the verbatim.
Email addresses such as firstname.lastname@example.org, or even redacted ones such as ********@gmail.com, are valid tokens that contribute to the word count.
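The behaviour described above can be approximated with a short Python sketch. This is not Kapiche's implementation: the tokenisation rules, the sentence splitter, and the names word_count, split_sentences and filter_units are all assumptions made for illustration. The sketch assumes that an email-like sequence (redacted or not) counts as one word, that any other run containing a letter or digit counts as one word, and that punctuation-only sequences are not words.

```python
import re

# Hypothetical tokenisation rules; Kapiche's actual word detection may differ.
# Assumption: an email-like sequence, including a redacted one, is one word.
EMAIL = r"[\w.*+-]+@[\w-]+(?:\.[\w-]+)+"
# Assumption: a word is a run of letters/digits, optionally joined by ' or -.
WORD = r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*"
TOKEN_RE = re.compile(f"{EMAIL}|{WORD}")


def word_count(text):
    """Count the tokens this sketch treats as words."""
    return len(TOKEN_RE.findall(text))


def split_sentences(verbatim):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", verbatim.strip())
    return [p for p in parts if p]


def filter_units(verbatims, limit, level="verbatim"):
    """Return the text units whose word count is below `limit`.

    level="sentence" tests each sentence on its own;
    level="verbatim" tests the entire text of each verbatim.
    """
    units = []
    for verbatim in verbatims:
        if level == "sentence":
            units.extend(split_sentences(verbatim))
        else:
            units.append(verbatim)
    return [u for u in units if word_count(u) < limit]


if __name__ == "__main__":
    data = ["- - -", "Great service. Very fast delivery and friendly staff."]
    print(filter_units(data, 3, level="sentence"))
    print(filter_units(data, 3, level="verbatim"))
```

Under these assumptions, a verbatim made only of punctuation has a word count of 0 and so matches Word Count < 3 at either level, while a short sentence inside a longer verbatim matches only at the sentence level.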