Previously, when conversational data was ingested into Kapiche with a separate record for each utterance (a conversation typically consists of many utterances), the dashboard could only present results on a per-utterance basis. Often, though, conversation-level statistics give the clearest picture of what customers are talking about. That is why we created the 'Group By' feature on Kapiche Dashboards, which lets you obtain conversation-level statistics from utterance-level data.
How does it work?
Let's imagine we have a dataset made up of Support conversations for a Retail brand. Since a conversation typically consists of multiple back-and-forths between customer and agent, the dataset is structured such that an individual turn (utterance) represents one record (e.g. a row in a CSV or Excel file). Storing the data this way gives us a fine-grained representation with the maximum amount of information retained, such as time and speaker for each utterance.
However, if we analyse the data as is, we'll find that longer conversations, which contain more utterances than average, skew the statistical picture. Put simply, customers who write or speak more have a larger impact on the statistics than others.
What we really want is to maintain maximum data integrity by ingesting the data at the utterance level, but group the utterances up into conversations when analysing data in the Dashboard in order to give us a more balanced picture of what customers are talking about.
This is exactly what the Group By function is intended for. You can select a Categorical Field in the data to group records by. If we chose Conversation ID for example, it would mean that the statistics are calculated such that each unique conversation only counts once, regardless of the number of utterances it contains.
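To make the counting rule concrete, here is a minimal Python sketch on a made-up six-record dataset. The field names (`conversation_id`, `text`) and the naive substring match are assumptions for illustration only; this is not a description of how Kapiche matches themes or implements the Group By internally.

```python
# Toy utterance-level records; field names and values are invented for illustration.
records = [
    {"conversation_id": "A", "text": "The price is too high"},
    {"conversation_id": "A", "text": "Can you match the price elsewhere?"},
    {"conversation_id": "A", "text": "That price still seems steep"},
    {"conversation_id": "A", "text": "Is the price going to drop soon?"},
    {"conversation_id": "B", "text": "My parcel has not arrived"},
    {"conversation_id": "C", "text": "I would like a refund"},
]


def mentions(text, term):
    """Naive case-insensitive substring match (a stand-in for real theme matching)."""
    return term.lower() in text.lower()


def record_level_share(records, term):
    """Share of individual records that mention the term."""
    hits = sum(1 for r in records if mentions(r["text"], term))
    return hits / len(records)


def group_level_share(records, term, field):
    """Share of groups in which at least one record mentions the term."""
    groups = {}
    for r in records:
        key = r[field]
        # A group counts once, no matter how many of its records match.
        groups[key] = groups.get(key, False) or mentions(r["text"], term)
    return sum(groups.values()) / len(groups)


per_record = record_level_share(records, "price")
per_group = group_level_share(records, "price", "conversation_id")
print(f"Per record: {per_record:.1%}")
print(f"Per conversation: {per_group:.1%}")
```

In this toy data, one talkative conversation accounts for four of the six records, so the per-record share of "price" is much higher than the per-conversation share, where that conversation counts only once.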
To illustrate, let's look at the before-and-after of the top-level statistics on the Dashboard.
Before using the Group By, there are 9,266 records:
After selecting Group By on the Conversation ID field, we now show 1,839 groups instead of records:
This tells us that there are approximately 5 utterances (records) per conversation (group). Before applying the Group By, if a single conversation mentioned "price" in 20 different utterances, it would contribute 20/9,266 ≈ 0.2% to the overall frequency of "price", whereas after the Group By, that conversation would contribute only 1/1,839 ≈ 0.05%. More importantly, the first figure depends on how talkative the conversation is: a conversation that mentioned "price" in only 5 utterances would contribute 5/9,266 instead. The second figure is the same 1/1,839 for every conversation that mentions "price", which is how each conversation comes to count equally in the statistical picture.
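The arithmetic can be checked with a few lines of Python. The record and group counts come from the example above; the variable names are purely illustrative.

```python
# Figures taken from the dashboard example in this article.
total_records = 9266   # utterances before Group By
total_groups = 1839    # conversations after Group By on Conversation ID

avg_utterances = total_records / total_groups   # about 5.04
share_ungrouped = 20 / total_records            # one conversation with 20 "price" utterances
share_grouped = 1 / total_groups                # the same conversation, counted once

print(f"Utterances per conversation: {avg_utterances:.2f}")
print(f"Contribution before Group By: {share_ungrouped:.2%}")
print(f"Contribution after Group By: {share_grouped:.2%}")
```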
Note: Sentiment Calculation
Kapiche calculates sentiment at the verbatim level. When grouping multiple utterances (verbatims) together, Kapiche needs to aggregate the individual sentiment values to produce a single label. This is currently done by selecting the most frequent sentiment label across the utterances in a group. For example, if a group contains 3 neutral, 2 positive, and 1 negative verbatims, the group will count as neutral.
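The "most frequent label" rule can be sketched in Python as follows. The function name `group_sentiment` is hypothetical, and the tie-breaking behaviour (the label seen first wins) is an assumption of this sketch; the article does not say how Kapiche resolves ties.

```python
from collections import Counter


def group_sentiment(labels):
    """Return the most frequent label among a group's verbatim-level sentiment labels.

    Assumption: on a tie, the label encountered first wins, which is how
    Counter.most_common orders equal counts. Kapiche's actual tie-breaking
    is not described in this article.
    """
    if not labels:
        raise ValueError("a group must contain at least one label")
    return Counter(labels).most_common(1)[0][0]


# The example from the text: 3 neutral, 2 positive, 1 negative.
labels = ["neutral", "positive", "neutral", "negative", "positive", "neutral"]
result = group_sentiment(labels)
print(result)
```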
The 'Group By' feature is not limited to conversations. You can group results by any Categorical Field defined in the project schema, which lets you view the data from different angles. Whether you need to analyse results by customer segment, product category, or any other categorical attribute, Kapiche Dashboards let you do so with a single selection.
At Kapiche, our relentless pursuit of providing intuitive and powerful analytics solutions drives our innovation. With this feature, we continue to redefine the boundaries of what's possible, ensuring that you have access to the most comprehensive statistical data whenever you need it.
If you have any questions about the Group By feature (or if you just need some help!) you can get in touch with us any time by hitting the blue chat button to your right 👉