Analysis

An analysis is the result of running Project Data through the Kapiche Analytics engine with specified settings, such as the date range and the text fields to analyze. An analysis always uses all available data within the Project it is run in.

Correlation

A correlation is the likelihood that a query is related to a concept or segment. In other words, it indicates how likely or unlikely a concept or segment is to appear in the query.

Data Coverage

Data Coverage refers to the percentage of total records that an element (query, concept, term, segment, etc.) appears in. A concept, term, or phrase is counted only once per text excerpt even if it appears multiple times within the excerpt.
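
As a rough illustration of the counting rule above, the following Python sketch computes a frequency and a coverage percentage for a single term. It is not Kapiche's implementation: the function name `data_coverage`, the sample excerpts, and the simple whitespace tokenization are all assumptions made for the example, and it assumes one text excerpt per record.

```python
def data_coverage(excerpts, term):
    """Return (frequency, coverage_percent) for a term.

    Hypothetical helper for illustration only; assumes one excerpt per record.
    """
    term = term.lower()
    # Each excerpt contributes at most 1, however often the term repeats in it.
    frequency = sum(1 for text in excerpts if term in text.lower().split())
    coverage = 100.0 * frequency / len(excerpts) if excerpts else 0.0
    return frequency, coverage


sample_excerpts = [
    "The app is slow and the app crashes",  # "app" appears twice, counted once
    "Great support team",
    "Slow loading screen",
    "Love the app",
]

frequency, coverage = data_coverage(sample_excerpts, "app")
print(frequency, coverage)
```

Here "app" appears in two of the four excerpts, so the sketch reports a frequency of 2 and a coverage of 50%, even though the word occurs three times in total.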

Data Unit

Data Units are the unit of usage within Kapiche. 1 row of data in a spreadsheet equates to 1 Data Unit.

Dashboard

A dashboard is a screen with information populated by the Saved Queries a user has selected.

Concept

A Concept is a meaningful term which occurs at a relatively substantial frequency in the dataset.

Field

A field, or variable, is the label given to a column header within a data file. Examples of fields include: Age, Date, Gender, Satisfaction_Score. Within a field can be multiple segments, or values of a field.

Frequency

Frequency is the number of times something has occurred. A concept, term, or phrase is counted only once per text excerpt even if it appears multiple times within the excerpt.

Influence

Influence is a measurement of how much the observed co-occurrence between two concepts or segments exceeds what would be expected.
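
The exact formula Kapiche uses for Influence is not given here. As a sketch of the general idea only, the Python example below compares the observed rate at which two items co-occur with the rate that would be expected if they were independent. The function name `cooccurrence_excess`, the independence baseline, and the sample records are assumptions chosen for illustration.

```python
def cooccurrence_excess(records, a, b):
    """Observed co-occurrence rate minus the rate expected under independence.

    Illustrative stand-in only; this is NOT Kapiche's Influence formula.
    Each record is a set of the concepts/segments present in it.
    """
    n = len(records)
    if n == 0:
        return 0.0
    p_a = sum(1 for r in records if a in r) / n
    p_b = sum(1 for r in records if b in r) / n
    observed = sum(1 for r in records if a in r and b in r) / n
    expected = p_a * p_b  # assumption: independence as the baseline
    return observed - expected


sample_records = [
    {"price", "cancel"},
    {"price", "cancel"},
    {"price"},
    {"support"},
]

excess = cooccurrence_excess(sample_records, "price", "cancel")
print(excess)
```

A positive value means the two items appear together more often than the independence baseline predicts, and a negative value means they appear together less often.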

N-Gram

An n-gram, or phrase, is a string of two or more terms. N-grams that appear in an analysis are the phrases that appear multiple times in a dataset. Like everything the Kapiche analytics engine does, n-gram detection is based on the data and is not predetermined. Examples of n-grams: "loading screen", "queue time", "mobile app".
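
To make the idea of data-driven phrases concrete, here is a minimal Python sketch that finds two-term n-grams (bigrams) occurring in more than one excerpt. How the Kapiche engine actually detects n-grams is not described here; the function name `repeated_bigrams`, the `min_count` threshold, and the whitespace tokenization are assumptions.

```python
from collections import Counter


def repeated_bigrams(excerpts, min_count=2):
    """Return bigrams found in at least `min_count` excerpts.

    Hypothetical helper for illustration; counts each bigram once per excerpt.
    """
    counts = Counter()
    for text in excerpts:
        tokens = text.lower().split()
        # set() ensures a bigram repeated inside one excerpt is counted once.
        counts.update(set(zip(tokens, tokens[1:])))
    return {" ".join(pair): c for pair, c in counts.items() if c >= min_count}


sample_texts = [
    "the loading screen froze",
    "stuck on loading screen",
    "queue time is long",
]

print(repeated_bigrams(sample_texts))
```

In this sample only "loading screen" is shared by two excerpts, so it is the only phrase returned. Nothing in the code names that phrase in advance, which is the sense in which n-grams come from the data.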

Normalization

Normalizing data eliminates the units of measurement in order to make variables more comparable to each other. When the Normalize option is active on certain visualizations, the data is graphed in terms of its relative frequency at each data point. Since each data point may have a different total number of records, this allows you to view the underlying frequency trends regardless of the amount of data collected at each point.
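
The effect of normalization can be sketched in Python as follows. This is an illustration under assumed names and numbers (`normalize`, the monthly counts, and the choice of a percentage scale are all hypothetical), not a description of how Kapiche renders its charts.

```python
def normalize(counts_by_point, totals_by_point):
    """Convert raw counts to a percentage of the records at each data point.

    Hypothetical helper; points with no records are skipped.
    """
    return {
        point: 100.0 * count / totals_by_point[point]
        for point, count in counts_by_point.items()
        if totals_by_point.get(point)
    }


# Assumed example: mentions of a concept per month, and total records per month.
mentions = {"Jan": 20, "Feb": 30}
totals = {"Jan": 100, "Feb": 300}

print(normalize(mentions, totals))
```

In raw terms the concept grows from 20 to 30 mentions, but February has three times as many records, so its relative frequency drops from 20% to 10%. The normalized view exposes that trend.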

Project

A project is a container for related data and its respective analyses. Each Project has its own user list, known as Project Members.

Project Member

A user of the Site with access to a specific Project.

Project Data

Data that has been added to a Project by uploading files or through an integration. Project data must follow the same column structure, as analyses being run always use all Project data. Mixing unrelated data files in a Project will result in analyses being uninterpretable.

Raw Freq.

Short for "Raw Frequency"; the raw occurrence values are shown rather than a percentage.

Record/Row

A record, or row, is one entry in a data source.

Segment

A segment is a value of a field. For example, "Male" would be a segment of the field "Gender".
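
The relationship between records, fields, and segments can be pictured with a small Python sketch. The record layout (one dictionary per row, keyed by field name) and the function name `segment_counts` are assumptions for illustration only.

```python
from collections import Counter


def segment_counts(records, field):
    """Count how many records fall into each segment (value) of a field."""
    return Counter(row[field] for row in records if field in row)


# Assumed sample data: each dictionary is one record; keys are fields.
sample_rows = [
    {"Gender": "Male", "Age": 34},
    {"Gender": "Female", "Age": 29},
    {"Gender": "Male", "Age": 51},
]

print(segment_counts(sample_rows, "Gender"))
```

Here "Gender" is the field, while "Male" and "Female" are its segments, containing two records and one record respectively.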

Sentiment

Sentiment is an indication of the emotion within the unstructured data. See the article on Sentiment for more information.

Stopword

Stopwords are words that are filtered out from becoming Terms in an analysis. These are words that are usually meaningless, and would add no value to the language model. Examples of stopwords include: "a", "the", "I".
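
Stopword filtering can be sketched in a few lines of Python. The stopword list below contains only the three examples given above, and the function name `candidate_terms` and the whitespace tokenization are assumptions. The actual list and processing used by Kapiche are not described here.

```python
# Only the three example stopwords from this glossary; a real list is longer.
STOPWORDS = {"a", "the", "i"}


def candidate_terms(text, stopwords=STOPWORDS):
    """Return the lowercased words of a text excerpt that are not stopwords."""
    return [word for word in text.lower().split() if word not in stopwords]


print(candidate_terms("I love the mobile app"))
```

For the sample sentence, "I" and "the" are dropped and "love", "mobile", and "app" remain as candidates to become Terms.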

Structured Data

Structured data refers to any data that sits within a fixed field in a data file. Examples of structured data include: Gender, Date, and Location. Any text that is selected from a set of options, rather than freely written, falls under this category of data.

Text Excerpt

A text excerpt is the data from a text field in one record of a data source. A record may include multiple text excerpts if the file has more than one text field. In a customer survey context you can think of a text excerpt as the written response of a customer to a question in the survey.

Unstructured Data

Unstructured data is data that does not have a predefined model, meaning its values are not chosen from a fixed selection. In the context of Kapiche, unstructured data is free-text data.

User

A user is any account with some form of access to your Kapiche Site. This is different to a Project Member, which is specific to a Project. All Project Members are users on a Site, but not all users are members of every Project.

 
