Creating a Project & Uploading Data

Now that you've formatted your data, it's time to load it into Kapiche.

Written by Josh Winters
Updated over a week ago

Getting Started

On the home screen of your Site (click the Kapiche logo in the top left to get to your Site home), click the "New Project" button to create a new Project.

Creating a Project starts a simple 4-step workflow: importing data, setting data types, naming the Project, and confirming Data Unit usage.


The first step is to choose what type of data your Project will use: data from files you upload or data from an integration. This article covers uploaded data. For integrations, please see the dedicated integrations help article.

Click "Upload data from .csv or xlsx file" and choose a file. Your file will be uploaded and compared to a best practice checklist:

  • Field labels indicated by column headers

  • Column headers not empty

  • Sufficient text data (over 2,000 unique words detected or at least 250 rows). Kapiche automatically counts the number of different words in your file to estimate whether there is sufficient text data to form a meaningful language model.

These checks are just a guideline; you can proceed whether or not your file adheres to them.
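If you want a rough idea of how your file will fare before uploading, the checks above can be approximated locally. The following is a minimal sketch, not Kapiche's actual implementation: the function name `precheck_csv` is hypothetical, and the definition of a "word" (a lowercase run of letters and apostrophes) is an assumption, so counts may differ from what Kapiche reports. It does not attempt the "field labels indicated by column headers" check.

```python
import csv
import io
import re


def precheck_csv(text, min_unique_words=2000, min_rows=250):
    """Approximate two of the checklist items on CSV text (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[0] if rows else []
    records = rows[1:]

    # Check: every column header contains some non-whitespace text.
    headers_not_empty = bool(header) and all(cell.strip() for cell in header)

    # Assumption: a "word" is a lowercase run of letters/apostrophes.
    unique_words = set()
    for record in records:
        for cell in record:
            unique_words.update(re.findall(r"[a-z']+", cell.lower()))

    # Check: over 2,000 unique words OR at least 250 rows.
    sufficient_text = (
        len(unique_words) > min_unique_words or len(records) >= min_rows
    )

    return {
        "headers_not_empty": headers_not_empty,
        "row_count": len(records),
        "unique_word_count": len(unique_words),
        "sufficient_text": sufficient_text,
    }
```

For a file on disk, you could pass in the result of `open(path, newline="", encoding="utf-8").read()`.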

Your file will also be checked for the number of rows/records and for whether your Site has enough Data Units to complete the Project.


Every column in your data will be mapped to a 'data type'. These are used by Kapiche to process the data in each column appropriately.

Kapiche automatically detects which data types might be best suited to each column, but it is best practice to quickly check if they make sense.

List of data types and definitions:

  • Text/Verbatim: Processed as text data to be used in the language model. Also used to display text excerpts/verbatims. Map your free-text/unstructured data to this data type.

  • Numerical: Processed as a numerical value. Numerical values can use mathematical functions in a query, such as "greater than" and "less than". Map any numerical structured data to this data type.

  • Date & Date Time: Processed as date formats to use as filters and segments. Also used to map data in timeline and trendline charts. Map any date fields in your data to this data type.

  • Category: Processed as a value to be used as a segment. Map any structured data that doesn't fall under the other categories to this type.

  • NPS (0-10): Processed as a special value which is used to generate three categories (9-10 are labelled as Promoters, 7-8 are labelled as Passives, 0-6 are labelled as Detractors). Also used to calculate NPS and show NPS statistics. Map your respondents' recommendation score to this data type.

  • Ignore: Values in this column will be ignored and not included in this Project. Map anything you don't want to be included to this data type.
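To make the NPS (0-10) data type concrete, here is a small sketch of the three categories described above, together with the standard NPS formula (percentage of Promoters minus percentage of Detractors). The function names are hypothetical, and the formula is the widely used definition rather than something this article confirms as Kapiche's exact calculation.

```python
def nps_category(score):
    """Map a 0-10 recommendation score to its NPS category."""
    if not 0 <= score <= 10:
        raise ValueError(f"NPS score must be between 0 and 10, got {score}")
    if score >= 9:
        return "Promoter"
    if score >= 7:
        return "Passive"
    return "Detractor"


def nps(scores):
    """Standard NPS: % Promoters minus % Detractors, from -100 to 100."""
    categories = [nps_category(s) for s in scores]
    promoters = categories.count("Promoter")
    detractors = categories.count("Detractor")
    return 100 * (promoters - detractors) / len(categories)
```

For example, the scores 10, 9, 8, and 6 contain two Promoters, one Passive, and one Detractor, which gives an NPS of 25.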

Once all columns are set to the correct data type, click Proceed.


The final step before confirmation is where you'll name the Project and configure its settings.

PII Redaction

When this is enabled, Kapiche will automatically detect personally identifiable information (PII) in your data, such as full names, phone numbers, and addresses, and replace it with a generic placeholder so that we are not storing sensitive information.

Note: While we make our best efforts to ensure maximum efficacy, no solution in this space is 100% accurate.

Enable Sentiment Identification

When this is enabled, Kapiche will automatically detect the sentiment associated with your text data and make it available for analysis in the product. This feature is currently only supported for English-language data.

Start Day of Week On

By default, Kapiche views a reporting week as Sunday to Monday. This option allows you to change which day a week should begin on for the purposes of filtering and display.
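To illustrate what changing the start day means for filtering and display, the sketch below assigns a date to the first day of its reporting week for a chosen start day. This is an illustration only, not Kapiche's code: the function name `week_start` is hypothetical, and it uses Python's weekday numbering (Monday is 0, Sunday is 6).

```python
from datetime import date, timedelta


def week_start(day, start_weekday):
    """Return the first day of the reporting week that contains `day`.

    `start_weekday` follows Python's convention: Monday=0 ... Sunday=6.
    """
    # Number of days to step back to reach the most recent start day.
    offset = (day.weekday() - start_weekday) % 7
    return day - timedelta(days=offset)
```

For example, Wednesday 10 January 2024 belongs to the week beginning 7 January when weeks start on Sunday, and to the week beginning 8 January when weeks start on Monday, so the same record can land in different weekly buckets depending on this setting.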

Skip Bad or Missing Dates

This feature allows date fields with some invalid values to be successfully loaded and analysed. This is not enabled by default because often we find that failing on invalid dates provides us an opportunity to check if they can be remediated at the source. If not, then we can enable this setting in order to make the best use of the data we have.
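The trade-off between failing on invalid dates and skipping them can be sketched as follows. This is a hypothetical illustration: the function name, the `skip_bad` flag, and the `YYYY-MM-DD` format are all assumptions, and the article does not say how Kapiche represents a skipped date internally (here it becomes `None`).

```python
from datetime import datetime


def parse_dates(values, skip_bad=False, fmt="%Y-%m-%d"):
    """Parse date strings; fail on bad values unless skip_bad is True."""
    parsed = []
    for value in values:
        try:
            parsed.append(datetime.strptime(value.strip(), fmt).date())
        except ValueError:
            if not skip_bad:
                # Strict mode: surface the problem so it can be fixed at the source.
                raise
            # Lenient mode: keep the record but leave its date empty.
            parsed.append(None)
    return parsed
```

Running the strict mode first on your own file is one way to find bad or missing dates and decide whether they can be fixed at the source before enabling the setting.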


The final step is to confirm Data Unit usage. Once you click "Create Project", the Data Units will be subtracted from your Site's balance and the Project will begin processing. See our article on Data Units for more information.

Now that your Project is created, you can either add more data to the Project or run an analysis.
