Today we're going to walk you through how Kapiche cleans any data that is mapped to the NUMERICAL and NPS field types during the Schema Mapping stage.

Why Kapiche needs to clean data

We've noticed that a number of customers have run into issues with the way their survey providers export certain numerical responses, such as NPS and CSAT, where a record contains both a number and text. For example:

  • Disagree0

  • 10Agree

  • 0 - Extremely Disagree

  • 10 Excellent

  • Completely Satisfied10

When this happens, Kapiche doesn't know whether to process these records as numerical data or text data. This results in a processing error, which ultimately causes Kapiche to ignore these records during the upload stage.

To help solve this problem, Kapiche automatically detects and cleans any data that is set to the NUMERICAL or NPS field type during the Schema Mapping stage.

This means that if Kapiche detects that a record mapped to the NUMERICAL or NPS field type contains text (like the examples in this guide), we will automatically remove the text and process the data as intended!

How does the cleaning of the data work?

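We clean a numerical field by removing every non-digit character from the beginning and from the end of the record. This includes all punctuation and symbols (e.g. $ % . -) as well as spaces. Anything between the first and the last digit is left untouched, so a decimal such as 15.3 is preserved.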

To accommodate negative numbers, a hyphen (i.e. a minus sign) is kept if:

  • The hyphen is immediately before the first digit found

    • and it is either the very first character in the pre-cleaned data or has a space in front of it
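
The rules above can be sketched in a few lines of Python. This is only an illustration, not Kapiche's actual implementation: the function name `clean_numeric` is hypothetical, and the sketch assumes a cleaned value is accepted only when what remains is a single whole or decimal number.

```python
import re
from typing import Optional


# Hypothetical helper: illustrates the documented rules, not Kapiche's real code.
def clean_numeric(raw: str) -> Optional[str]:
    # Greedy match from the first digit to the last digit. This drops every
    # leading and trailing non-digit character (text, symbols, spaces).
    match = re.search(r"[0-9](?:.*[0-9])?", raw, re.DOTALL)
    if match is None:
        return None  # no digits at all, e.g. "elephant"

    core = match.group()
    start = match.start()

    # Assumption: what remains must be a single whole or decimal number.
    if not re.fullmatch(r"[0-9]+(?:\.[0-9]+)?", core):
        return None  # e.g. "5 cats and 9" or "1-12"

    # Keep a minus sign only if a hyphen sits immediately before the first
    # digit AND is either the very first character or preceded by a space.
    has_hyphen = start >= 1 and raw[start - 1] == "-"
    hyphen_is_free = start == 1 or (start >= 2 and raw[start - 2] == " ")
    return "-" + core if has_hyphen and hyphen_is_free else core
```

With this sketch, `clean_numeric("unlikely -10 or less")` returns `"-10"`, `clean_numeric("Age-45")` returns `"45"`, and `clean_numeric("young: 1 - 12 years-old")` returns `None` because more than one number is left after trimming.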

Types of data Kapiche cleans

Here are some more specific examples of the types of data Kapiche will automatically clean for you (provided they're set to the NPS or NUMERICAL field type during the Schema Mapping stage):

| Data Example | Output | Cleanable? | Reason |
| --- | --- | --- | --- |
| Disagree0 | 0 | ✅ | Removed the text before the 0 |
| 10Agree | 10 | ✅ | Removed the text after the 10 |
| 45 | 45 | ✅ | No cleaning needed |
| 15.3 | 15.3 | ✅ | No cleaning needed (Kapiche handles decimals) |
| elephant | | 🔴 | Not cleaned - no digits detected |
| there are 5 cats and 9 dogs | 5 cats and 9 | 🔴 | Not cleaned - multiple numbers detected |
| There are 4 lights! | 4 | ✅ | Removed the text before and after the 4 |
| -45 | -45 | ✅ | Negative number detected, as the hyphen was the very first character and was attached directly to the number |
| Age-45 | 45 | ✅ | The hyphen was treated as part of the text 'Age' because there was no space between the text and the hyphen |
| 45 years-old | 45 | ✅ | The hyphen was treated as part of the text 'years-old', which was removed because it appeared after the last digit |
| young: 1 - 12 years-old | 1 - 12 | 🔴 | This data would be ignored (failed) as multiple numbers were found |
| young: 1-12 years-old | 1-12 | 🔴 | This data would be ignored (failed) as multiple numbers were found |
| young - 10 or less | 10 | ✅ | Removed the text and symbols before and after the number; the hyphen was not attached to the number |
| unlikely -10 or less | -10 | ✅ | Negative number detected, as the hyphen had a space in front of it and was attached directly to the number; everything else was removed |
| unlikely-1 or less | 1 | ✅ | The hyphen was treated as part of the text 'unlikely' because there was no space between the text and the hyphen |


Questions? 🤔

If you have any questions about Numerical & NPS Field Cleaning (or need some help!) you can get in touch with us any time by hitting the blue chat button to your right 👉
