Uploading Call Recordings for Transcription

How to manually upload calls into Kapiche

Written by Cameron Parry
Updated this week

Kapiche can automatically transcribe audio recordings of calls, converting spoken conversations into analyzable text data. This allows you to analyze call recordings alongside your other customer feedback sources within a single project.

Supported Audio Formats

Before uploading, ensure your audio file meets the following requirements:

File formats: MP4, M4V, FLAC, MP3, WAV

Maximum file size: 550 MB per file

Files larger than 550 MB cannot be processed. Split them into smaller segments, or re-encode them at a lower bitrate, before uploading.
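If you prepare many recordings at once, a quick local check can catch problems before you upload. The sketch below is not part of Kapiche. The function names are hypothetical, and it assumes 1 MB means 1,000,000 bytes, which may differ from how Kapiche measures the limit.

```python
from pathlib import Path

# Values taken from the requirements listed above.
SUPPORTED_EXTENSIONS = {".mp4", ".m4v", ".flac", ".mp3", ".wav"}
MAX_SIZE_MB = 550
# Assumption: 1 MB = 1,000,000 bytes. Kapiche may count 1,048,576 bytes per MB.
BYTES_PER_MB = 1_000_000


def check_audio(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    suffix = Path(name).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    size_mb = size_bytes / BYTES_PER_MB
    if size_mb > MAX_SIZE_MB:
        problems.append(f"too large: {size_mb:.1f} MB (limit {MAX_SIZE_MB} MB)")
    return problems


def check_file(path: str) -> list[str]:
    """Convenience wrapper that reads the size of a file on disk."""
    p = Path(path)
    return check_audio(p.name, p.stat().st_size)
```

Calling `check_file("customer_call.mp3")` on a local recording returns an empty list when both checks pass.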

Uploading Audio Files

Audio files are uploaded using the same process as standard data files. The system automatically detects audio formats and routes them for transcription processing.

Step 1: Navigate to Your Project

Open the project where you want to add the transcribed call data. If you haven't created a project yet, see Creating a Project & Uploading Data.

Step 2: Add Your Audio File

Click Add Data and select your audio file from your computer. The system will validate the file format and size before uploading.

Step 3: Upload and Processing

Once you select a valid audio file:

- The file uploads to secure storage

- You'll receive a confirmation message: "Audio file uploaded. Transcription is processing in the background."

- The file status displays as "Transcribing"

- You can navigate away from the page while transcription continues

Step 4: Transcription Completion

Transcription processing time varies based on the audio file length. When complete:

- The file status changes from "Transcribing" to "Finished"

- The transcribed text becomes available in your project

- You can analyze the transcribed content using all standard Kapiche analysis features

Common Issues

File Size Limit Exceeded

Error message: "File exceeds size limits. File: [size] MB, Limit: 550 MB"

Resolution: Use audio editing software to split the recording into segments under 550 MB, or re-encode the audio at a lower bitrate to reduce file size.
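To plan either fix, it helps to work out the numbers first. The sketch below is plain arithmetic, not a Kapiche tool. The function names are hypothetical, it assumes 1 MB means 1,000,000 bytes, and it ignores container overhead, so leave some headroom.

```python
import math

LIMIT_MB = 550
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def segments_needed(size_mb: float, limit_mb: float = LIMIT_MB) -> int:
    """Fewest equal-sized segments so that each one is at or under the limit."""
    return max(1, math.ceil(size_mb / limit_mb))


def max_bitrate_kbps(duration_seconds: float, limit_mb: float = LIMIT_MB) -> float:
    """Highest average bitrate (kbit/s) that keeps a recording under the limit."""
    limit_bits = limit_mb * BYTES_PER_MB * 8
    return limit_bits / duration_seconds / 1000
```

For example, a 1,200 MB recording needs `segments_needed(1200)`, which is 3 segments of about 400 MB each. A 10-hour recording (36,000 seconds) fits under the limit at an average bitrate of roughly 122 kbit/s or lower.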

Unsupported File Format

Error message: "File format not supported"

Resolution: Convert your audio file to one of the supported formats (MP4, M4V, FLAC, MP3, or WAV) using audio conversion software.

Extended Processing Time

Transcription processing time scales with audio length. As a general guideline, a 60-minute call typically processes within several minutes. If your file remains in "Transcribing" status significantly longer than expected, contact support.

Understanding Your Transcribed Data

Output Format

When transcription completes, Kapiche generates a CSV file containing your transcribed text and metadata. The CSV keeps the original audio filename, with the file extension replaced by `_transcript.csv`.

Example: `customer_call.mp3` becomes `customer_call_transcript.csv`
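If you need to match transcripts back to recordings in your own scripts, the naming rule can be reproduced in one line. This is a sketch based only on the example above. The function name is hypothetical, and how Kapiche handles filenames with unusual extensions is not covered here.

```python
from pathlib import Path


def transcript_name(audio_filename: str) -> str:
    """Drop the audio extension and add the `_transcript.csv` suffix."""
    return f"{Path(audio_filename).stem}_transcript.csv"
```

With this helper, `transcript_name("customer_call.mp3")` gives `customer_call_transcript.csv`.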

Standard Output Fields

All transcribed files include these fields:

call_transcript: The complete transcribed conversation with speaker labels. Each speaker's dialogue is formatted as separate lines:

Agent: How can I help you today?
Customer: I'm calling about my recent order.
Agent: I'd be happy to help with that.

filename: Your original audio filename, for traceability

format: The detected filename format (see Filename Conventions for Enhanced Metadata below)

transcription_id: Unique identifier for the transcription

date: Recording date extracted from your filename, or the upload date if no date pattern is detected
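If you have a copy of a transcript CSV on hand, the standard fields can be read with Python's built-in `csv` module. The sketch below assumes the column names match the field names listed above. The function name is hypothetical, and the sample row is made up for illustration (including its `format` and `transcription_id` values).

```python
import csv
import io

# Field names taken from the list above.
EXPECTED_FIELDS = {"call_transcript", "filename", "format", "transcription_id", "date"}


def load_transcripts(csv_file) -> list[dict]:
    """Read rows from an open CSV file and confirm the standard fields exist."""
    reader = csv.DictReader(csv_file)
    missing = EXPECTED_FIELDS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


# Made-up sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "call_transcript,filename,format,transcription_id,date\n"
    '"Agent: How can I help you today?\nCustomer: I\'m calling about my recent order.",'
    "customer_call.mp3,example-format,example-id,2024-01-15\n"
)
rows = load_transcripts(sample)
print(rows[0]["filename"])
```

Because `call_transcript` spans several lines, it sits inside a quoted CSV field. `csv.DictReader` handles that without extra configuration.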

Speaker Identification

Kapiche automatically identifies and labels speakers in your calls:

- Agent: Customer service or sales representatives, labeled Agent 1 and Agent 2 if the call is transferred

- Customer: The caller

- Interactive Voice Response: Automated IVR systems

- Bot: Voice AI agents, if any are used in your calls

This speaker identification allows you to analyze agent responses separately from customer feedback.
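Outside Kapiche, the same separation can be approximated on the raw `call_transcript` text. The sketch below assumes each turn starts with one of the labels listed above followed by a colon, exactly as written here. The function names are hypothetical. Lines without a recognized label are treated as a continuation of the previous turn.

```python
import re

# Assumption: labels appear in the transcript exactly as listed above.
SPEAKER_LINE = re.compile(
    r"^(Agent(?: \d+)?|Customer|Interactive Voice Response|Bot):\s*(.*)$"
)


def split_turns(call_transcript: str) -> list[tuple[str, str]]:
    """Turn the transcript text into (speaker, text) pairs, in order."""
    turns = []
    for raw_line in call_transcript.splitlines():
        line = raw_line.strip()
        match = SPEAKER_LINE.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif turns and line:
            # No recognized label: append to the previous speaker's turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line}")
    return turns


def by_speaker(turns: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group the text of each turn under its speaker label."""
    grouped: dict[str, list[str]] = {}
    for speaker, text in turns:
        grouped.setdefault(speaker, []).append(text)
    return grouped
```

Running `by_speaker(split_turns(text))` on the three-line example transcript yields two entries under `Agent` and one under `Customer`.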

Filename Conventions for Enhanced Metadata

Using standardized filename formats enables Kapiche to extract additional metadata from your call recordings.

Five9 Format

If your files follow Five9 naming conventions, Kapiche automatically extracts additional fields:

Format: `phone_number by agent@email.com @ H_MM_SS AM/PM.wav`

Example: `8284931690 by roger.smith@example.com @ 2_07_35 PM.wav`

Extracted fields:

- phone_number

- agent_email

- start_time

- date

CXone Format

For CXone (NICE inContact) recordings:

Format: `CXone recording_Agent Name_YYYY-MM-DD_HH-MM[UTC]_uuid.mp4`

Example: `CXone recording_Leeann Pomeroy-Jones_2025-04-09_07-22[UTC]_abc123.mp4`

Extracted fields:

- agent_name (full name)

- agent_display (first name + last initial)

- date

- datetime

Generic Date Extraction

If your files don't match Five9 or CXone formats, Kapiche attempts to extract dates using these patterns:

Unix timestamp: `1609459200000.wav` (milliseconds since epoch)

ISO date format: `recording_2023-12-25.mp4`

Underscore format: `audio_2023_12_25.wav`

Compact format: `call_20231225.wav`

If no date pattern is detected, Kapiche uses the upload date.
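To preview which date a filename is likely to produce, the four patterns and the fallback can be sketched as follows. This is an illustration, not Kapiche's implementation. The function name, the order in which patterns are tried, the 13-digit check for millisecond timestamps, and the use of UTC for timestamps are all assumptions.

```python
import re
from datetime import date, datetime, timezone
from pathlib import Path

# Assumed order: ISO, underscore, then compact (8 digits not touching other digits).
DATE_PATTERNS = (
    r"(\d{4})-(\d{2})-(\d{2})",
    r"(\d{4})_(\d{2})_(\d{2})",
    r"(?<!\d)(\d{4})(\d{2})(\d{2})(?!\d)",
)


def extract_date(filename: str, upload_date: date) -> date:
    """Return the date found in the filename, or upload_date if none is found."""
    stem = Path(filename).stem
    # Assumption: a name made of exactly 13 digits is milliseconds since epoch (UTC).
    if re.fullmatch(r"\d{13}", stem):
        return datetime.fromtimestamp(int(stem) / 1000, tz=timezone.utc).date()
    for pattern in DATE_PATTERNS:
        match = re.search(pattern, stem)
        if match:
            try:
                return date(*map(int, match.groups()))
            except ValueError:
                continue  # digits matched but do not form a real calendar date
    return upload_date
```

Under these assumptions, `1609459200000.wav` resolves to 2021-01-01, the other three examples resolve to 2023-12-25, and a name such as `customer_call.mp3` falls back to the upload date you pass in.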

Best Practices

Audio quality: Clear audio with minimal background noise produces more accurate transcriptions. We work to maximize accuracy, but results ultimately depend on recording quality.

File organization: Use descriptive file names to identify calls easily within your project (e.g., "Customer-Support-Call-2024-01-15.mp3").

Multiple recordings: You can upload multiple audio files to the same project. Each file is transcribed independently and added to your project data.

---

*If you encounter issues uploading audio files for transcription, contact support with the file format, file size, and any error messages received.*