
Google BigQuery Integration

Written by Cameron Parry
Updated over 3 weeks ago

Kapiche can connect directly to your Google BigQuery instance, allowing you to securely analyze data without duplicating it.

We support two integration scenarios:


Scenario 1: Customer-Hosted BigQuery

In this setup, Kapiche connects directly to your existing BigQuery project using Workload Identity Federation. This method avoids long-lived credentials and uses short-lived tokens for enhanced security.

Required Permissions

Grant the following roles to the Kapiche IAM service account:

Service Account:

botanic-api-prod@kapiche-all.iam.gserviceaccount.com

Roles:

BigQuery Data Viewer – to read data from your datasets

BigQuery Job User – to run queries on your datasets

Service Account Token Creator – to allow authentication via Workload Identity Federation

💡 We don’t use service account keys. Instead, we use federated identity to impersonate your service account with short-lived credentials.


How to Grant Access

Step 1: Grant Dataset Access

1. In the Google Cloud Console, go to IAM & Admin > IAM

2. Click + Grant Access

3. In the New principals field, enter:

botanic-api-prod@kapiche-all.iam.gserviceaccount.com

4. Under Assign roles, select BigQuery Data Viewer

5. Click + Add another role and select BigQuery Job User

6. Click Save
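If you prefer the command line, the same grants can be made with the gcloud CLI. This is a sketch; YOUR_PROJECT_ID is a placeholder for your own Google Cloud project ID.

```shell
# Grant the Kapiche service account read access to BigQuery data
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:botanic-api-prod@kapiche-all.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"

# Grant permission to run query jobs
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:botanic-api-prod@kapiche-all.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"
```

Both commands are idempotent, so re-running them after a change is safe.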

Step 2: Grant Service Account Token Creator Role

Kapiche needs permission to impersonate your BigQuery-enabled service account.

1. Identify the service account in your project that has BigQuery access (or create one for this purpose)

2. In the Cloud Console, navigate to IAM & Admin > Service Accounts

3. Click the name of the relevant service account

4. Go to the Permissions tab

5. Click Grant Access

6. In New principals, enter:

botanic-api-prod@kapiche-all.iam.gserviceaccount.com

7. For the role, select Service Account Token Creator

8. Click Save
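The same role binding can be applied from the command line. Here, YOUR_SA is a placeholder for the full email address of your BigQuery-enabled service account.

```shell
# Allow the Kapiche service account to impersonate your
# BigQuery-enabled service account
gcloud iam service-accounts add-iam-policy-binding YOUR_SA \
  --member="serviceAccount:botanic-api-prod@kapiche-all.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountTokenCreator"
```

Note that this binds the role on the service account resource itself, not at the project level, which keeps the impersonation grant as narrow as possible.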

This enables Kapiche to generate short-lived tokens to securely access your datasets on demand.
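As an optional sanity check, you can mint a short-lived token yourself to confirm impersonation works. Again, YOUR_SA is a placeholder; run this as a user who also holds the Service Account Token Creator role on that account.

```shell
# Mint a short-lived OAuth access token by impersonating the
# service account; prints the token on success
gcloud auth print-access-token \
  --impersonate-service-account=YOUR_SA
```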

🔒 If your organization has stricter security or compliance requirements, we’re happy to work with your cloud security team to implement custom roles or more granular permissions.


Scenario 2: BigQuery Data Transfer Service

If you’d prefer not to grant direct access to your BigQuery project, we can use the BigQuery Data Transfer Service to securely move data into Kapiche’s BigQuery environment.

This method is also managed entirely within Google Cloud infrastructure.

Required Permissions

Grant the following roles to the BigQuery Data Transfer Service's service account:

Service Account:

service-545996148440@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com

Roles:

BigQuery Data Viewer – to read from your datasets

BigQuery Job User – to run queries needed for transfer
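Both roles can be granted to the transfer service agent in one pass from the command line. YOUR_PROJECT_ID is a placeholder for your project ID.

```shell
# Grant the BigQuery Data Transfer Service agent read and job access
for role in roles/bigquery.dataViewer roles/bigquery.jobUser; do
  gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="serviceAccount:service-545996148440@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com" \
    --role="$role"
done
```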

How the Transfer Works

• Kapiche will schedule a recurring transfer based on your needs

• You can define which tables or views are transferred

• Supports incremental updates and configurable data freshness

• All transfer actions are fully audit logged
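For illustration, a scheduled dataset copy can be created with the bq CLI's transfer configuration support. The project, dataset, and schedule below are hypothetical placeholders, not Kapiche's actual configuration.

```shell
# Sketch: copy a source dataset into a destination project every 24 hours
bq mk --transfer_config \
  --project_id=DESTINATION_PROJECT \
  --target_dataset=kapiche_destination \
  --display_name="Kapiche transfer" \
  --data_source=cross_region_copy \
  --schedule="every 24 hours" \
  --params='{"source_project_id":"YOUR_PROJECT_ID","source_dataset_id":"your_dataset"}'
```

Each run of this transfer appears in your Cloud Audit Logs, which is what makes the transfer actions fully auditable.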


Alternative: Customer-Managed Transfer

If preferred, your team can create and manage the BigQuery Data Transfer Service job. In this case, you only need to grant Kapiche access to the destination dataset that receives the transferred data.
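One way to grant dataset-scoped access is BigQuery's GRANT statement, which scopes the role to a single dataset rather than the whole project. The project and dataset names below are placeholders.

```shell
# Grant Kapiche read access to just the destination dataset
# using BigQuery's GRANT DDL
bq query --nouse_legacy_sql \
  'GRANT `roles/bigquery.dataViewer`
   ON SCHEMA `YOUR_PROJECT_ID.kapiche_destination`
   TO "serviceAccount:botanic-api-prod@kapiche-all.iam.gserviceaccount.com"'
```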
