Google BigQuery connection

To access your data in Google BigQuery, you must create a connection asset for it.

Google BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data.

Create a connection to Google BigQuery

To create the connection asset, choose an authentication method. Choices include authentication with or without workload identity federation.

Without workload identity federation

  • Credentials: The contents of the Google service account key JSON file
  • Client ID, Client secret, Access token, and Refresh token
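
Before you paste a key file into the Credentials field, a quick local check can catch a truncated or wrong file. The following is a minimal sketch, not part of the product: the function name check_service_account_key is hypothetical, and the list of required fields is an assumption based on the fields that a Google service account key file normally contains.

```python
import json
from typing import List

# Assumption: fields normally present in a Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_service_account_key(raw: str) -> List[str]:
    """Return a list of problems found in the key JSON; an empty list means it looks usable."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]

    # A service account key file declares its type explicitly.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

The check is structural only. It does not confirm that the key is still active or that the service account has the required permissions.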

With workload identity federation
You use an external identity provider (IdP) for authentication. With workload identity federation, Identity and Access Management (IAM) grants access to identities from the external identity provider, so you do not need service account keys. This approach provides increased security and centralized management. You can use workload identity federation authentication with an access token or with a token URL.

You can configure a Google BigQuery connection for workload identity federation with any identity provider that complies with the OpenID Connect (OIDC) specification and that satisfies the Google Cloud requirements that are described in Prepare your external IdP. The requirements include:

  • The identity provider must support OpenID Connect 1.0.
  • The identity provider's OIDC metadata and JWKS endpoints must be publicly accessible over the internet. Google Cloud uses these endpoints to download your identity provider's key set and uses that key set to validate tokens.
  • The identity provider is configured so that your workload can obtain ID tokens that meet these criteria:
    • Tokens are signed with the RS256 or ES256 algorithm.
    • Tokens contain an aud claim.
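
The two token criteria above can be checked locally before you configure the connection. The following is a minimal sketch that decodes the header and payload of an ID token and reports any criterion that is not met. The function name check_id_token is hypothetical, and the sketch deliberately does not verify the signature; Google Cloud does that with the key set from your identity provider.

```python
import base64
import json
from typing import List

ALLOWED_ALGS = {"RS256", "ES256"}


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding, so restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def check_id_token(token: str) -> List[str]:
    """Return a list of unmet criteria; an empty list means both checks passed."""
    parts = token.split(".")
    if len(parts) != 3:
        return ["token is not a three-part JWT"]

    header = json.loads(_b64url_decode(parts[0]))
    payload = json.loads(_b64url_decode(parts[1]))

    problems = []
    if header.get("alg") not in ALLOWED_ALGS:
        problems.append(f"unsupported signing algorithm: {header.get('alg')}")
    if "aud" not in payload:
        problems.append("missing aud claim")
    return problems
```

Run the check only against tokens in a trusted environment, because the payload of an ID token can contain identifying information.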

For examples of the workload identity federation configuration steps and the Google BigQuery connection details for Amazon Web Services (AWS) and Microsoft Azure, see Workload identity federation examples.

Workload Identity Federation with access token connection details

  • Access token: An access token from the identity provider to connect to BigQuery.

  • Security Token Service audience: The security token service audience that contains the project number, pool ID, and provider ID. Use this format:

    //iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID
    

    For more information, see Authenticate a workload using the REST API.

  • Service account email: The email address of the Google service account to be impersonated. For more information, see Create a service account for the external workload.

  • Service account token lifetime (optional): The lifetime in seconds of the service account access token. The default lifetime of a service account access token is one hour. For more information, see URL-sourced credentials.

  • Token format: Text or JSON. If you select JSON, also specify the Token field name.

  • Token field name: The name of the field in the JSON response that contains the token. This field appears only when the Token format is JSON.

  • Token type: AWS Signature Version 4 request, Google OAuth 2.0 access token, ID token, JSON Web Token (JWT), or SAML 2.0.
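
The Security Token Service audience is easy to mistype, so it can help to assemble it from its parts. The following is a minimal sketch: the function name sts_audience and the constant name are hypothetical, and the three arguments stand for the PROJECT_NUMBER, POOL_ID, and PROVIDER_ID placeholders in the format shown above.

```python
# One hour, the default service account token lifetime stated above, in seconds.
DEFAULT_TOKEN_LIFETIME_SECONDS = 3600


def sts_audience(project_number: str, pool_id: str, provider_id: str) -> str:
    """Build the audience string in the format that the connection expects."""
    for name, value in (("project_number", project_number),
                        ("pool_id", pool_id),
                        ("provider_id", provider_id)):
        # Reject empty parts and stray slashes that would corrupt the path.
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return (
        f"//iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
        f"/providers/{provider_id}"
    )
```

For example, sts_audience("123456789", "my-pool", "my-provider") returns the audience for a pool and provider with those illustrative names.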

Workload Identity Federation with token URL connection details

  • Security Token Service audience: The security token service audience that contains the project number, pool ID, and provider ID. Use this format:

    //iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID
    

    For more information, see Authenticate a workload using the REST API.

  • Service account email: The email address of the Google service account to be impersonated. For more information, see Create a service account for the external workload.

  • Service account token lifetime (optional): The lifetime in seconds of the service account access token. The default lifetime of a service account access token is one hour. For more information, see URL-sourced credentials.

  • Token URL: The URL to retrieve a token.

  • HTTP method: HTTP method to use for the token URL request: GET, POST, or PUT.

  • Request body (for POST or PUT methods): The body of the HTTP request to retrieve a token.

  • HTTP headers: HTTP headers for the token URL request, in JSON or as a JSON body. Use this format: "Key1"="Value1","Key2"="Value2".

  • Token format: Text or JSON. If you select JSON, also specify the Token field name.

  • Token field name: The name of the field in the JSON response that contains the token. This field appears only when the Token format is JSON.

  • Token type: AWS Signature Version 4 request, Google OAuth 2.0 access token, ID token, JSON Web Token (JWT), or SAML 2.0.
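
To see how the token URL settings fit together, the following minimal sketch builds the token request and extracts the token from the response. It illustrates the fields above and is not the connector's implementation. The function names are hypothetical, the header parser assumes the "Key1"="Value1","Key2"="Value2" format exactly as shown, and the request is built but not sent.

```python
import json
import re
import urllib.request
from typing import Dict, Optional

# Matches one "Key"="Value" pair in the HTTP headers field.
_HEADER_PAIR = re.compile(r'"([^"]*)"="([^"]*)"')


def parse_headers(spec: str) -> Dict[str, str]:
    """Turn '"Key1"="Value1","Key2"="Value2"' into a dictionary."""
    return dict(_HEADER_PAIR.findall(spec))


def build_token_request(url: str, method: str = "GET",
                        body: Optional[str] = None,
                        headers: str = "") -> urllib.request.Request:
    """Build (but do not send) the request that retrieves a token."""
    if method not in ("GET", "POST", "PUT"):
        raise ValueError(f"unsupported HTTP method: {method}")
    # A request body applies only to the POST and PUT methods.
    data = body.encode("utf-8") if body is not None and method != "GET" else None
    return urllib.request.Request(url, data=data,
                                  headers=parse_headers(headers), method=method)


def extract_token(response_text: str, token_format: str,
                  token_field_name: Optional[str] = None) -> str:
    """Read the token from a Text response or from a field of a JSON response."""
    if token_format == "Text":
        return response_text.strip()
    if token_format == "JSON":
        if not token_field_name:
            raise ValueError("Token field name is required when Token format is JSON")
        return json.loads(response_text)[token_field_name]
    raise ValueError(f"unsupported token format: {token_format}")
```

Sending the request with urllib.request.urlopen and passing the decoded response body to extract_token would complete the flow.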

Other properties

Project ID (optional)

Permissions

The connection to Google BigQuery requires the following BigQuery permissions:

  • bigquery.jobs.create
  • bigquery.tables.get
  • bigquery.tables.getData

Use one of three ways to gain these permissions:

  • Use the predefined BigQuery Cloud IAM role bigquery.admin, which includes these permissions;
  • Use a combination of two roles, one from each column in the following table; or
  • Create a custom role. For more information, see Create and manage custom roles.

First role             Second role
bigquery.dataEditor    bigquery.jobUser
bigquery.dataOwner     bigquery.user
bigquery.dataViewer

For information about permissions and roles in Google BigQuery, see Predefined roles and permissions.
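
The role options above reduce to a simple rule: either bigquery.admin alone, or one role from each column of the table. The following minimal sketch encodes that rule. The function name roles_satisfy is hypothetical, the role names are written as they appear in this topic, and custom roles are not covered because their permissions depend on how you define them.

```python
from typing import Iterable

# Roles from the first and second columns of the table above.
FIRST_ROLES = {"bigquery.dataEditor", "bigquery.dataOwner", "bigquery.dataViewer"}
SECOND_ROLES = {"bigquery.jobUser", "bigquery.user"}


def roles_satisfy(roles: Iterable[str]) -> bool:
    """Return True if the predefined roles grant the permissions the connection needs."""
    granted = set(roles)
    if "bigquery.admin" in granted:
        return True
    # Otherwise the account needs one role from each column.
    return bool(granted & FIRST_ROLES) and bool(granted & SECOND_ROLES)
```

For example, the combination of bigquery.dataViewer and bigquery.jobUser passes the check, but two roles from the same column do not.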

Choose the method for creating a connection based on where you are in the platform

In a project
Click Assets > New asset > Data access tools > Connection. See Adding a connection to a project.
In a catalog
Click Add to catalog > Connection. See Adding a connection asset to a catalog.
In a deployment space
Click Add to space > Connection. See Adding connections to a deployment space.
In the Platform assets catalog
Click New connection. For more information, see Adding platform connections.

Next step: Add data assets from the connection

Where you can use this connection

You can use Google BigQuery connections in the following workspaces and tools:

Projects

  • AutoAI (Watson Machine Learning)

  • Data quality rules (Watson Knowledge Catalog)

  • Data Refinery (Watson Studio or Watson Knowledge Catalog)

  • DataStage (DataStage service). For more information, see Connecting to a data source in DataStage.

  • Metadata enrichment (Watson Knowledge Catalog)

  • Metadata import (Watson Knowledge Catalog). For information about the supported product versions and other prerequisites when connections are based on MANTA Automated Data Lineage for IBM Cloud Pak for Data scanners, see the Lineage Scanner Configuration section in the MANTA Automated Data Lineage on IBM Cloud Pak for Data Installation and Usage Manual. This documentation is available at https://www.ibm.com/support/pages/node/6597457.

    For metadata import (lineage), advanced metadata import must be enabled and a MANTA Automated Data Lineage license key must be installed. See Installing Watson Knowledge Catalog, Installing MANTA Automated Data Lineage, and Enabling lineage import.

  • SPSS Modeler (SPSS Modeler service)

Catalogs

  • Platform assets catalog

  • Other catalogs (Watson Knowledge Catalog)

Watson Query service
You can connect to this data source from Watson Query. This connection requires special consideration in Watson Query. For more information, see Connecting to Google BigQuery in Watson Query.

Federal Information Processing Standards (FIPS) compliance

The Google BigQuery connection is compliant with FIPS.

Google BigQuery setup

Quickstart by using the Cloud Console
