Properties reference: BigQuery Connector

This topic lists all properties that you can set to configure the stage.

Connection

Credentials file
Specify the full path to the file that contains the Google service account credentials.
  • Type: string
Project id
Specify the ID of the Google project to connect to. This property is optional. If it is not specified, the project ID from the credentials file is used for the connection.
  • Type: string
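The fallback described for Project id can be sketched as follows. This is an illustration only, not the connector's implementation: the function name `resolve_project_id` and the sample values are hypothetical. The only external fact relied on is that a Google service account key file is JSON with a `project_id` field.

```python
import json
from typing import Optional

def resolve_project_id(credentials_json: str, project_id: Optional[str] = None) -> str:
    """Return the explicitly configured project ID, or fall back to the
    project_id field of the service account credentials (hypothetical helper)."""
    if project_id:
        return project_id
    return json.loads(credentials_json)["project_id"]

# Hypothetical, heavily trimmed credentials content for illustration.
sample = json.dumps({"type": "service_account", "project_id": "my-project"})

print(resolve_project_id(sample))                   # falls back to the file
print(resolve_project_id(sample, "other-project"))  # explicit value wins
```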
Use Proxy
Select Yes to connect through a proxy server.
  • Type: boolean
  • Default: false
Proxy host
Enter the host name of the proxy server.
  • Type: string
Proxy port
Enter the port of the proxy server.
  • Type: integer
Proxy user name
Enter the name of the user to connect to the proxy server.
  • Type: string
Proxy password
Enter the password for the specified user
  • Type: protected string

Usage

Write mode
The mode for writing records to the target table
  • Type: selection
  • Default: Insert
  • Values:
    • Insert
    • Update then Insert
    • Update
    • Delete
    • Delete then Insert
Generate SQL at run time
Select Yes to automatically generate the SQL statements at run time.
  • Type: boolean
  • Default: true
Select statement
Enter a SELECT statement to read rows from the database.
  • Type: string
Enable partitioned reads
Select Yes to run the statement on each processing node. The [[node-number]], [[node-number-base-one]] and [[node-count]] placeholders in the statement are replaced on each processing node with the actual zero-based node index, one-based node index and total number of nodes, respectively.
  • Type: boolean
  • Default: false
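As a rough illustration of the placeholder substitution described for Enable partitioned reads, the sketch below expands a statement for one node. The function `expand_placeholders`, the table `mydataset.orders`, and the column `order_id` are hypothetical; the connector performs the actual replacement internally.

```python
def expand_placeholders(statement: str, node_index: int, node_count: int) -> str:
    """Replace the partitioned-read placeholders for one processing node.

    node_index is zero-based, matching [[node-number]].
    """
    return (
        statement
        .replace("[[node-number-base-one]]", str(node_index + 1))
        .replace("[[node-number]]", str(node_index))
        .replace("[[node-count]]", str(node_count))
    )

# Hypothetical statement that gives each node a disjoint slice of the rows.
template = (
    "SELECT * FROM mydataset.orders "
    "WHERE MOD(order_id, [[node-count]]) = [[node-number]]"
)

for node in range(2):
    print(expand_placeholders(template, node, 2))
```

Using the node index in the WHERE clause is one way to keep nodes from reading the same rows; without such a predicate every node would run the identical statement.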
Use GCS staging
Select Yes to use Google Cloud Storage as a staging area while the select statement runs, which can improve performance.
  • Type: boolean
  • Default: false
Database name
Specify the ID of the Google project where the table resides. This property is optional. If it is not specified, the project ID from the connection is used for operations. When Use GCS staging is selected for a read, the temporary staging table is created in this project.
  • Type: string
Schema name
Specify the name of the BigQuery dataset that contains the table. When Use GCS staging is selected for a read, the temporary staging table is created in this dataset.
  • Type: string
Table name
The name of the table
  • Type: string
User-defined SQL statement
Provide an SQL statement that references the temporary staging table TEMP_EXTERNAL_TABLE to write rows to the target table. Data from the input link is written to the temporary table. The job fails if TEMP_EXTERNAL_TABLE is missing from the SQL statement. Example: insert into bqtest.testGCS select * from TEMP_EXTERNAL_TABLE
  • Type: string
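Because the job fails when the statement does not reference TEMP_EXTERNAL_TABLE, a pre-flight check before submitting a job can be useful. The sketch below is a hypothetical helper, not part of the connector. It assumes a case-sensitive, whole-word match, which may be stricter than what the connector itself requires.

```python
import re

STAGING_TABLE = "TEMP_EXTERNAL_TABLE"

def references_staging_table(sql: str) -> bool:
    """Return True if the statement mentions the staging table as a whole word.

    Assumption: the match is case-sensitive; the connector's own check may differ.
    """
    return re.search(r"\b" + re.escape(STAGING_TABLE) + r"\b", sql) is not None

good = "insert into bqtest.testGCS select * from TEMP_EXTERNAL_TABLE"
bad = "insert into bqtest.testGCS select * from bqtest.source"

print(references_staging_table(good))
print(references_staging_table(bad))
```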
Table action
The action to take on the target table to handle the new data set
  • Type: selection
  • Default: Append
  • Values:
    • Append
    • Replace
Key column names
A comma-separated list of column names that overrides the primary key defined in the schema during merge operations. This property is optional and can be used with the following write modes: Update then Insert, Update, Delete, Delete then Insert. Setting this property is recommended when Runtime Column Propagation is selected across the stages.
  • Type: string
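For illustration, a comma-separated Key column names value could be split into individual names as shown below. The helper `parse_key_columns` and the column names are hypothetical, and trimming whitespace around each name is an assumption about how the value is interpreted.

```python
from typing import List

def parse_key_columns(value: str) -> List[str]:
    """Split a comma-separated list of column names, dropping empty entries.

    Assumption: whitespace around each name is not significant.
    """
    return [name.strip() for name in value.split(",") if name.strip()]

print(parse_key_columns("customer_id, order_id"))
```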
Google cloud storage bucket
Provide the name of the bucket that stores the temporary Google Cloud Storage files created during merge operations or user-defined read operations. This property is required for the following write modes: Update then Insert, Update, Delete, Delete then Insert. It is also required when Use GCS staging is selected for a read.
  • Type: string
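The rule for when the bucket is required can be expressed as a small check. This is a sketch under the conditions listed above; the function `bucket_required` is a hypothetical helper and not a connector API.

```python
from typing import Optional

# Write modes for which the bucket is required, as listed in this topic.
MERGE_WRITE_MODES = {"Update then Insert", "Update", "Delete", "Delete then Insert"}

def bucket_required(write_mode: Optional[str] = None, use_gcs_staging: bool = False) -> bool:
    """Return True when a Google Cloud Storage bucket must be configured.

    write_mode applies to a target (write) stage, use_gcs_staging to a read.
    """
    return write_mode in MERGE_WRITE_MODES or use_gcs_staging

print(bucket_required(write_mode="Insert"))
print(bucket_required(write_mode="Update"))
print(bucket_required(use_gcs_staging=True))
```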
File name prefix
Provide the file name prefix of the temporary file that is created in the Google Cloud Storage bucket for merge or user-defined read operations. The file is deleted automatically at the end of the job. The default value is bigQueryTempFile. This property is optional and applies to the following write modes: Update then Insert, Update, Delete, Delete then Insert. It also applies when Use GCS staging is selected for a read.
  • Type: string
  • Default: bigQueryTempFile
File part size
Specify the part size, in MB, at which a file is split. The default value is 50. Adjust this property to improve performance for large data volumes. Set the Heap size property to suit the part size: if you specify a larger file part size, increase the heap size accordingly.
  • Type: integer
  • Default: 50
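To get a feel for how the part size affects splitting, the number of parts for a given amount of data can be estimated with a ceiling division. This is back-of-the-envelope arithmetic with a hypothetical helper, not a description of how the connector splits files internally.

```python
import math

def part_count(file_size_mb: float, part_size_mb: int = 50) -> int:
    """Estimate how many parts a file of file_size_mb is split into."""
    if part_size_mb <= 0:
        raise ValueError("part size must be a positive number of MB")
    return math.ceil(file_size_mb / part_size_mb)

print(part_count(120))       # default part size of 50 MB
print(part_count(120, 100))  # larger parts, fewer of them
```

A larger part size yields fewer parts, at the cost of more memory per part, which is why the heap size may need to grow with it.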
Row limit
The maximum number of rows to return.
  • Type: string
Byte limit
The maximum number of bytes to return. Use any of these suffixes: KB, MB, GB, or TB.
  • Type: string
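A Byte limit value such as 10MB combines a number with a unit suffix. The sketch below shows one way such a value could be converted to bytes. It rests on several assumptions that this topic does not confirm: units are binary multiples (1 KB = 1024 bytes), a bare number means bytes, and suffixes are case-insensitive. The helper `parse_byte_limit` is hypothetical.

```python
import re

# Assumption: binary multiples. The connector may use decimal multiples instead.
_UNITS = {"": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def parse_byte_limit(value: str) -> int:
    """Convert a string such as '10MB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB|GB|TB)?\s*", value, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized byte limit: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[(unit or "").upper()]

print(parse_byte_limit("10MB"))
print(parse_byte_limit("2 GB"))
```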
Skip After SQL on job abort
Select Yes to skip the execution of the After SQL statements when the job is aborted.
  • Type: boolean
  • Default: false
Before SQL
Enter a SQL statement to be executed once before any data is processed
  • Type: string
Fail on error
Select Yes to stop the job if the Before SQL statement fails.
  • Type: boolean
  • Default: true
After SQL
Enter a SQL statement to be executed once after all data is processed
  • Type: string
Fail on error
Select Yes to stop the job if the After SQL statement fails.
  • Type: boolean
  • Default: true
Before SQL (node)
Enter a SQL statement to be executed once on each node before any data is processed
  • Type: string
Fail on error
Select Yes to stop the job if the Before SQL (node) statement fails.
  • Type: boolean
  • Default: true
After SQL (node)
Enter a SQL statement to be executed once on each node after all data is processed
  • Type: string
Fail on error
Select Yes to stop the job if the After SQL (node) statement fails.
  • Type: boolean
  • Default: true
Java settings
Properties for specifying JVM options
  • Type: category
Heap size
The heap size in MB. This property corresponds to the -Xmx command-line option.
  • Type: integer
  • Default: 256
  • Minimum: 128
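The mapping from the Heap size value to the -Xmx option can be sketched as follows. The `-Xmx<size>m` form is standard JVM syntax for a maximum heap in megabytes; the helper `heap_option` is hypothetical, and how the connector actually assembles its JVM command line is not documented here.

```python
def heap_option(heap_mb: int = 256) -> str:
    """Build the JVM maximum-heap option for a heap size given in MB.

    Enforces the documented minimum of 128; the default is 256.
    """
    if heap_mb < 128:
        raise ValueError("heap size must be at least 128 MB")
    return f"-Xmx{heap_mb}m"

print(heap_option())     # default heap size
print(heap_option(1024)) # larger heap, for example with a larger file part size
```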
JVM options
Enter additional command line arguments to the Java Virtual Machine.
  • Type: string