Properties reference: BigQuery Connector
This topic lists all properties that you can set to configure the stage.
Connection
- Credentials file
- Specify the full path to the file that contains the Google service account credentials.
- Type: string
- Project id
- Specify the ID of the Google project to connect to. This property is optional. If it is not specified, the project ID from the credentials file is used for the connection.
- Type: string
- Use Proxy
- Select Yes to use a proxy server.
- Type: boolean
- Default: false
- Proxy host
- Enter the host name of the proxy server.
- Type: string
- Proxy port
- Enter the port of the proxy server.
- Type: integer
- Proxy user name
- Enter the name of the user that connects to the proxy server.
- Type: string
- Proxy password
- Enter the password for the specified user
- Type: protected string
Usage
- Write mode
- The mode for writing records to the target table
- Type: selection
- Default: Insert
- Values:
- Insert
- Update then Insert
- Update
- Delete
- Delete then Insert
- Generate SQL at run time
- Select Yes to automatically generate the SQL statements at run time.
- Type: boolean
- Default: true
- Select statement
- Enter a SELECT statement to read rows from the database.
- Type: string
- Enable partitioned reads
- Select Yes to run the statement on each processing node. The [[node-number]], [[node-number-base-one]] and [[node-count]] placeholders in the statement are replaced on each processing node with the actual zero-based node index, one-based node index and total number of nodes, respectively.
- Type: boolean
- Default: false
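
The placeholder substitution described for Enable partitioned reads can be illustrated with a minimal sketch. The function name `expand_placeholders` and the table `mydataset.mytable` are hypothetical, and the code only mimics the documented replacement; it is not the connector's own implementation.

```python
def expand_placeholders(statement: str, node_index: int, node_count: int) -> str:
    """Substitute the partitioned-read placeholders for one processing node.

    node_index is the zero-based index of the node; node_count is the
    total number of processing nodes.
    """
    return (
        statement
        .replace("[[node-number-base-one]]", str(node_index + 1))
        .replace("[[node-number]]", str(node_index))
        .replace("[[node-count]]", str(node_count))
    )


# Hypothetical statement: each node reads only the rows in its own slice.
statement = (
    "SELECT * FROM mydataset.mytable "
    "WHERE MOD(id, [[node-count]]) = [[node-number]]"
)

# Show the statement that each of three nodes would run.
for node in range(3):
    print(expand_placeholders(statement, node, 3))
```

Filtering on a modulus of a key column, as in this sketch, is one way to make sure that each node reads a distinct subset of rows rather than the full result set.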
- Use GCS staging
- Select Yes to use Google Cloud Storage as a staging area while the select statement runs, to improve performance.
- Type: boolean
- Default: false
- Database name
- Specify the ID of the Google project where the table resides. This property is optional. If it is not specified, the project ID from the connection is used for operations. When Use GCS staging is selected during a read, the temporary staging table is created in this project.
- Type: string
- Schema name
- Specify the name of the BigQuery dataset that contains the table. When Use GCS staging is selected during a read, the temporary staging table is created in this dataset.
- Type: string
- Table name
- The name of the table
- Type: string
- User-defined SQL statement
- Provide a SQL statement that references the temporary staging table TEMP_EXTERNAL_TABLE to write rows to the target table. The data from the input link is written to the temporary table, and the job fails if TEMP_EXTERNAL_TABLE is missing from the SQL statement. Example: insert into bqtest.testGCS select * from TEMP_EXTERNAL_TABLE
- Type: string
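
Because the job fails when the user-defined SQL statement does not reference TEMP_EXTERNAL_TABLE, it can help to check the statement before the job runs. The following is a minimal sketch of such a check; the helper `references_staging_table` is hypothetical, and the case-sensitive whole-word match is an assumption rather than documented connector behavior.

```python
import re

# Name of the temporary staging table that the statement must reference.
STAGING_TABLE = "TEMP_EXTERNAL_TABLE"


def references_staging_table(sql: str) -> bool:
    """Return True if the statement mentions the staging table as a whole word.

    Assumption: the match is case-sensitive; the connector may behave differently.
    """
    return re.search(rf"\b{STAGING_TABLE}\b", sql) is not None


# The example statement from the property description passes the check.
print(references_staging_table(
    "insert into bqtest.testGCS select * from TEMP_EXTERNAL_TABLE"
))
```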
- Table action
- The action to take on the target table to handle the new data set
- Type: selection
- Default: Append
- Values:
- Append
- Replace
- Key column names
- A comma-separated list of column names that overrides the primary key defined in the schema during merge operations. This optional property can be used to specify key column names for the following write modes: Update then Insert, Update, Delete, Delete then Insert. Setting this property is recommended when Runtime Column Propagation is selected across the stages.
- Type: string
- Google cloud storage bucket
- Provide the name of the bucket that stores the temporary Google Cloud Storage files created during merge operations or user-defined read operations. This property is required for the Update then Insert, Update, Delete, and Delete then Insert write modes, and when Use GCS staging is selected during a read.
- Type: string
- File name prefix
- Provide the file name prefix of the temporary file that is created in the Google Cloud Storage bucket to perform merge or user-defined read operations. The file is automatically deleted at the end of the job. The default value is bigQueryTempFile. This optional property applies to the Update then Insert, Update, Delete, and Delete then Insert write modes, and when Use GCS staging is selected during a read.
- Type: string
- Default: bigQueryTempFile
- File part size
- Specify the part size, in MB, at which a file is split. The default value is 50. Adjust this property to achieve higher performance for large data sets. Set the Heap size property according to the part size: if you specify a larger file part size, increase the heap size accordingly.
- Type: integer
- Default: 50
- Row limit
- The maximum number of rows to return.
- Type: string
- Byte limit
- The maximum number of bytes to return. Use any of these suffixes: KB, MB, GB, or TB.
- Type: string
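
To illustrate the suffix notation accepted by Byte limit, here is a minimal sketch that converts such a value to a byte count. The function `parse_byte_limit` is hypothetical. Treating the suffixes as binary multiples (1 KB = 1024 bytes) and accepting a bare number without a suffix are both assumptions; this reference does not state how the connector interprets them.

```python
import re

# Assumption: suffixes are binary multiples. The connector may use decimal ones.
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_byte_limit(value: str) -> int:
    """Convert a value such as '10MB' to a number of bytes."""
    match = re.fullmatch(
        r"\s*(\d+)\s*(KB|MB|GB|TB)?\s*", value, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"Unrecognized byte limit: {value!r}")
    number, suffix = match.groups()
    # A value without a suffix is treated as a plain byte count (assumption).
    multiplier = _UNITS[suffix.upper()] if suffix else 1
    return int(number) * multiplier


print(parse_byte_limit("10MB"))
```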
- Skip After SQL on job abort
- Select Yes to skip the execution of the After SQL statements when the job is aborted.
- Type: boolean
- Default: false
- Before SQL
- Enter a SQL statement to be executed once before any data is processed
- Type: string
- Fail on error
- Select Yes to stop the job if the Before SQL statement fails.
- Type: boolean
- Default: true
- After SQL
- Enter a SQL statement to be executed once after all data is processed
- Type: string
- Fail on error
- Select Yes to stop the job if the After SQL statement fails.
- Type: boolean
- Default: true
- Before SQL (node)
- Enter a SQL statement to be executed once on each node before any data is processed
- Type: string
- Fail on error
- Select Yes to stop the job if the Before SQL (node) statement fails.
- Type: boolean
- Default: true
- After SQL (node)
- Enter a SQL statement to be executed once on each node after all data is processed
- Type: string
- Fail on error
- Select Yes to stop the job if the After SQL (node) statement fails.
- Type: boolean
- Default: true
- Java settings
- Properties for specifying JVM options
- Type: category
- Heap size
- The heap size, in MB. This property corresponds to the -Xmx command line option.
- Type: integer
- Default: 256
- Minimum: 128
- JVM options
- Enter additional command line arguments to the Java Virtual Machine.
- Type: string
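
The relationship between the Java settings and the JVM command line can be sketched as follows. The helper `build_jvm_args` is hypothetical and only shows how a heap size in MB maps to an `-Xmx` argument alongside additional options; how the connector assembles its actual command line is not described in this reference.

```python
import shlex

MIN_HEAP_MB = 128  # minimum documented for the Heap size property


def build_jvm_args(heap_size_mb: int = 256, jvm_options: str = "") -> list[str]:
    """Combine the Heap size and JVM options properties into JVM arguments."""
    if heap_size_mb < MIN_HEAP_MB:
        raise ValueError(f"Heap size must be at least {MIN_HEAP_MB} MB")
    # -Xmx<N>m sets the maximum heap size to N megabytes.
    return [f"-Xmx{heap_size_mb}m", *shlex.split(jvm_options)]


# Example: a larger heap to go with a larger file part size.
print(build_jvm_args(512, "-Dfile.encoding=UTF-8"))
```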