IBM Support

JR63409: IMPROVE PERFORMANCE OF BIGQUERY CONNECTOR READ WHEN USING SELECTSTATEMENT


APAR status

  • Closed as new function.

Error description

  • Improve the performance of BigQuery connector read when using
    a select statement, by providing a new GCS staging option.
    

Local fix

  • N/A
    

Problem summary

  • Improve performance of BigQuery connector read
    when using select statement
    

Problem conclusion

  • The following properties are added in the
    source context to improve read performance
    when "Generate SQL" is set to No.
    
    1. Use GCS staging - Yes/No.
    2. Database name - This is an optional property.
    If it is specified, the temporary staging table is
    created under this project ID.
    3. Schema name - This is a mandatory property.
    The temporary staging table will be created
    under this schema.
    4. Google cloud storage bucket - This is a
    mandatory property, required to perform a read
    operation using the GCS staging option. This bucket
    is used as a staging area to store the temporary
    files created during the read process.
    5. File name prefix - This is an optional property
    to specify the prefix of the temporary filename
    created in the Google cloud storage bucket.
    6. File part size - This is an optional integer
    property to specify the part size in MB at
    which a file gets split. The default value is 50.
    This property can be adjusted accordingly to
    achieve higher performance for larger datasets.
    Heap size property should be modified based on
    part size used. If a larger file part size is
    specified, heap size should be increased accordingly.
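
The mandatory and optional source properties listed above can be checked before a job is run. The following is a minimal sketch only: the class, field, and function names are hypothetical and are not the connector's internal property names. It also assumes that Schema name and Google cloud storage bucket are only required when Use GCS staging is Yes, and that File part size must be positive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GcsStagingReadSettings:
    """Hypothetical container mirroring the six source properties."""
    use_gcs_staging: bool                    # 1. Use GCS staging - Yes/No
    database_name: Optional[str] = None      # 2. optional (project id)
    schema_name: Optional[str] = None        # 3. mandatory
    bucket: Optional[str] = None             # 4. mandatory
    file_name_prefix: Optional[str] = None   # 5. optional
    file_part_size_mb: int = 50              # 6. optional, default 50


def validate(settings: GcsStagingReadSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings look usable."""
    problems: List[str] = []
    if not settings.use_gcs_staging:
        # Assumption: the staging properties are ignored when staging is off.
        return problems
    if not settings.schema_name:
        problems.append("Schema name is mandatory")
    if not settings.bucket:
        problems.append("Google cloud storage bucket is mandatory")
    if settings.file_part_size_mb <= 0:
        # Assumption: the APAR only says "integer"; positivity is assumed here.
        problems.append("File part size must be a positive integer")
    return problems
```

For example, `validate(GcsStagingReadSettings(use_gcs_staging=True, schema_name="staging"))` reports only the missing bucket.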
    
    Along with the above properties, an additional
    property is added on the target side to improve
    write performance:
    
    1. File part size - This is an optional integer
    property to specify the part size in MB at which
    a file gets split. The default value is 50. This
    property can be adjusted accordingly to achieve
    higher performance for larger datasets. Heap size
    property should be modified based on part size used.
    If a larger file part size is specified, heap size
    should be increased accordingly.
    
    Note:
    1. The Use GCS staging option is recommended when
    the select statement returns a large number of records.
    2. Recommended File part size is 50. The value
    can be tuned accordingly based on the total record
    size to improve job performance.
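
As a rough aid when tuning File part size, the number of parts produced for a given amount of data can be estimated. This sketch assumes that a staged file is simply split every File part size MB; the connector's actual splitting behavior is not described in this APAR, and the function name is hypothetical.

```python
import math


def estimated_part_count(total_size_mb: float, file_part_size_mb: int = 50) -> int:
    """Estimate how many parts result when data is split every file_part_size_mb MB."""
    if file_part_size_mb <= 0:
        raise ValueError("file_part_size_mb must be positive")
    if total_size_mb <= 0:
        return 0
    # A final, partially filled part still counts as one part.
    return math.ceil(total_size_mb / file_part_size_mb)
```

Under this assumption, a larger part size means fewer but larger parts, which is why the heap size may need to be increased together with it.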
    
    Limitation:
    Currently, decimal and numeric datatypes are not
    supported with the GCS staging approach.
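
Given this limitation, it may help to check the result columns of a select statement before enabling Use GCS staging. The sketch below is illustrative only: the function name and the (column name, type name) input format are hypothetical, and it matches only the two type names mentioned above. Whether related types are also affected is not stated in this APAR.

```python
from typing import Iterable, List, Tuple

# Type names taken from the limitation above; nothing else is assumed unsupported.
UNSUPPORTED_WITH_GCS_STAGING = {"DECIMAL", "NUMERIC"}


def columns_blocking_gcs_staging(columns: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the names of columns whose type prevents using GCS staging."""
    blocked: List[str] = []
    for name, type_name in columns:
        # Drop any precision/scale suffix, e.g. "NUMERIC(10, 2)" -> "NUMERIC".
        base_type = type_name.split("(", 1)[0].strip().upper()
        if base_type in UNSUPPORTED_WITH_GCS_STAGING:
            blocked.append(name)
    return blocked
```

If the returned list is not empty, Use GCS staging would need to stay set to No for that select statement.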
    
    This APAR also includes the changes for the following:
    1. Increase data block size when writing to BigQuery
       - using File part size property
    2. Report total rows modified after the Before/After
    SQL statement execution.
    

Temporary fix

Comments

APAR Information

  • APAR number

    JR63409

  • Reported component name

    WIS DATASTAGE

  • Reported component ID

    5724Q36DS

  • Reported release

    B71

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-03-05

  • Closed date

    2021-03-18

  • Last modified date

    2021-03-18

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WIS DATASTAGE

  • Fixed component ID

    5724Q36DS

Applicable component levels

  • Line of Business

    Data and AI (LOB10)

  • Business Unit

    IBM Software w/o TPS (BU059)

  • Product

    IBM InfoSphere DataStage (SSVSEF)

  • Platform

    Platform Independent (PF025)

  • Version

    11.7

Document Information

Modified date:
19 March 2021