Unloading data to cloud: AWS S3 and IBM Cloud Object Storage

Deployment options: Netezza Performance Server for Cloud Pak for Data SystemNetezza Performance Server for Cloud Pak for Data

You can unload data from database tables on a file on cloud storage. This unload operation does not remove rows from the database but instead stores the unloaded data in an external table that is suitable for loading back into a database.

About this task

You can unload data to any of the supported cloud platforms, which include AWS S3 and IBM Cloud Object Storage. You can unload all data types (including Unicode) and file types (decompressed and compressed formats).

Unloading for the fixed-length format is not supported.

To create an external table, you must be the admin user or have the CREATE EXTERNAL TABLE administration privilege. You must also have permission to write to the path of the data object.

Procedure

  1. Create an external table by issuing the CREATE EXTERNAL TABLE command.
    CREATE EXTERNAL TABLE TABLE SAMEAS TABLE USING (
    DATAOBJECT DATAOBJECT
    REMOTESOURCE REMOTESOURCE
    DELIM DELIM
    UNIQUEID UNIQUEID
    ACCESSKEYID ACCESSKEYID
    SECRETACCESSKEY SECRETACCESSKEY
    DEFAULTREGION DEFAULTREGION
    BUCKETURL BUCKETURL
    ENDPOINT ENDPOINT
    MULTIPARTSIZEMB MULTIPARTSIZEMB 
    );
    Example:
    CREATE EXTERNAL TABLE emp_backup SAMEAS emp USING (
    DATAOBJECT ('/employee_data.dat')
    REMOTESOURCE 'S3' 
    DELIM '|' 
    UNIQUEID 'samplebackup' 
    ACCESSKEYID 'xxxxx'
    SECRETACCESSKEY 'xxxxx'
    DEFAULTREGION 'ap-geo' 
    BUCKETURL 'my.backup.bucket' 
    ENDPOINT 's3.us-east.cloud-object-storage.appdomain.cloud' 
    MULTIPARTSIZEMB '50'
    );

    The DATAOBJECT file specification must refer to a valid file on the cloud platform. For the REMOTESOURCE option, use s3.

    This table lists all the additional parameters, which are necessary if you want to use the REMOTESOURCE option.
    Option Distribution Description
    UNIQUEID Optional The namespace, which is used to group data in the cloud bucket.
    ACCESSKEYID Mandatory Key generated on AWS/IBM Cloud Object Storage.
    SECRETACCESSKEY Mandatory Secret access key generated on AWS/IBM Cloud Object Storage.
    DEFAULTREGION Mandatory Region of the bucket.
    BUCKETURL Mandatory Name of the bucket.
    MULTIPARTSIZEMB Optional The default is 105. This value is in MB. It indicates the size of each part in a multipart upload. (Max 105 MB of buffer can be uploaded in one request).
    Tip: Use values such as 50 MB and higher. If you use 10 MB, you might encounter software issues.
    ENDPOINT Optional for AWS

    Mandatory for IBM Cloud Object Storage

    T region URL to access your bucket.
  2. Insert data from the table on the Netezza Performance Server host into the external table.
    INSERT INTO TABLE SELECT * FROM TABLE;
    Example:
    INSERT INTO emp_backup SELECT * FROM emp;