Configuring the Amazon S3 connector as a source

To configure the connector to read Amazon S3 data or to list Amazon S3 buckets and files, you must specify a read mode and then configure the properties for that read mode.

Procedure

  1. From the job design canvas, double-click the Amazon S3 Connector stage.
  2. Set the Read mode property to Read single file, Read multiple files, List buckets, or List files.
  3. Configure the read process for the read mode that you specified.
    Table 1. Reading data from Amazon S3
    Read mode Procedure
    Read single file
    1. Specify the name of the bucket that contains the file.
    2. Specify the name of the file to read.
    Read multiple files
    1. Specify the name of the bucket that contains the files.
    2. In the File name field, specify the file path prefix that is shared by the files that you want to read.

      For example, if you enter transactions as the prefix, the connector reads all of the files in the transactions folder, such as transactions/january/day1.txt, and a file named transactions.txt.

    List buckets
    No additional configuration is required.
    List files
    1. Specify the name of the bucket that contains the files.
    2. Optional: In the File name field, specify the file path prefix that is shared by the files that you want to list.

      For example, if you enter transactions as the prefix, the connector lists all of the files in the transactions folder, such as transactions/january/day1.txt, and a file named transactions.txt.

      If you do not specify a file name prefix, all of the files in the bucket are listed.

  4. Click OK, and then save the job.
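
Example

The prefix behavior described for the Read multiple files and List files modes can be illustrated with a short Python sketch. It is not the connector's implementation. It assumes that a prefix is compared as a plain string against the start of each file path, which is consistent with the transactions example in the table. The function name select_keys and the sample file paths are hypothetical.

```python
def select_keys(keys, prefix=""):
    """Return the file paths that start with the given prefix.

    Assumption: the prefix is matched as a plain string against the
    beginning of each file path. An empty prefix matches every path.
    """
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical contents of a bucket.
keys = [
    "transactions/january/day1.txt",
    "transactions.txt",
    "invoices/january/day1.txt",
]

# The prefix "transactions" selects the file in the transactions folder
# and the file named transactions.txt, but not the invoices file.
print(select_keys(keys, "transactions"))

# With no prefix, every file in the bucket is selected.
print(select_keys(keys))
```

Under this assumption, a prefix selects any file whose path begins with those characters, so entering transactions/ instead of transactions would exclude transactions.txt.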