Loading public data sets from Amazon S3 for IBM Db2 Warehouse

You can load public data sets to Db2® Warehouse from Amazon S3. This tutorial shows you how to load sfpd_incident_2013_no_hdr.csv, a collection of several year's worth of San Francisco crime data, into Db2 Warehouse.

Follow these steps to load data from a public data set

Create the table

Create a table in the web console using the following DDL:

CREATE TABLE SFPD_INCIDENT (
    INCIDNTNUM INTEGER, 
    CATEGORY VARCHAR(256), 
    DESCRIPT VARCHAR(256), 
    DAYOFWEEK CHAR(20), 
    DATE DATE, 
    TIME TIME, 
    PDDISTRICT VARCHAR(256), 
    RESOLUTION VARCHAR(256), 
    LOCATION VARCHAR(256), 
    X DECIMAL(31 , 6), 
    Y DECIMAL(31 , 6)
);

Load data

  1. Select Load from the main menu.
  2. Under Select a data source, click Amazon S3. Enter your S3 access key ID and S3 secret access key. Click Browse Files.
  3. From the ibm.dashdb.public folder, select the sfpd_incident_2013_no_hdr.csv data file, then click OK.
  4. In the Do any of the files have columns that contain dates or times? field, select Yes.
    Select the following values:
    • In the The files have columns that contain only dates. list, select MM/DD/YYYY.
    • In the The files have columns that contain only times. list, select HH:MM.
    • In the The files have columns that that contain both dates and times. list, select YYYY-MM-DD HH:MM:SS.

    Then click Next.

  5. In the Schema column, select your schema name. Your schema name is usually listed first. To verify your schema name, click Connect > Connection Information from the main menu to open the Connect your applications to the database page. The schema name will match the user ID listed on the page.

    In the Table column, select the SFPD_INCIDENT table that you created in step 5. Then select the Append new data into the table option, and click Next.

  6. Select Run now, then click Next.
  7. In the Set Notifications tab, click Finish to begin the load process.
  8. After you see the message that the load job was saved successfully, click View the progress of this load.
  9. Verify that the target table contains your data. From the main menu, click Tables. Select the schema and table name to view the table definition and browse data.
  10. Now, you can analyze your data.