DB2 Version 10.1 for Linux, UNIX, and Windows

Scenario: Processing a stream of files with the ingest utility

The following scenario shows how you can configure your data warehouse to automatically ingest an ongoing stream of data files.

The problem: In some data warehouses, files arrive in an ongoing stream throughout the day and need to be processed as they arrive. This means that each time a new file arrives, another INGEST command needs to be run specifying the new file to process.

The solution: You can write a script that automatically checks for new files, generates a new INGEST command, and runs that command. The sample script ingest_files.sh is an example of such a script. You also need to create a crontab entry to specify how frequently the shell script should run.
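For reference, the following is a minimal sketch of what such a polling script might look like. It is not the ingest_files.sh sample itself: the input directory, database, schema, and table values, the .del file suffix, the archive subdirectory, and the FORMAT DELIMITED clause are illustrative assumptions that you would replace with values matching your own environment and file layout.

  #!/bin/sh
  # Illustrative sketch of a polling script; not the ingest_files.sh sample.
  # Set up the DB2 command-line environment, because cron does not read the
  # login profile (the path to db2profile is an assumption for a typical
  # instance owner's home directory).
  . $HOME/sqllib/db2profile

  # Sample input values; replace them with your own.
  INPUT_FILES_DIRECTORY=$HOME/ingest/incoming
  DATABASE_NAME=SAMPLE
  SCHEMA_NAME=DBUSER1
  TABLE_NAME=TABLE1
  SCRIPT_PATH=$HOME/bin

  mkdir -p "$SCRIPT_PATH/archive"

  # Process each file that has arrived since the last run, then move it
  # aside so that the next run does not ingest it again.
  for file in "$INPUT_FILES_DIRECTORY"/*.del; do
      [ -e "$file" ] || continue                  # nothing new to process

      db2 CONNECT TO "$DATABASE_NAME"
      db2 "INGEST FROM FILE $file FORMAT DELIMITED INSERT INTO $SCHEMA_NAME.$TABLE_NAME"
      rc=$?
      db2 CONNECT RESET

      # Archive the file only if the INGEST command completed successfully
      # (a return code of 0 from the command line processor).
      if [ $rc -eq 0 ]; then
          mv "$file" "$SCRIPT_PATH/archive/"
      fi
  done

Moving processed files to an archive directory is one simple way to ensure that a file is ingested only once; the ingest_files.sh sample may track processed files differently, but the overall flow (detect new files, run an INGEST command for each, record what has been processed) is the same.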

To implement this mechanism (that is, the script and the crontab entry) for processing the stream of files, the user completes the following steps:
  1. The user creates a new script, using ingest_files.sh as a template, by doing the following:
    1. Replace the following sample input values with the user's own values:
      • INPUT_FILES_DIRECTORY
      • DATABASE_NAME
      • SCHEMA_NAME
      • TABLE_NAME
      • SCRIPT_PATH
    2. Replace the sample INGEST command with one that matches the user's input file format and target table.
    3. Save the script as populate_table1_script in the directory referenced by the crontab entry (for example, $HOME/bin).
  2. The user adds an entry to the crontab file to specify how frequently the script runs. Because the user wants the script to run once a minute, 24 hours a day, every day of the year, the user adds the following line (an asterisk in each of the five schedule fields means that the command runs every minute):
    * * * * * $HOME/bin/populate_table1_script
  3. The user tests the script by creating new input files, adding them to the input directory (INPUT_FILES_DIRECTORY), and verifying that the new rows appear in the target table.
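For example, assuming the illustrative values from the sketch earlier (comma-delimited .del files arriving in $HOME/ingest/incoming, and a target table DBUSER1.TABLE1 in database SAMPLE whose columns match the test data), a quick test might look like this:

  # Create a small comma-delimited test file (the file name and column
  # layout are illustrative and must match your target table).
  printf '1,first test row\n2,second test row\n' > /tmp/test_batch_001.del
  mv /tmp/test_batch_001.del $HOME/ingest/incoming/

  # Wait for the next scheduled run (up to a minute), then confirm that
  # the new rows reached the target table.
  db2 CONNECT TO SAMPLE
  db2 "SELECT COUNT(*) FROM DBUSER1.TABLE1"
  db2 CONNECT RESET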