Managing environment variables in DataStage

You can manage environment variables for IBM® DataStage® in several different ways.

In DataStage, you can manage environment variables in three ways: at the project level, at the flow level, and by using the dsjob command-line interface (CLI).

Project level

To set up environment variables from the project level, complete the following steps:
  1. Open a project, then click the Manage tab.
  2. Click Environments > Templates > New template +, or edit an existing template by opening it, clicking New environment variable +, and adding a key-value pair.
  3. Specify the environment details and configuration.
  4. In the Environment variables field, specify runtime environment variables by using a name=value pair.
    For example:
    CC_MSG_LEVEL=1
    APT_CONFIG_FILE=/ds-storage/2nodes.apt
  5. Click Create.
  6. From a DataStage flow, click the Settings icon on the toolbar, then click Run on the Settings page.
  7. Select the environment that you created, then click Save.

DataStage flow level

To select environment variables from the flow level, complete the following steps:
  1. Open a DataStage flow.
  2. Click the Add parameters icon ({#}) on the toolbar.
  3. Select one or more environment variables from the list of available environment variables.
  4. Click Add, then click Return to canvas.

Command-line interface (CLI)

You can pass in environment variables from the CLI. See the following example:
cpdctl dsjob run --job "TestSimpleJavaWriteJSON.DataStage job" --project Project2021 --wait 300 --param FILE_NAME=/ds-storage/output/MySales5k.json --env "\$CC_MSG_LEVEL"="1"
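
To pass more than one environment variable in a single run, one option is to repeat the --env option, once per name=value pair. This is a sketch that assumes the --env option can be repeated and uses a hypothetical job name (MyFlow):
cpdctl dsjob run --job "MyFlow.DataStage job" --project Project2021 --env "\$CC_MSG_LEVEL"="1" --env "\$APT_CONFIG_FILE"="/ds-storage/2nodes.apt"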

For more information, see DataStage command-line tools.

Override rules

The following override rules apply to the environment variables:
  • dsjob CLI environment variables overwrite DataStage flow-level environment variables.
  • DataStage flow-level environment variables overwrite runtime environment-level environment variables.
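
For example, suppose that CC_MSG_LEVEL is set in all three places (hypothetical values):
  Runtime environment:  CC_MSG_LEVEL=3
  DataStage flow level: CC_MSG_LEVEL=2
  dsjob CLI:            --env "\$CC_MSG_LEVEL"="1"
The job runs with CC_MSG_LEVEL=1: the CLI value overrides the flow-level value, which in turn overrides the runtime environment value.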

Environment variables that are defined by default

The following list contains the environment variables that are defined by default. A sketch that shows how to restore the traditional (11.7) values follows the list.

APT_OLD_BOUNDED_LENGTH
  • Cloud Pak for Data setting: APT_OLD_BOUNDED_LENGTH = true
  • Cloud Pak for Data behavior: The way PX internally handles bounded-length strings and raw fields was changed from treating the fields as variable length to fixed length, which improves record-processing performance. This variable is set to true so that data sets written in the old variable-length format can still be read.
  • Traditional (11.7) default behavior setting: APT_OLD_BOUNDED_LENGTH = false
  • Behavior with Traditional (11.7) setting: Data sets written in old variable length format cannot be read.
APT_THREAD_SAFE_FAST_ALLOC
  • Cloud Pak for Data setting: APT_THREAD_SAFE_FAST_ALLOC = threadsafe
  • Cloud Pak for Data behavior: Specifies the threading mode for fast allocators, which are used to allocate commonly used objects such as strings. With the threadsafe value, fast-allocator instances are maintained within thread-specific storage.
  • Traditional (11.7) default behavior setting: APT_THREAD_SAFE_FAST_ALLOC = legacy (or false)
  • Behavior with Traditional (11.7) setting: Fast-allocator instances are shared between threads.
APT_DONT_ALLOW_DOUBLE_TSORT_COMBINE
  • Cloud Pak for Data setting: APT_DONT_ALLOW_DOUBLE_TSORT_COMBINE = true
  • Cloud Pak for Data behavior: By default, the combined operator controller merges as many tsort operators as possible. This variable is set to true so that the combined operator does not merge more than one tsort operator when it is followed by a Join stage downstream. This setting avoids hang scenarios that involve tsort and Join stages.
  • Traditional (11.7) default behavior setting: APT_DONT_ALLOW_DOUBLE_TSORT_COMBINE = false
  • Behavior with Traditional (11.7) setting: The combined operator controller merges as many tsort operators as possible.
APT_SCRATCH_RESERVE_MB
  • Cloud Pak for Data setting: APT_SCRATCH_RESERVE_MB = 10
  • Cloud Pak for Data behavior: Operators such as sort and buffer use scratch disks. A scratch disk is used for creating scratch files only if it has at least the specified minimum free space, in MB. If the free space is less than the minimum, the next scratch disk that meets the minimum free-space criteria is used. This setting applies when multiple scratch disks are defined in APT_CONFIG_FILE.
  • Traditional (11.7) default behavior setting: APT_SCRATCH_RESERVE_MB = false
  • Behavior with Traditional (11.7) setting: The default scratch reserve space of 2 GB is used.
APT_DISABLE_JOBMON_SCHEMA_STRING
  • Cloud Pak for Data setting: APT_DISABLE_JOBMON_SCHEMA_STRING = true
  • Cloud Pak for Data behavior: Disables sending schema information in linkstats messages to the job monitor.
  • Traditional (11.7) default behavior setting: APT_DISABLE_JOBMON_SCHEMA_STRING = false
  • Behavior with Traditional (11.7) setting: Schema information is sent in linkstats messages. Because Cloud Pak for Data does not currently process the schema string from linkstats, this information can be disabled.
APT_DS_COMPRESSION
  • Cloud Pak for Data setting: APT_DS_COMPRESSION = true
  • Cloud Pak for Data behavior: The variable enables compression for data sets.
  • Traditional (11.7) default behavior setting: APT_DS_COMPRESSION = false
  • Behavior with Traditional (11.7) setting: Data sets are not compressed.
APT_IMPORT_FORCE_QUOTE_DELIM
  • Cloud Pak for Data setting: APT_IMPORT_FORCE_QUOTE_DELIM = true
  • Cloud Pak for Data behavior: By default, import of quoted fields looks for an opening and a closing quote character. If the field data contains the quote character, that character is incorrectly assumed to be the closing quote. Setting this environment variable causes import to recognize a closing quote character only when it is followed by the field's delimiter character, which allows fields that contain embedded quote characters to be imported correctly.
  • Traditional (11.7) default behavior setting: APT_IMPORT_FORCE_QUOTE_DELIM = false
  • Behavior with Traditional (11.7) setting: Records with embedded quote characters in their fields are not imported. By default, a warning is issued and the records are rejected.
APT_TSORT_SCRATCH_COMPRESSION
  • Cloud Pak for Data setting: APT_TSORT_SCRATCH_COMPRESSION = true
  • Cloud Pak for Data behavior: Tsort scratch files are written to disk in compressed mode to save space.
  • Traditional (11.7) default behavior setting: APT_TSORT_SCRATCH_COMPRESSION = false
  • Behavior with Traditional (11.7) setting: Tsort scratch files are written to disk without compression.
APT_DOWNGRADED_MESSAGES
  • Cloud Pak for Data setting: APT_DOWNGRADED_MESSAGES = "Picked up JAVA_TOOL_OPTIONS:"
  • Cloud Pak for Data behavior: Downgrades the severity of orchestrate error or warning messages that contain the string "Picked up JAVA_TOOL_OPTIONS:" to Informational.
  • Traditional (11.7) default behavior setting: APT_DOWNGRADED_MESSAGES = false
  • Behavior with Traditional (11.7) setting: No messages are downgraded. However, you can override this variable with new values. The format of the variable is a list of full or partial error or warning messages, without the timestamp part, whose severity should be downgraded to Informational. Each message is separated by the unique string #?#.
  • Example: APT_DOWNGRADED_MESSAGES = READ is not supported in state standby#?#IPv6 is not currently supported.
APT_IMPEXP_INFER_EOL_RECORD_DELIM
  • Cloud Pak for Data setting: APT_IMPEXP_INFER_EOL_RECORD_DELIM = true
  • Cloud Pak for Data behavior: The sequential file import layer also looks for Windows end-of-line (EOL) record delimiters when UNIX EOL record delimiters are set, which allows such files to be imported correctly.
  • Traditional (11.7) default behavior setting: APT_IMPEXP_INFER_EOL_RECORD_DELIM = false
  • Behavior with Traditional (11.7) setting: The sequential file import layer looks only for the EOL record delimiter that is set in the import properties.
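
If you need a job to run with the traditional (11.7) behavior, one option is to set the corresponding traditional values that are listed above in the Environment variables field of an environment template. This is a minimal sketch that restores only the traditional compression behavior for data sets and tsort scratch files; the other variables in this list can be reset the same way:
CC_MSG_LEVEL is unaffected; set only the variables whose behavior you want to revert, for example:
APT_DS_COMPRESSION=false
APT_TSORT_SCRATCH_COMPRESSION=false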