Reading and writing files environment variables in DataStage
These environment variables are concerned with reading and writing files.
APT_DELIMITED_READ_SIZE environment variable in DataStage
Set the APT_DELIMITED_READ_SIZE environment variable to control how many bytes DataStage® reads ahead to get the next delimiter.
By default, DataStage reads ahead 500 bytes to get the next delimiter. For streaming inputs (socket, FIFO, and others) this behavior is suboptimal, because DataStage might block (and not output any records). When this variable is set to N, DataStage reads N bytes (minimum legal value is 2) instead of 500 when it reads a delimited record. If a delimiter is not available within N bytes, N is multiplied by 2; when this environment variable is not set, the multiplier is 4.
APT_FILE_IMPORT_BUFFER_SIZE environment variable in DataStage
Set the APT_FILE_IMPORT_BUFFER_SIZE environment variable to specify the size, in kilobytes, of the buffer for reading in files.
The default is 128 (that is, 128 KB). The minimum value is 8: if you set the variable to a value less than 8, then 8 is used. Tune this value upward for long-latency files (typically from heavily loaded file servers).
APT_FILE_EXPORT_BUFFER_SIZE environment variable in DataStage
Set the APT_FILE_EXPORT_BUFFER_SIZE environment variable to specify the size, in kilobytes, of the buffer for writing to files.
The default is 128 (that is, 128 KB). The minimum value is 8: if you set the variable to a value less than 8, then 8 is used. Tune this value upward for long-latency files (typically from heavily loaded file servers).
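The default and minimum rules for the two buffer-size variables can be sketched as follows. This is an illustration of the rule stated above, not DataStage code: the function name `effective_buffer_kb` is hypothetical, and the sketch assumes the value is given as a plain decimal integer.

```python
import os

DEFAULT_BUFFER_KB = 128  # default stated in the text
MIN_BUFFER_KB = 8        # values below this are raised to 8


def effective_buffer_kb(var_name, env=None):
    """Return the buffer size, in KB, that the stated rules would produce."""
    env = os.environ if env is None else env
    raw = env.get(var_name)
    if raw is None:
        return DEFAULT_BUFFER_KB
    # Assumption: the value is a plain decimal integer.
    return max(MIN_BUFFER_KB, int(raw))


print(effective_buffer_kb("APT_FILE_IMPORT_BUFFER_SIZE", {}))
print(effective_buffer_kb("APT_FILE_EXPORT_BUFFER_SIZE",
                          {"APT_FILE_EXPORT_BUFFER_SIZE": "4"}))
```

With no variable set the sketch yields 128, and a setting of 4 is raised to 8.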
APT_IMPORT_FILE_PATTERN_CMD environment variable in DataStage
Set the APT_IMPORT_FILE_PATTERN_CMD environment variable to use commands for file import patterns for filesets.
APT_IMPORT_HANDLE_SHORT environment variable in DataStage
Set the APT_IMPORT_HANDLE_SHORT environment variable so that the import operator successfully imports records that do not contain all the fields in the import schema.
By default, records that do not contain all the fields in the import schema are rejected by the import operator. Set the APT_IMPORT_HANDLE_SHORT environment variable so that these records are imported successfully. The missing fields are given the default value for their type, or null if the field is nullable.
Setting the APT_IMPORT_HANDLE_SHORT environment variable disables optimization for fixed-length schemas by the copy operator, which might cause slower performance for some fixed-length schemas.
The APT_IMPORT_HANDLE_SHORT environment variable is not supported for importing fixed-length records without a record delimiter, as incorrect data might be imported.
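The handling of short records described above can be modeled roughly as follows. This is a sketch only: the schema representation, the `fill_short_record` helper, and the per-type defaults in `TYPE_DEFAULTS` are hypothetical stand-ins, and the real defaults are whatever DataStage defines for each type.

```python
# Hypothetical per-type defaults; the real defaults are defined by DataStage.
TYPE_DEFAULTS = {"int32": 0, "string": "", "dfloat": 0.0}


def fill_short_record(values, schema):
    """Pad a short record out to the full schema.

    schema is a list of (name, type_name, nullable) tuples; values holds the
    fields that were actually present, in schema order.
    """
    record = {}
    for index, (name, type_name, nullable) in enumerate(schema):
        if index < len(values):
            record[name] = values[index]           # field was present
        elif nullable:
            record[name] = None                    # missing and nullable: null
        else:
            record[name] = TYPE_DEFAULTS[type_name]  # missing: type default
    return record


schema = [("id", "int32", False), ("name", "string", False), ("score", "dfloat", True)]
print(fill_short_record([7], schema))
```

Here a record that carries only `id` is completed with an empty string for `name` and null for the nullable `score`.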
APT_IMPORT_PATTERN_USES_CAT environment variable in DataStage
Set the APT_IMPORT_PATTERN_USES_CAT environment variable so that the DataStage Sequential File stage concatenates all the files that match a file pattern before importing the files.
In DataStage Version 11.3 the behavior of the Sequential File stage was changed so that a file pattern is converted into a fileset before the files are imported. This change allows the files to be processed in parallel rather than sequentially.
Set the APT_IMPORT_PATTERN_USES_CAT environment variable to force the behavior of prior versions of DataStage. This prior behavior was to concatenate all the files that match a file pattern before importing the files sequentially.
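The difference between the two strategies can be illustrated with a small model. It is only an analogy for the behavior described above, not how the Sequential File stage is implemented: the function names are hypothetical, and a thread pool stands in for parallel processing across nodes.

```python
import glob
from concurrent.futures import ThreadPoolExecutor


def read_lines(path):
    """Read one file and return its lines without line terminators."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()


def import_concatenated(pattern):
    """Prior behavior: treat all matching files as one sequential stream."""
    records = []
    for path in sorted(glob.glob(pattern)):
        records.extend(read_lines(path))
    return records


def import_as_fileset(pattern, workers=4):
    """Newer behavior: expand the pattern to a file list, read files in parallel."""
    paths = sorted(glob.glob(pattern))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_file = list(pool.map(read_lines, paths))
    return dict(zip(paths, per_file))
```

In the concatenated form the record order across files is fixed by the order of the file list; in the fileset form each file is an independent unit of work.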
APT_IMPORT_PATTERN_USES_FILESET_MOUNTED environment variable in DataStage
Set the APT_IMPORT_PATTERN_USES_FILESET_MOUNTED environment variable to expand the nodes that are used by the DataStage Sequential File stage if the file pattern indicates that the files are on a mounted file system.
If the APT_IMPORT_PATTERN_USES_FILESET_MOUNTED environment variable is set and the pattern indicates that the files are on a mounted file system, all the available nodes are used for the file set.
APT_MAX_DELIMITED_READ_SIZE environment variable in DataStage
Set the APT_MAX_DELIMITED_READ_SIZE environment variable to specify the upper bound for the number of bytes DataStage looks ahead to the next delimiter.
By default, when reading, DataStage reads ahead 500 bytes to get the next delimiter. If the delimiter is not found, DataStage looks ahead 4*500=2000 bytes (1500 more), and so on, multiplying by 4 each time, up to 100,000 bytes. This variable controls that upper bound, which is 100,000 bytes by default. Use this variable instead of APT_DELIMITED_READ_SIZE when you want a read-ahead larger than 500 bytes.
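The read-ahead arithmetic can be worked through with a short sketch. It combines the defaults given here (500 bytes, a factor of 4, an upper bound of 100,000) with the factor of 2 described for APT_DELIMITED_READ_SIZE. The function `read_ahead_schedule` is hypothetical, and two points are assumptions rather than documented behavior: that the last attempt is clamped to the upper bound, and that the same bound applies when APT_DELIMITED_READ_SIZE is set.

```python
def read_ahead_schedule(env):
    """List the successive read-ahead sizes, in bytes, until the cap is reached."""
    if "APT_DELIMITED_READ_SIZE" in env:
        size = int(env["APT_DELIMITED_READ_SIZE"])
        if size < 2:
            raise ValueError("minimum legal value is 2")
        factor = 2   # growth factor when the variable is set
    else:
        size = 500   # default initial read-ahead
        factor = 4   # default growth factor
    cap = int(env.get("APT_MAX_DELIMITED_READ_SIZE", 100_000))

    sizes = []
    while size < cap:
        sizes.append(size)
        size *= factor
    sizes.append(cap)  # assumption: the final attempt is clamped to the cap
    return sizes


print(read_ahead_schedule({}))
# [500, 2000, 8000, 32000, 100000]
print(read_ahead_schedule({"APT_DELIMITED_READ_SIZE": "64"})[:4])
# [64, 128, 256, 512]
```

Raising APT_MAX_DELIMITED_READ_SIZE in this model only extends the tail of the schedule; the first attempts are unchanged.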
APT_STRING_ALLPADS_NOT_EMPTY environment variable in DataStage
Set APT_STRING_ALLPADS_NOT_EMPTY if you do not want APT_STRING_PADCHAR characters to be treated as an empty string.
By default, if a fixed-length string contains only APT_STRING_PADCHAR characters, it is treated as an empty string. Defining this environment variable overrides the default behavior: even if the fixed-length string contains only pad characters, it is treated as a nonempty string.
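The rule can be expressed as a small predicate. This is a sketch of the rule as described, with a hypothetical function name and with the pad character and the flag passed as arguments rather than read from the environment.

```python
def is_empty_fixed_string(value, pad_char="\x00", allpads_not_empty=False):
    """Decide whether a fixed-length string counts as empty."""
    all_pads = len(value) > 0 and all(ch == pad_char for ch in value)
    if all_pads:
        # Default: a string made only of pad characters is empty.
        # With APT_STRING_ALLPADS_NOT_EMPTY defined, it is nonempty.
        return not allpads_not_empty
    return len(value) == 0


print(is_empty_fixed_string("\x00" * 4))                          # True
print(is_empty_fixed_string("\x00" * 4, allpads_not_empty=True))  # False
print(is_empty_fixed_string("ab\x00\x00"))                        # False
```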
APT_STRING_PADCHAR environment variable in DataStage
Set the APT_STRING_PADCHAR environment variable to override the pad character of 0x0 (ASCII null), used by default when DataStage extends, or pads, a string field to a fixed length.
- The input string is fixed length.
- The output string is variable length.
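As an illustration of what padding a string field to a fixed length means, the following sketch extends a value with the default pad character of 0x0 or with an overriding one. The helper `pad_to_fixed_length` is hypothetical, and the pad character is passed as an argument because the exact value syntax accepted by APT_STRING_PADCHAR is not stated here.

```python
def pad_to_fixed_length(value, length, pad_char="\x00"):
    """Extend value to exactly length characters using pad_char."""
    if len(pad_char) != 1:
        raise ValueError("pad_char must be a single character")
    if len(value) > length:
        raise ValueError("value is longer than the fixed length")
    return value.ljust(length, pad_char)


print(repr(pad_to_fixed_length("abc", 6)))       # 'abc\x00\x00\x00'
print(repr(pad_to_fixed_length("abc", 6, " ")))  # 'abc   '
```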
APT_EXPORT_DECIMAL_PLUS_SIGN environment variable in DataStage
It changes Sequential File export default behavior from writing a space before positive decimals to writing a plus character. Its default value is False. This variable has a Boolean type.
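The effect on the sign position can be sketched as follows. The function `export_decimal` is hypothetical and models only the leading sign character; it does not reproduce the full exported decimal format, and treating zero like a positive value is an assumption.

```python
from decimal import Decimal


def export_decimal(value, plus_sign=False):
    """Render a decimal with an explicit sign position."""
    if value < 0:
        sign = "-"
    else:
        # Assumption: zero is handled like a positive value.
        sign = "+" if plus_sign else " "
    return sign + str(abs(value))


print(repr(export_decimal(Decimal("12.50"))))                  # ' 12.50'
print(repr(export_decimal(Decimal("12.50"), plus_sign=True)))  # '+12.50'
```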
APT_IMPORT_NOWARN_STRING_FIELD_OVERRUNS environment variable in DataStage
It suppresses warnings when reading string or ustring fields that are longer than the defined maximum length. Its default value is False. This variable has a Boolean type.
APT_IMPORT_PATTERN_USES_FILESET environment variable in DataStage
When APT_IMPORT_PATTERN_USES_FILESET is set, sequential read (import) turns any file pattern into a fileset before processing the files. This allows the files to be processed in parallel as opposed to sequentially. By default, file pattern processing concatenates the files to be used as the input. Its default value is False. This variable has a Boolean type.
APT_IMPORT_PATTERN_USES_FIND environment variable in DataStage
By default, import turns any file pattern into a fileset before processing the files, using the ls command to get the list of files. Setting this environment variable uses the find command to get the list of files instead. It can be used to work around limitations with ls on systems with low ARG_MAX settings, which limit the number of files that a pattern can read. The behavior of find is not identical to that of ls, so the order of files and the processing of wildcards can differ. Its default value is False. This variable has a Boolean type.
APT_NOSPACE_IN_IMPORT_PATTERN environment variable in DataStage
It disables the feature of importing files containing spaces in their names using file patterns. Its default value is False. This variable has a Boolean type.
APT_PREVIOUS_FINAL_DELIM_COMPATIBLE environment variable in DataStage
It enables final delimiter behavior as implemented prior to DataStage 7.0.1. Its default value is False. This variable has a Boolean type.
APT_S3_READ_SIZE environment variable in DataStage
It defines the size of the data that is read from S3 in a single chunk. Its default value is “500”. This variable has a Number type.
APT_S3_UPLOAD_PART_SIZE environment variable in DataStage
It defines the size of the part uploaded to S3 as a single chunk. Its default value is “500”. This variable has a Number type.