IBM InfoSphere DataStage and InfoSphere QualityStage, Version 8.5

Built-in functions

This section defines functions that are provided by the Transformation Language. It is presented in a series of tables that deal with data transformation functions of the following types:

Lookup table functions
Data conversion functions
Mathematical functions
String field functions
Ustring field functions
Bit manipulation functions
Job monitoring functions
Miscellaneous functions

When a function generates an output value, it returns the result. For functions with optional arguments, simply omit the optional argument to accept the default value. Default conversions among integer, float, decimal, and numeric string types are supported in the input arguments and the return value of the function. All integers can be signed or unsigned.

The transform operator has default NULL handling at the record-level with individual field "overrides". Options can be entered at the record level or the field level.

Lookup table functions

Function	Description
lookup( `lookup_table` )	Performs a lookup on the table using the current input record. It fills the current record of the lookup table with the first record found. If a match is not found, the current record is empty. If this is called multiple times on the same record, the record is filled with the current match if there is one and a new lookup will not be done.
next_match( `lookup_table` )	Gets the next record matched in the lookup and puts it into the current record of the table.
clear_lookup( `lookup_table` )	Checks to see if the current lookup record has a match. If this method returns false directly after the lookup() call, no matches were found in the table. Returns a boolean value specifying whether the record is empty or not.
int8 is_match( `lookup_table` )	Checks to see if the current lookup record has a match. If this method returns false directly after the lookup() call, no matches were found in the table. Returns a boolean value specifying whether the record is empty or not.

Data conversion functions

date field functions

InfoSphere® DataStage® performs no automatic type conversion of date fields. Either an input data set must match the operator interface or you must effect a type conversion by means of the transform or modify operator.

A date conversion to or from a numeric field can be specified with any InfoSphere DataStage numeric data type. InfoSphere DataStage performs the necessary modifications and either translates a numeric field to the source data type or translates a conversion result to the numeric data type of the destination. For example, you can use the transformation function month_day_from_date() to convert a date to an int8, or to an int16, int32, dfloat, and so on

date format

The default format of the date contained in the string is yyyy-mm-dd. However, you can specify an optional format string that defines another format. The format string requires that you provide enough information for InfoSphere DataStage to determine a complete date (either day, month, and year, or year and day of year).

The format_string can contain one or a combination of the following elements:

Table 1. Date format tags
Tag	Variable width availability	Description	Value range	Options
%d	import	Day of month, variable width	1...31	s
%dd		Day of month, fixed width	01...31	s
%ddd	with v option	Day of year	1...366	s, v
%m	import	Month of year, variable width	1...12	s
%mm		Month of year, fixed width	01...12	s
%mmm		Month of year, short name, locale specific	Jan, Feb ...	t, u, w
%mmmm	import/export	Month of year, full name, locale specific	January, February ...	t, u, w, -`N`, +`N`
%yy		Year of century	00...99	s
%yyyy		Four digit year	0001 ...9999
%NNNNyy		Cutoff year plus year of century	yy = 00...99	s
%e		Day of week, Sunday = day 1	1...7
%E		Day of week, Monday = day 1	1...7
%eee		Weekday short name, locale specific	Sun, Mon ...	t, u, w
%eeee	import/export	Weekday long name, locale specific	Sunday, Monday ...	t, u, w, -`N`, +`N`
%W	import	Week of year (ISO 8601, Mon)	1...53	s
%WW		Week of year (ISO 8601, Mon)	01...53	s

When you specify a date format string, prefix each component with the percent symbol (%) and separate the string's components with a suitable literal character.

The default date_format is %yyyy-%mm-%dd.

Where indicated the tags can represent variable-width data elements. Variable-width date elements can omit leading zeroes without causing errors.

The following options can be used in the format string where indicated in the table:

s: Specify this option to allow leading spaces in date formats. The s option is specified in the form:; %(tag,s); Where tag is the format string. For example:; %(m,s); indicates a numeric month of year field in which values can contain leading spaces or zeroes and be one or two characters wide. If you specified the following date format property:; %(d,s)/%(m,s)/%yyyy; Then the following dates would all be valid:; 8/ 8/1958; 08/08/1958; 8/8/1958

v: Use this option in conjunction with the %ddd tag to represent day of year in variable-width format. So the following date property:; %(ddd,v); represents values in the range 1 to 366. (If you omit the v option then the range of values would be 001 to 366.)

u: Use this option to render uppercase text on output.

w: Use this option to render lowercase text on output.

t: Use this option to render titlecase text (initial capitals) on output.

The u, w, and t options are mutually exclusive. They affect how text is formatted for output. Input dates will still be correctly interpreted regardless of case.

-N: Specify this option to left justify long day or month names so that the other elements in the date will be aligned.

+N: Specify this option to right justify long day or month names so that the other elements in the date will be aligned.

Names are left justified or right justified within a fixed width field of N characters (where N is between 1 and 99). Names will be truncated if necessary. The following are examples of justification in use:

%dd-%(mmmm,-5)-%yyyyy

21-Augus-2006

%dd-%(mmmm,-10)-%yyyyy

21-August    -2005

%dd-%(mmmm,+10)-%yyyyy

21-    August-2005

The locale for determining the setting of the day and month names can be controlled through the locale tag. This has the format:

%(L,'locale')

Where locale specifies the locale to be set using the language_COUNTRY.variant naming convention supported by ICU. See IBM® InfoSphere DataStage and QualityStage® Globalization Guide for a list of locales. The default locale for month names and weekday names markers is English unless overridden by a %L tag or the APT_IMPEXP_LOCALE environment variable (the tag takes precedence over the environment variable if both are set).

Use the locale tag in conjunction with your time format, for example the format string:

%(L,'es')%eeee, %dd %mmmm %yyyy

Specifies the Spanish locale and would result in a date with the following format:

miércoles, 21 septembre 2005

The format string is subject to the restrictions laid out in the following table. A format string can contain at most one tag from each row. In addition some rows are mutually incompatible, as indicated in the 'incompatible with' column. When some tags are used the format string requires that other tags are present too, as indicated in the 'requires' column.

Table 2. Format tag restrictions
Element	Numeric format tags	Text format tags	Requires	Incompatible with
year	%yyyy, %yy, %[nnnn]yy	-	-	-
month	%mm, %m	%mmm, %mmmm	year	week of year
day of month	%dd, %d	-	month	day of week, week of year
day of year	%ddd		year	day of month, day of week, week of year
day of week	%e, %E	%eee, %eeee	month, week of year	day of year
week of year	%WW		year	month, day of month, day of year

When a numeric variable-width input tag such as %d or %m is used, the field to the immediate right of the tag (if any) in the format string cannot be either a numeric tag, or a literal substring that starts with a digit. For example, all of the following format strings are invalid because of this restriction:

%d%m-%yyyy

%d%mm-%yyyy

%(d)%(mm)-%yyyy

%h00 hours

The year_cutoff is the year defining the beginning of the century in which all two-digit years fall. By default, the year cutoff is 1900; therefore, a two-digit year of 97 represents 1997.

You can specify any four-digit year as the year cutoff. All two-digit years then specify the next possible year ending in the specified two digits that is the same or greater than the cutoff. For example, if you set the year cutoff to 1930, the two-digit year 30 corresponds to 1930, and the two-digit year 29 corresponds to 2029.

On import and export, the year_cutoff is the base year.

This property is mutually exclusive with days_since, text, and julian.

You can include literal text in your date format. Any Unicode character other than null, backslash, or the percent sign can be used (although it is better to avoid control codes and other non-graphic characters). The following table lists special tags and escape sequences:

Tag	Escape sequence
%%	literal percent sign
\%	literal percent sign
\n	newline
\t	horizontal tab
\\	single backslash

date Uformat

The date uformat provides support for international components in date fields. It's syntax is:

String%macroString%macroString%macroString

where %macro is a date formatting macro such as %mmm for a 3-character English month. Only the String components of date uformat can include multi-byte Unicode characters.

Note: Any argument that has to be double quoted cannot be a field name or a local variable. An argument must have the data format of its type.

Function	Description
date date_from_days_since ( `int32` , " `date` " \| `format_variable` )	Returns date by adding the given integer to the baseline date. Converts an integer field into a date by adding the integer to the specified base `date`. The `date` must be in the format `yyyy`-`mm`-`dd` and must be either double quoted or a variable.
date date_from_julian_day( `uint32` )	Returns the date given a Julian day.
date date_from_string ( `string` , " `date_format` " \| `date_uformat` \| `format_variable` )	Returns a date from the given `string` formatted in the optional format specification. By default the `string` format is `yyyy`-`mm`-`dd`. For format descriptions, see and "date Uformat" .
date date_from_ustring ( `string` , " `date_format` " \| `date_uformat` \| `format_variable` )	Returns a date from the given `ustring` formatted in the optional format specification. By default the `ustring` format is `yyyy`-`mm`-`dd`. For format descriptions, see and "date Uformat" .
string string_from_date ( `date` , " `date_format` " \| `date_uformat` )	Converts the `date` to a string representation using the given format specification. By default the `ustring` format is `yyyy`-`mm`-`dd`. For format descriptions, see and "date Uformat" .
ustring ustring_from_date ( `date` , " `date_format` " \| `date_uformat` )	Converts the date to a ustring representation using the given format specification. By default the `ustring` format is `yyyy` - `mm` -`dd`. For format descriptions, see and "date Uformat" .
date date_from_timestamp( `timestamp` )	Returns the date from the given `timestamp` .
int32 days_since_from_date ( `date` , " `source_date` " \| `format_variable` )	Returns a value corresponding to the number of days from `source_date` to `date` . `source_date` must be in the form `yyyy` - `mm` -`dd` and must be double quoted or be a variable.
uint32 julian_day_from_date( `date` )	Returns a Julian date given the `date` .
int8 month_day_from_date( `date` )	Returns the day of the month given the `date` . For example, the date 07-23-2001 returns 23.
int8 month_from_date( `date` )	Returns the month from the given `date` . For example, the date 07-23-2001 returns 7.
date next_weekday_from_date ( `date` , " `day` " \| `format_variable` )	The value returned is the date of the specified day of the week soonest after `date` (including the `date` ). The `day` argument is optional. It is a string or variable specifying a day of the week. You can specify `day` by either the first three characters of the day name or the full day name. By default, the value is Sunday.
date previous_weekday_from_date ( `date` , " `day` " \| `format_variable` )	Returns the previous weekday date from `date` . The destination contains the closest date for the specified day of the week earlier than the source date (including the source date) The `day` argument is optional. It is a string or variable specifying a day of the week. You can specify `day` using either the first three characters of the day name or the full day name. By default, the value is Sunday.
int8 weekday_from_date ( `date` , " `origin_day` " \| `format_variable` )	Returns the day of the week from `date` . The optional argument `origin_day` is a string or variable specifying the day considered to be day zero of the week. You can specify the day using either the first three characters of the day name or the full day name. If omitted, Sunday is day zero.
int16 year_day_from_date( `date` )	Returns the day of the year (1-366) from `date` .
int16 year_from_date( `date` )	Returns the year from `date` . For example, the date 07-23-2001 returns 2001.
int8 year_week_from_date( `date` )	Returns the week of the year from `date` . For example, the date 07-23-2001 returns 30.

decimal and float Field Functions

You can do the following transformations using the decimal and float field functions.

Assign a decimal to an integer or float or numeric string, or compare a decimal to an integer or float or numeric string.
Specify an optional fix_zero argument (int8) to cause a decimal field containing all zeros to be treated as a valid zero.

Optionally specify a value for the rounding type (r_type) for many conversions. The values of r_type are:

ceil: Round the source field toward positive infinity. This mode corresponds to the IEEE 754 Round Up mode.
Examples: 1.4 -> 2, -1.6 -> -1
floor: Round the source field toward negative infinity. This mode corresponds to the IEEE 754 Round Down mode.
Examples: 1.6 -> 1, -1.4 -> -2
round_inf: Round or truncate the source field toward the nearest representable value, breaking ties by rounding positive values toward positive infinity and negative values toward negative infinity. This mode corresponds to the COBOL ROUNDED mode.
Examples: 1.4 -> 1, 1.5 -> 2, -1.4 -> -1, -1.5 -> -2

trunc_zero (default): Discard any fractional digits to the right of the right-most fractional digit supported in the destination, regardless of sign. For example, if the destination is an integer, all fractional digits are truncated. If the destination is another decimal with a smaller scale, round or truncate to the scale size of the destination decimal. This mode corresponds to the COBOL INTEGER-PART function.

Examples: 1.6 -> 1, -1.6 -> -1

Function	Description
decimal decimal_from_decimal ( `decimal` , " `r_type` " \| `format_variable` )	Returns decimal in decimal representation, changing the precision and scale according to the returned type. The rounding type, r_type, might be ceil, floor, round_inf, or trunc_zero as described above this table. The default rtype is trunc_zero.
decimal decimal_from_dfloat ( `dfloat` , " `r_type` " \| `format_variable` )	Returns `dfloat` in decimal representation. The rounding type, `r_type` , might be ceil, floor, round_inf, or trunc_zero as described above this table. The default is trunc_zero.
decimal decimal_from_string ( `string` , " `r_type` " \| `format_variable` )	Returns `string` in decimal representation. The rounding type, `r_type` , might be ceil, floor, round_inf, or trunc_zero as described above this table. The default is trunc_zero.
decimal decimal_from_ustring ( `ustring` , " `r_type` " \| `format_variable` )	Returns `ustring` in decimal representation. The rounding type, `r_type` , might be ceil, floor, round_inf, or trunc_zero as described above this table. The default is trunc_zero.
dfloat dfloat_from_decimal ( `decimal` , "fix-zero" \| `format_variable` )	Returns decimal in dfloat representation.
int32 int32_from_decimal ( `decimal` , "`r_type` fix_zero")	Returns int32 in decimal representation. The rounding type, `r_type` , might be ceil, floor, round_inf, or trunc_zero as described above this table. The default is trunc_zero.
int64 int64_from_decimal ( `decimal` , "`r_type` fix_zero")	Returns int64 in decimal representation. The rounding type, `r_type` , might be ceil, floor, round_inf, or trunc_zero as described above this table. The default is trunc_zero.
uint64 uint64_from_decimal ( `decimal` , "`r_type` fix_zero")	Returns uint64 in decimal representation. The rounding type, `r_type` , might be ceil, floor, round_inf, or trunc_zero as described above this table. The default is trunc_zero.
string string_from_decimal ( `decimal` , "fix_zero suppress_zero")	Returns string in decimal representation. fix_zero causes a decimal field containing all zeros to be treated as a valid zero. suppress_zero argument specifies that the returned ustring value will have no leading or trailing zeros. Examples: 000.100 -> 0.1; 001.000 -> 1; -001.100 -> -1.1
ustring ustring_from_decimal ( `decimal` , "fix_zero suppress_zero" \| `format_variable` )	Returns ustring in decimal representation. fix_zero causes a decimal field containing all zeros to be treated as a valid zero. suppress_zero argument specifies that the returned ustring value will have no leading or trailing zeros. Examples: 000.100 -> 0.1; 001.000 -> 1; -001.100 -> -1.1
string string_from_decimal ( `decimal` , "fix_zero suppress_zero" \| `format_variable` )	Returns ustring in decimal representation. fix_zero causes a decimal field containing all zeros to be treated as a valid zero. suppress_zero argument specifies that the returned ustring value will have no leading or trailing zeros. Examples: 000.100 -> 0.1; 001.000 -> 1; -001.100 -> -1.1
dfloat mantissa_from_dfloat( `dfloat` )	Returns the mantissa (the digits right of the decimal point) from `dfloat` .
dfloat mantissa_from_decimal ( `decimal` )	Returns the mantissa (the digits right of the decimal point) from `decimal` .

raw Field Functions

Use the raw field functions to transform a string into a raw data type and to determine the length of a raw value.

Function	Description
raw raw_from_string( `string` )	Returns `string` in raw representation.
raw u_raw_from_string( `ustring` )	Returns `ustring` in raw representation.
int32 raw_length( `raw` )	Returns the length of the raw field.

time and timestamp field functions

InfoSphere DataStage performs no automatic conversions to or from the time and timestamp data types. You must use the modify or transform operator if you want to convert a source or destination field. Most field conversions extract a portion of the time, such as hours or minutes, and write it into a destination field.

Time conversion to a numeric field can be used with any InfoSphere DataStage numeric data type. InfoSphere DataStage performs the necessary modifications to translate a conversion result to the numeric data type of the destination. For example, you can use the transformation function hours_from_time() to convert a time to an int8, or to an int16, int32, dfloat, and so on.

The string_from_time() and time_from_string() conversion functions take a format as a parameter of the conversion. The default format of the time in the string is hh:nn:ss. However, you can specify an optional format string defining another time format. The format string must contain a specification for hours, minutes, and seconds.

time Uformat

The time uformat provides support for international components in time fields. It's syntax is:

 String % macroString % macroString % macroString

where %macro is a time formatting macro such as %hh for a two-digit hour. See below for a description of the date format macros. Only the String components of time uformat can include multi-byte Unicode characters.

timestamp Uformat

This format is a concatenation of the date uformat and time uformat which are described in "date Uformat" and "time Uformat". The order of the formats does not matter, but the two formats cannot be mixed.

time Format

The possible components of the time_format string are given in the following table:

Table 3. Time format tags
Tag	Variable width availability	Description	Value range	Options
%h	import	Hour (24), variable width	0...23	s
%hh		Hour (24), fixed width	0...23	s
%H	import	Hour (12), variable width	1...12	s
%HH		Hour (12), fixed width	01...12	s
%n	import	Minutes, variable width	0...59	s
%nn		Minutes, fixed width	0...59	s
%s	import	Seconds, variable width	0...59	s
%ss		Seconds, fixed width	0...59	s
%s.`N`	import	Seconds + fraction (`N` = 0...6)	–	s, c, C
%ss.`N`		Seconds + fraction (`N` = 0...6)	–	s, c, C
%SSS	with v option	Milliseconds	0...999	s, v
%SSSSSS	with v option	Microseconds	0...999999	s, v
%aa	German	am/pm marker, locale specific	am, pm	u, w

By default, the format of the time contained in the string is %hh:%nn:%ss. However, you can specify a format string defining the format of the string field.

You must prefix each component of the format string with the percent symbol. Separate the string's components with any character except the percent sign (%).

Where indicated the tags can represent variable-fields on import, export, or both. Variable-width date elements can omit leading zeroes without causing errors.

The following options can be used in the format string where indicated:

s: Specify this option to allow leading spaces in time formats. The s option is specified in the form:; %(tag,s); Where tag is the format string. For example:; %(n,s); indicates a minute field in which values can contain leading spaces or zeroes and be one or two characters wide. If you specified the following date format property:; %(h,s):$(n,s):$(s,s); Then the following times would all be valid:; 20: 6:58; 20:06:58; 20:6:58

v: Use this option in conjunction with the %SSS or %SSSSSS tags to represent milliseconds or microseconds in variable-width format. So the time property:; %(SSS,v); represents values in the range 0 to 999. (If you omit the v option then the range of values would be 000 to 999.)

u: Use this option to render the am/pm text in uppercase on output.

w: Use this option to render the am/pm text in lowercase on output.

c: Specify this option to use a comma as the decimal separator in the %ss.N tag.

C: Specify this option to use a period as the decimal separator in the %ss.N tag.

The c and C options override the default setting of the locale.

The locale for determining the setting of the am/pm string and the default decimal separator can be controlled through the locale tag. This has the format:

%(L,'locale')

Where locale specifies the locale to be set using the language_COUNTRY.variant naming convention supported by ICU. See IBM InfoSphere DataStage and QualityStage Globalization Guide for a list of locales. The default locale for am/pm string and separators markers is English unless overridden by a %L tag or the APT_IMPEXP_LOCALE environment variable (the tag takes precedence over the environment variable if both are set).

Use the locale tag in conjunction with your time format, for example:

%L('es')%HH:%nn %aa

Specifies the Spanish locale.

Table 4. Format tag restrictions
Element	Numeric format tags	Text format tags	Requires	Incompatible with
hour	%hh, %h, %HH, %H	-	-	-
am/pm marker	-	%aa	hour (%HH)	hour (%hh)
minute	%nn, %n	-	-	-
second	%ss, %s	-	-	-
fraction of a second	%ss.N, %s.N, %SSS, %SSSSSS	-	-	-

Tag	Escape sequence
%%	literal percent sign
\%	literal percent sign
\n	newline
\t	horizontal tab
\\	single backslash

Function	Description
int8 hours_from_time( `time` )	Returns the hour portion of the given time.
int32 microseconds_from_time( `time` )	Returns the number of microseconds from the given time.
dfloat midnight_seconds_from_time ( `time` )	Returns the number of seconds from midnight to `time` .
int8 minutes_from_time( `time` )	Returns the number of minutes from `time` .
dfloat seconds_from_time( `time` )	Returns the number of seconds from time.
dfloat seconds_since_from_timestamp ( `timestamp` , " `source_timestamp_string` " \| `format_variable` )	Returns the number of seconds from `timestamp` to the base timestamp, or optionally the second timestamp argument for the number of seconds between timestamps. The `source_timestamp_string` argument must be double quoted or be a variable.
time time_from_midnight_seconds( `dfloat` )	Returns the time given the number of seconds ( `dfloat` ) since midnight.
time time_from_string ( `string` , `time_format` \| `time_uformat` \| `format_variable`)	Returns a time representation of `string` using the optional `time_format` , `time_uformat` , or `format_variable`. By default, the time format is `hh`:`nn`:`ss`. For format descriptions, see and "time Uformat" .
time time_from_ustring ( `ustring` , `time_format` \| `time_uformat` \| `format_variable`)	Returns a time representation of `ustring` using the optional `time_format` , `time_uformat` , or `format_variable` specification. By default, the time format is `hh`:`nn`:`ss`. For format descriptions, see and "time Uformat" .
string string_from_time ( `time` , " `time_format` " \| `format_variable` \| `time_uformat` )	Returns a string from `time`. The format argument is optional.The default time format is `hh`:`nn`:`ss`. For format descriptions, see and "time Uformat" .
string string_from_time ( `time` , " `time_format` " \| `format_variable` \| `time_format` )	Returns a ustring from `time` . The format argument is optional.The default time format is `hh`:`nn`:`ss`. For format descriptions, see and "time Uformat" .
time time_from_timestamp( `timestamp` )	Returns the time from `timestamp` .
date date_from_timestamp( `timestamp` )	Returns the date from the given `timestamp` .
timestamp timestamp_from_date_time ( `date` , `time` )	Returns a timestamp from `date` and `time` . The `date` specifies the date portion ( `yyyy` - `nn` - `dd` ) of the timestamp. The `time` argument specifies the time to be used when building the timestamp. The `time` argument must be in the `hh` : `nn` :ss format.
timestamp timestamp_from_seconds_since ( `dfloat` , " `original_timestamp_string` " \| `format_variable` )	Returns the timestamp from the number of seconds ( `dfloat` ) from the base timestamp or the `original_timestamp_string` argument. The `original_timestamp_string` must be double quoted or be a variable.
timestamp timestamp_from_string ( `string` , " `timestamp_format` " \| `timestamp_uformat` \| `format_variable`)	Returns a timestamp from string, in the optional `timestamp_format` `,` `timestamp_uformat`, or `format_variable` . The `timestamp_format` must be double quoted or be a variable. The default format is `yyyy` - `nn` - `dd` `hh` : `nn` : `ss` . `timestamp_format` is described in .
timestamp timestamp_from_ustring ( `ustring` , " `timestamp_format` " \| `timestamp_uformat` \| `format_variable`)	Returns a timestamp from `ustring` , in the optional format specification. The `timestamp_format` must be a double quoted string, a `uformat` `,` or a variable. The default format is `yyyy` - `nn` - `dd` `hh` : `nn` : `ss` . `timestamp_uformat` is described in .
string string_from_timestamp ( `timestamp` , " `timestamp_format` " \| `format_variable` )	Returns a string from `timestamp` . The formatting specification is optional. The default format is `yyyy` - `mm` - `dd` `hh` : `mm` : `ss` .
ustring ustring_from_timestamp ( `timestamp` , " `timestamp_format` " \| `format_variable` )	Returns a ustring from `timestamp` . The formatting specification is optional. The default format is `yyyy` - `mm` - `dd` `hh` : `mm` : `ss` .
timestamp timestamp_from_time ( `time` , `time_format` \| `time_uformat` )	Returns a timestamp from `time` . For format descriptions, see and "time Uformat"
date date_from_timestamp( `timestamp` )	Returns the date from the given `timestamp` .
timestamp timestamp_from_timet( `int32` )	Returns a timestamp from the given UNIX time_t representation ( `int32` ).
int32 timet_from_timestamp( `timestamp` )	Returns the UNIX time_t representation of `timestamp`.

null handling functions

lists the transformation functions for NULL handling.

All data types support nulls. As part of processing a record, an operator can detect a null and take the appropriate action, for example, it can omit the null field from a calculation or signal an error condition.

InfoSphere DataStage represents nulls in two ways.

It allocates a single bit to mark a field as null. This type of representation is called an out-of-band null.
It designates a specific field value to indicate a null, for example a numeric field's most negative possible value. This type of representation is called an in-band null. In-band null representation can be disadvantageous because you must reserve a field value for nulls and this value cannot be treated as valid data elsewhere.

The null-handling functions can change a null representation from an out-of-band null to an in-band null and from an in-band null to an out-of-band null.

Function	Description
destination_field handle_null ( `source_field` , `value` )	Change the `source_field` NULL representations from out-of-band representation to an in-band representation. The `value` field assigns the value that corresponds to NULL.
destination_field make_null ( `source_field` , `value` )	Changes `source_field` NULL representation from in-band NULL representation to out-of-band. The `value` field allows multiple valid NULL values to be inputted as arguments.
int8 notnull( `source_field` )	Returns 1 if `source_field` is not NULL, otherwise returns 0.
int8 null( `source_field` )	Returns 1 if `source_field` is NULL, otherwise returns 0.
set_null()	This function is used with "=" to set the left side output field, when it is nullable, to null. For example: a-field = set_null();
int8 is_dfloat_inband_null ( `dfloat` )	Returns 1 if `dfloat` is an inband null; otherwise it returns 0.
int8 is_int16_inband_null ( `int16` )	Returns 1 if `int16` is an inband null; otherwise it returns 0.
int8 is_int32_inband_null ( `int32` )	Returns 1 if `int32` is an inband null; otherwise it returns 0.
int8 is_int64_inband_null ( `int64` )	Returns 1 if `int64` is an inband null; otherwise it returns 0.
int8 is_sfloat_inband_null ( `sfloat` )	Returns 1 if `sfloat` is an inband null; otherwise it returns 0.
int8 is_string_inband_null ( `string` )	Returns 1 if `string` is an inband null; otherwise it returns 0.
int8 u_is_string_inband_null ( `ustring` )	Returns 1 if `ustring` is an inband null; otherwise it returns 0.

Note: Null-handling functions cannot be used for subrecord fields.

Mathematical functions

Function	Description
int32 abs( `int32` )	Returns the absolute value of `int32` .
dfloat acos( `dfloat` )	Returns the principal value of the arc cosine of `dfloat` .
dfloat asin( `dfloat` )	Returns the principal value of the arc sine of `dfloat` .
dfloat atan( `dfloat` )	Returns the principal value of the arc tangent of dfloat.
dfloat atan2( `dfloat` , `dfloat` )	Returns the principal value of the arc tangent of y/x (where y is the first argument).
dfloat ceil( `decimal` )	Returns the smallest dfloat value greater than or equal to `decimal` .
dfloat cos( `dfloat` )	Returns the cosine of the given angle ( `dfloat` ) expressed in radians.
dfloat cosh( `dfloat` )	Returns the hyperbolic cosine of `dfloat` .
dfloat exp( `dfloat` )	Returns the exponential of `dfloat`.
dfloat fabs( `dfloat` )	Returns the absolute value of `dfloat` .
dfloat floor( `decimal` )	Returns the largest dfloat value less than or equal to `decimal` .
dfloat ldexp( `dfloat` , `int32` )	Reconstructs `dfloat` out of the mantissa and exponent of `int32` .
uint64 llabs( `int64` )	Returns the absolute value of `int64` .
dfloat log( `dfloat` )	Returns the natural (base e) logarithm of `dfloat` .
dfloat log10( `dfloat` )	Returns the logarithm to the base 10 of `dfloat` .
int32 max( `int32` , `int32` )	Returns the larger of the two integers.
int32 min( `int32` , `int32` )	Returns the smaller of the two integers.
dfloat pow( `dfloat` , `dfloat` )	Returns the result of raising x (the first argument) to the power y (the second argument).
uint32 rand()	Returns a pseudo-random integer between 0 and 2³² - 1. The function uses a multiplicative congruential random-number generator with period 2³². See the UNIX man page for rand for more details.
uint32 random()	Returns a random integer between 0 and 2³¹ - 1. The function uses a nonlinear additive feedback random-number generator employing a default state array size of 31 long integers to return successive pseudo-random numbers. The period of this random-number generator is approximately 16 x (2³¹ - 1). Compared with rand, random is slower but more random. See the UNIX man page for random for more details.
dfloat sin( `dfloat` )	Returns the sine of `dfloat` expressed in radians.
dfloat sinh( `dfloat` )	Returns the hyperbolic sine of `dfloat`.
dfloat sqrt( `dfloat` )	Returns the square root of `dfloat` .
int32 quotient_from_dfloat ( `dfloat1` , `dfloat2`)	Returns the value of the quotient after `dfloat1` is divided by `dfloat2` .
srand( `uint32` )	Sets a new seed ( `uint32` ) for the frand() or srand() random number generator.
srandom( `uint32` )	Sets a random seed for the random() number generator. See the UNIX man page for srandom for more details.
dfloat tan( `dfloat` )	Returns the tangent of the given angle ( `dfloat` ) expressed in radians.
dfloat tanh( `dfloat` )	Returns the hyperbolic tangent of `dfloat` .

String field functions

Strings can be assigned (=), compared (==, <, >=, and so on), and concatenated (+) in the Transformation Language. In addition, the functions described in below are available for string manipulations, and the functions described in are available for ustring manipulations. When a long string is assigned to a short string, the long string is truncated to the length of the short string. The term white space refers to spaces, tabs, and any other blank space.

String Conversions and Lookup Tables

You can construct a string lookup table to use when default conversions do not yield satisfactory results. A string lookup table is a table of two columns and as many rows as are required to perform a conversion to or from a string as shown in the following table.

Numeric Value	String or Ustring
numVal1	string1 \| ustring1
numVal2	string2 \| ustring1
...	...
numVal3	string`n` \| ustring`n`

Each row of the lookup table specifies an association between a 16-bit integer or unsigned 32-bit integer value and a string or ustring. InfoSphere DataStage scans the Numeric Value or the String or Ustring column until it encounters the value or string to be translated. The output is the corresponding entry in the row.

The numeric value to be converted might be of the int16 or the uint32 data type. InfoSphere DataStage converts strings to values of the int16 or uint32 data type using the same table.

If the input contains a numeric value or string that is not listed in the table, InfoSphere DataStage operates as follows:

If a numeric value is unknown, an empty string is returned by default. However, you can set a default string value to be returned by the string lookup table.
If a string has no corresponding value, 0 is returned by default. However, you can set a default numeric value to be returned by the string lookup table.

A table definition defines the rows of a string or ustring lookup table and has the following form:

{propertyList} ('string' | 'ustring' = value; 'string' | 'ustring'= value; ... )

where:

propertyList is one or more of the following options; the entire list is enclosed in braces and properties are separated by commas if there are more than one:

case_sensitive: perform a case-sensitive search for matching strings; the default is case-insensitive.
default_value = defVal: the default numeric value returned for a string that does not match any of the strings in the table.
default_string = defString: the default string returned for numeric values that do not match any numeric value in the table.
string or ustring specifies a comma-separated list of strings or ustrings associated with value; enclose each string or ustring in quotes.

value specifies a comma-separated list of 16-bit integer values associated with string or ustring.

Function	Description
int8 is_alnum( `string` )	Returns 1 true if `string` consists entirely of alphanumeric characters.
int8 is_alpha( `string` )	Returns 1 true if `string` consists entirely of alphabetic characters.
int8 is_numeric( `string` )	Returns 1 true if `string` consists entirely of numeric characters, including decimal and sign.
int8 is_valid (" `type_string` ", " `value_string` ")	Returns 1 (true) if `value_string` is valid according to `type_string` , including NULL. The `type_string` argument is required. It must specify an InfoSphere DataStage schema data type. Integer types are checked to ensure the `value_string` is numeric (signed or unsigned), a whole number, and a valid value (for example, 1024 can not be assigned to an int8 type). Decimal types are checked to ensure the `value_string` is numeric (signed or unsigned) and a valid value. Float types are checked to ensure the `value_string` is numeric (signed or unsigned) and a valid value (exponent is valid). String is always valid with the NULL exception below. For all types, if the field cannot be set to NULL and the string is NULL, 0 (false) is returned. Date, time, and timestamp types are checked to ensure they are correct, using the optional format argument, and valid values. Raw cannot be checked since the input is a string.
int16 lookup_int16_from_string ( `string` , " `table_definition` " \| `table_variable` )	Returns an integer corresponding to `string` using `table_definition` string or variable. See "String Conversions and Lookup Tables" for more information.
string lookup_string_from_int16 ( `int16` , " `table_definition` " \| `table_variable` )	Returns a string corresponding to `int16` using `table_definition` string or variable. See "String Conversions and Lookup Tables" for more information.
string lookup_string_from_uint32 ( `uint32` , " `table_definition` " \| `table_variable` )	Returns a string corresponding to `uint32` using `table_definition` string or variable. See "String Conversions and Lookup Tables" for more information.
uint32 lookup_uint32_from_ string ( `string` , " `table_definition` " \| `table_variable` )	Returns an unsigned integer from `string` using `table_definition` string or variable.
string lower_case( `string` )	Converts `string` to lowercase. Non-alphabetic characters are ignored in the transformation.
string string_from_date ( `date` , " `date_format` " \| `format_variable` \| `date_uformat`)	Converts `date` to a string representation using the specified optional formatting specification. By default, the date format is `yyyy` - `mm` - `dd` . For format descriptions, see "date Uformat" .
string string_from_decimal ( `decimal` , "fix_zero suppress_zero`" \|` `format_variable` )	Returns a string from `decimal` . fix_zero causes a decimal field containing all zeros to be treated as a valid zero. suppress_zero argument specifies that the returned ustring value will have no leading or trailing zeros. Examples: 000.100 -> 0.1; 001.000 -> 1; -001.100 -> -1.1 The formatting specification is optional.
string string_from_time ( `time` ," `time_format` " \| `format_variable` \| `time_uformat`)	Returns a string from `time` . The format argument is optional.The default time format is `hh` : `nn` :`ss`. For format descriptions, see and "time Uformat" .
string string_from_timestamp ( `timestamp` , " `timestamp_format` " \| `format_variable` )	Returns a string from `timestamp` . The formatting specification is optional. The default format is `yyyy`-`mm`-`dd` `hh`:`mm`:`ss`.
string soundex ( `input_string` , `length` , `censusOption`)	Returns a string which represents the phonetic code for the string input word. Input words that produce the same code are considered phonetically equivalent. The empty string is returned if the input string is empty. `length` is an int8 and can be any value between 4 and 10. The default is 4. `censusOption` can be 0, 1, or 2 where 0 (the default) is enhanced soundex and not a census code; 1 are the normal census codes used in all censuses from 1920 on; and 2 are special census codes used intermittently in 1880, 1900, and 1910.
string upper_case( `string` )	Converts `string` to uppercase. Non-alphabetic characters are ignored in the transformation.
string compact_whitespace ( `string` )	Returns a `string` after reducing all consecutive white space in `string` to a single space.
string pad_string ( `string` , `pad_string` , `pad_length` )	Returns the string with the `pad_string` appended to the bounded length string for `pad_length` number of characters. `pad_length` is an int16. When the given `string` is a variable-length string, it defaults to a bounded-length of 1024 characters. If the given `string` is a fixed-length string, this function has no effect.
string strip_whitespace( `string` )	Returns `string` after stripping all white space in the string.
string trim_leading_trailing ( `string` )	Returns `string` after removing all leading and trailing white space.
string trim_leading( `string` )	Returns a `string` after removing all leading white space.
string trim_trailing( `string` )	Returns a `string` after removing all trailing white space.
int32 string_order_compare ( `string1` , `string2`, `justification`)	Returns a numeric value specifying the result of the comparison. The numeric values are: -1: string1 is less than string2 0: string1 is equal to string2 1: string1 is greater than string2 The string `justification` argument is either 'L' or 'R'. It defaults to 'L`'` if not specified. 'L' means a standard character comparison, left to right. 'R' means that any numeric substrings within the strings starting at the same position are compared as numbers. For example an 'R' comparison of "AB100" and "AB99" indicates that AB100 is great than AB99, since 100 is greater than 99. The comparisons are case sensitive.
string replace_substring ( `expression1` , `expression2`, `string`)	Returns a string value that contains the given `string` , with any characters in expression1 replaced by their corresponding characters in expression2. For example: replace_substring ("ABC:, "abZ", "AGDCBDA") returns "aGDZbDa", where any "A" gets replaced by "a", any "B" gets replaced by "b" and any "C" gets replaced by "Z". If `expression2` is longer than `expression1`, the extra characters are ignored. If `expression1` is longer than `expression2`, the extra characters in `expression1` are deleted from the given string (the corresponding characters are removed.) For example: replace_substring("ABC", "ab", "AGDCBDA") returns "aGDbDa".
int32 count_substring ( `string` , `substring` )	Returns the number of times that `substring` occurs in `string` . If `substring` is an empty string, the number of characters in `string` is returned.
int32 dcount_substring ( `string` , `delimiter`)	Returns the number of fields in `string` delimited by `delimiter` , where `delimiter` is a string. For example, dcount_substring("abcFdefFghi", "F") returns 3. If `delimiter` is an empty string, the number of characters in the string + 1 is returned. If delimiter is not empty, but does not exist in the given string, 1 is returned.
string double_quote_string ( `expression` )	Returns the given string `expression` enclosed in double quotes.
string substring_by_delimiter (string, `delimiter`, `occurrence`,`numsubstr`)	The `string` and `delimiter` arguments are string values, and the `occurrence` and `numsubstr` arguments are int32 values. This function returns `numsubstr` substrings from `string` , delimited by `delimiter` and starting at substring number `occurrence` . An example is: substring_by_delimiter ("abcFdefFghiFjkl", "F", 2, 2) The string "defFghi" is returned. If `occurrence` is < 1, then 1 is assumed. If `occurrence` does not point to an existing field, the empty string is returned. If `numsubstr` is not specified or is less than 1, it defaults to 1.
int32 index_of_substring ( `string` , `substring`, `occurrence`)	Returns the starting position of the nth occurrence of `substring` in `string` . The `occurrence` argument is an integer indicating the nth occurrence. If there is no nth occurrence or string doesn't contain any `substring` , -1 is returned. If `substring` is an empty string, -2 is returned.
string left_substring ( `string` , `length)`	Returns the first `length` characters of `string` . If `length` is 0, it returns the empty string. If `length` is greater than the length of the string, the entire `string` is returned.
string right_substring ( `string` , `length`)	Returns the last `length` characters of `string`. If `length` is 0, it returns the empty string. If `length` is greater than the length of `string` , the entire `string` is returned.
string string_of_space( `count`)	Returns a string containing `count` spaces. The empty string is returned for a `count` of 0 or less.
string single_quote_string ( `expression`)	Returns the `expression` string enclosed in single quotes.
string string_of_substring ( `string` , `count`)	Returns a `string` containing `count` occurrences of `string`. The empty string is returned for a count of 0 or less.
string trimc_string ( `string` [, `character` [, `option` ]])	If only `string` is specified, all leading and trailing spaces and tabs are removed, and all multiple occurrences of spaces and tabs are reduced to a single space or tab. If `string` and `character` are specified, option defaults to 'R' The available option values are: 'A' remove all occurrences of character 'B' remove both leading and trailing occurrences of character. 'D' remove leading, trailing, and redundant white-space characters. 'E' remove trailing white-space characters 'F' remove leading white-space characters 'L' remove all leading occurrences of character 'R' remove all leading, trailing, and redundant occurrences of character 'T' remove all trailing occurrences of character
string system_time_date()	Returns the current system time in this 24-hour format: `hh` : `mm` :`ss` `dd` : `mmm` : `yyyy`
int32 offset_of_substring ( `string` , `substring` , `position`)	Searches for the `substring` in the `string` beginning at character number `position` , where `position` is an uint32. Returns the starting position of the substring.
int8 string_case_compare ( `string` , `string`)	This is a case-insensitive version of string_compare() below.
int8 string_compare ( `string` , `string`)	Compares two strings and returns the index (0 or 1) of the greater string.
int8 string_num_case_compare ( `string` , `string`, `uint16`)	This is a case-insensitive version of string_num_compare() below.
string string_num_concatenate ( `string` , `string`, `uint16`)	Returns a string after appending `uint16` characters from the second string onto the first string.
int8 string_num_compare ( `string` , `string`, `uint16`)	Compares first `uint16` characters of two given strings and returns the index (0 or 1) of the greater string.
string string_num_copy ( `string` , `uint16`)	Returns the first `uint16` characters from the given string .
int32 string_length( `string` )	Returns the length of the string.
string substring ( `string` , `starting_position`, `length`)	Copies parts of strings to shorter strings by string extraction. The `starting_position` specifies the starting location of the substring; `length` specifies the substring length. The arguments starting_position and length are uint16 types and must be positive (>= 0).
string char_from_num( `int32` )	Returns an ASCII character from the given `int32` . If given a value that is not associated with a character such as -1, the function returns a space. An example use is: char_from_num(38) which returns "&"
int32 num_from_char( `string`)	Returns the numeric value of the ASCII-character in the string. When this function is given an empty string, it returns 0; and when it is given a multi-character string, it uses the first character in the string. An example use is: num_from_char("&") which returns 38.

Ustring field functions

InfoSphere DataStage provides the ustring type for multi-byte Unicode-character strings.ustrings can be assigned (=), compared (==, <, >=, and so on), and concatenated (+) in the Transformation Language. In addition, the functions described in are available for ustring manipulations. When a long string is assigned to a short string, the long string is truncated to the length of the short string. The term white space refers to spaces, tabs, and any other blank space.

Function	Description
ustring ustring_from_date ( `date` , " `date_format` " \| `date_format` \| `format_variable`)	Converts `date` to a ustring representation using the optional format specification. By default, the format is `yyyy` -`mm`-`dd`. For format descriptions, see and "date Uformat" .
ustring ustring_from_decimal ( `decimal` , "fix_zero suppress_zero" \| `format_variable` )	Returns a ustring from decimal. fix_zero causes a decimal field containing all zeros to be treated as a valid zero. suppress_zero argument specifies that the returned ustring value will have no leading or trailing zeros. Examples: 000.100 -> 0.1; 001.000 -> 1; -001.100 -> -1.1 The format specification is optional.
ustring ustring_from_time ( `time` , " `time_format` " \| `time_uformat` `\|` `format_variable`)	Returns a ustring from `time` using an optional format specification.The default time format is `hh`:`nn`:`ss`. For format descriptions, see and "time Uformat" .
ustring ustring_from_timestamp ( `timestamp` , " `timestamp_format` " \| `format_variable` )	Returns a ustring from `timestamp` . The format specification is optional. The default format is `yyyy`-`mm`-`dd` `hh`:`mm`:`ss`.
int8 u_is_alnum( `ustring` )	Returns 1 (true) if `ustring` consists entirely of alphanumeric characters.
int8 u_is_alpha( `ustring` )	Returns 1 (true) if `ustring` consists entirely of alphabetic characters.
int8 u_is_numeric( `ustring` )	Returns 1 (true) if `ustring` consists entirely of numeric characters, including decimal and sign.
int16 lookup_int16_from_ustring ( `ustring` , " `table_definition` " \| `table_variable` )	Returns an integer corresponding to `ustring` using `table_definition` string or variable. See "String Conversions and Lookup Tables" for more information.
ustring lookup_ustring_from_int16 ( `int16` , " `table_definition` " \| `table_variable` )	Returns a ustring corresponding to `int16` using `table_definition` string or `table_variable` . See "String Conversions and Lookup Tables" for more information.
ustring lookup_ustring_from_uint32 ( `uint32` , " `table_definition` " \| `table_variable` )	Returns a ustring corresponding to `uint32` using `table_definition` string or variable. See"String Conversions and Lookup Tables" for more information.
uint32 lookup_uint32_from_ustring ( `string` , " `table_definition` " \| `table_variable` )	Returns an unsigned integer from ustring using `table_definition` or `table_variable` .
int8 u_is_valid (" `type_ustring` ", " `value_ustring` ")	Returns 1 (true) if `value_ustring` is valid according to `type_ustring` , including NULL. The `type_ustring` argument is required. It must specify an InfoSphere DataStage schema data type. Integer types are checked to ensure the `value_ustring` is numeric (signed or unsigned), a whole number, and a valid value (for example, 1024 can not be assigned to an int8 type). Decimal types are checked to ensure the `value_ustring` is numeric (signed or unsigned) and a valid value. Float types are checked to ensure the `value_ustring` is numeric (signed or unsigned) and a valid value (exponent is valid). String is always valid with the NULL exception below. For all types, if the field cannot be set to NULL and the string is NULL, 0 (false) is returned. Date, time, and timestamp types are checked to ensure they are correct, using the optional format argument, and valid values. Raw cannot be checked since the input is a string.
ustring u_lower_case( `ustring` )	Converts `ustring` to lowercase. Non-alphabetic characters are ignored in the transformation.
ustring u_upper_case( `ustring` )	Converts `ustring` to uppercase. Non-alphabetic characters are ignored in the transformation.
ustring u_compact_whitespace ( `ustring` )	Returns the `ustring` after reducing all consecutive white space in `ustring` to a single space.
ustring u_pad_string ( `ustring` , `pad_ustring` , `pad_length` )	Returns the ustring with `pad_ustring` appended to the bounded length string for `pad_length` number of characters. `pad_length` is an int16. When the given `ustring` is a variable-length string, it defaults to a bounded-length of 1024 characters. If the given `ustring` is a fixed-length string, this function has no effect.
ustring u_strip_whitespace ( `ustring` )	Returns `ustring` after stripping all white space in the string.
ustring u_trim_leading_trailing ( `ustring` )	Returns `ustring` after removing all leading and trailing white space.
ustring u_trim_leading( `ustring` )	Returns `ustring` after removing all leading white space.
ustring u_trim_trailing ( `ustring` )	Returns a `ustring` after removing all trailing white space.
int32 u_string_order_compare ( `ustring1` , `ustring2`, `justification`)	Returns a numeric value specifying the result of the comparison. The numeric values are: -1: `ustring1` is less than `ustring2` 0: `ustring1` is equal to `ustring2` 1: `ustring1` is greater than `ustring2` The string justification argument is either 'L' or 'R'. It defaults to 'L' if not specified. 'L' means a standard character comparison, left to right. 'R' means that any numeric substrings within the strings starting at the same position are compared as numbers. For example an 'R' comparison of "AB100" and "AB99" indicates that AB100 is great than AB99, since 100 is greater than 99. The comparisons are case sensitive.
ustring u_replace_substring ( `expression1` , `expression2`, `ustring`)	Returns a ustring value that contains the given `ustring` , with any characters in `expression1` replaced by their corresponding characters in `expression2` . For example: u_replace_substring ("ABC", "abZ", "AGDCBDA") returns "aGDZbDa", where any "A" gets replaced by "a", any "B" gets replaced by "b" and any "C" gets replaced by "Z". If `expression2` is longer than `expression1` , the extra characters are ignored. If `expression1` is longer than `expression2`, the extra characters in `expression1` are deleted from the given string (the corresponding characters are removed.) For example: u_replace_substring("ABC", "ab", "AGDCBDA") returns "aGDbDa".
int32 u_count_substring ( `ustring` , `sub_ustring` )	Returns the number of times that `sub_ustring` occurs in `ustring` . If `sub_ustring` is an empty string, the number of characters in `ustring` is returned.
int32 u_dcount_substring ( `ustring` , `delimiter`)	Returns the number of fields in `ustring` delimited by `delimiter` , where `delimiter` is a string. For example, dcount_substring("abcFdefFghi", "F") returns 3. If `delimiter` is an empty string, the number of characters in the string + 1 is returned. If delimiter is not empty, but does not exist in the given string, 1 is returned.
ustring u_double_quote_string ( `expression`)	Returns the given ustring `expression` enclosed in double quotes.
ustring u_substring_by_delimiter ( `ustring, delimiter, occurrence, numsubstr`)	The `delimiter` argument is a ustring value, and the `occurrence` and `numsubstr` arguments are int32 values. This function returns `numsubstr` substrings from `ustring` , delimited by `delimiter` and starting at substring number `occurrence` . An example is: u_substring_by_delimiter ("abcFdefFghiFjkl", "F", 2, 2) The string "defFghi" is returned. If occurrence is < 1, then 1 is assumed. If `occurrence` does not point to an existing field, the empty string is returned. If `numsubstr` is not specified or is less than 1, it defaults to 1.
int32 u_index_of_substring ( `ustring, sub_ustring, occurrence` )	Returns the starting position of the nth occurrence of `sub_ustring` in `ustring`. The `occurrence` argument is an integer indicating the nth `occurrence` . If there is no nth occurrence, 0 is returned; if `sub_ustring` is an empty string, -2 is returned; and if `ustring` doesn't contain any `sub_ustring` , -1 is returned.
ustring u_left_substring ( `ustring` , `length`)	Returns the first `length` characters of `ustring`. If length is 0, it returns the empty string. If `length` is greater than the length of the `ustring` , the entire `ustring` is returned.
ustring u_right_substring ( `ustring` , `length`)	Returns the last `length` characters of `ustring` . If length is 0, it returns the empty string. If `length` is greater than the length of `ustring` , the entire `ustring` is returned.
ustring u_string_of_space( `count`)	Returns a ustring containing `count` spaces. The empty string is returned for a `count` of 0 or less.
ustring u_single_quote_string ( `expression`)	Returns `expression` enclosed in single quotes.
ustring u_string_of_substring ( `ustring` , `count`)	Returns a ustring containing `count` occurrences of `ustring`. The empty string is returned for a count of 0 or less.
ustring u_trimc_string ( `ustring` [, `character` [, `option` ]])	If only `ustring` is specified, all leading and trailing spaces and tabs are removed, and all multiple occurrences of spaces and tabs are reduced to a single space or tab. If ustring and character are specified, option defaults to 'R' The available option values are: 'A' remove all occurrences of character 'B' remove both leading and trailing occurrences of character. 'D' remove leading, trailing, and redundant white-space characters. 'E' remove trailing white-space characters 'F' remove leading white-space characters 'L' remove all leading occurrences of character 'R' remove all leading, trailing, and redundant occurrences of character 'T' remove all trailing occurrences of character
ustring u_system_time_date()	Returns the current system time in this 24-hour format: `hh`:`mm`:`ss` `dd`:`mmm`:`yyyy`
int32 u_offset_of_substring ( `ustring` , `sub_ustring` , `position`)	Searches for the `sub_ustring` in the `ustring` beginning at character number `position`, where position is an uint32. Returns the starting position of the substring.
int8 u_string_case_compare ( `ustring` , `ustring`)	This is a case-insensitive version of u_string_compare() below.
int8 u_string_compare ( `ustring` , `ustring`)	Compares two `ustrings` and returns the index (0 or 1) of the greater string.
int8 u_string_num_case_compare ( `ustring` , `ustring`, `uint16`)	This is a case-insensitive version of u_string_num_compare() below.
ustring u_string_num_concatenate ( `ustring` , `ustring`, `uint16`)	Returns a ustring after appending `uint16` characters from the second `ustring` onto the first `ustring` .
int8 u_string_num_compare ( `utring` , `ustring`, `uint16`)	Compares first `uint16` characters of two given ustrings and returns the index (0 or 1) of the greater `ustring` .
ustring u_string_num_copy ( `ustring` , `uint16`)	Returns the first `uint16` characters from the given `ustring` .
int32 u_string_length( `ustring` )	Returns the length of the `ustring` .
ustring u_substring ( `ustring` , `starting_position`, `length`)	Copies parts of ustrings to shorter strings by string extraction. The `starting_position` specifies the starting location of the substring; `length` specifies the substring length. The arguments `starting_position` and `length` are uint16 types and must be positive (>= 0).
ustring u_char_from_num( `int32` )	Returns a ustring character value from the given `int32` . If given a value that is not associated with a character such as -1, the function returns a space. An example use is: u_char_from_num(38) which returns "&"
int32 u_num_from_char( `ustring`)	Returns the numeric value of the character in the `ustring` . When this function is given an empty string, it returns 0; and when it is given a multi-character string, it uses the first character in the string. An example use is: u_num_from_char("&") which returns 38

Bit manipulation functions

Function	Description
string bit_expand ( `uint64`)	Expands the given `uint64` to a string containing the binary representation.
ustring u_bit_expand ( `uint64`)	Expands the given `uint64` to a ustring containing the binary representation.
uint64 bit_compress( `string` )	Converts the string binary representation to an `uint64` field.
uint64 u_bit_compress( `ustring` )	Converts the ustring binary representation to an `uint64` field.
uint64 set_bit ( `uint64` , `list_of_bits` , `bit_state`)	Turns the `uint64` bits that are listed by number in the string `list_of_bits` on or off, depending on whether the value of the `bit_state` integer is 1 or 0. `bit_state` is an optional argument, and has a default value of 1 which turns the list of bits on. An example use is: set_bit(0, "1,3,5,7") which returns 85.
uint64 u_set_bit ( `uint64` , `list_of_bits` , `bit_state`)	This function is a internationalized version of set_bit() above.

Job monitoring functions

The Job Monitor reports on the current state of a job and supplies information about the operators in your data flow. By default, it continually gathers information about your jobs, but it does not send the information unless you request it from your user-written client.

The information it supplies falls into four categories: job status, metadata, monitor, and summary, and it is given in XML notation. If you do not have job monitoring disabled with the top-level -nomonitor osh option, you can also obtain custom report and summary information about the transform operator using the functions in the table below.

There are six functions: three for the string type and three for the ustring type. The name_string or name_ustring argument can be used to specify a name for the custom information, the description_string or description_ustring argument can be used to describe the type of information, and the value_string or value_ustring argument can be used to give the details of the information.

Function	Description
string set_custom_summary_info ( `name_string` , `description_string`, `value_string`) ustring u_set_custom_summary_info ( `name_ustring` ,`description_ustring`, `value_ustring`)	Call this function in the finish code segment.
string send_custom_report ( `name_string` , `description_string`, `value_string`) ustring u_send_custom_report ( `name_ustring` , `description_ustring` , `value_ustring`)	Call this function in the initialize code segment.
string set_custom_instance_report ( `name_string` , `description_string`, `value_string`) ustring u_set_custom_instance_report ( `name_ustring` , `description_ustring`, `value_ustring`)	Call this function in the mainloop code segment.

Miscellaneous functions

The following table describes functions in the Transformation Language that do not fit into any of the above categories.

Function	Description
void force_error ( `error_message_string` )	Terminates the data flow when an error is detected, and prints `error_message_string` to stderr.
u_force_error ( `error_message_ustring` )	Terminates the data flow when an error is detected, and prints `error_message_ustring` to stderr.
string get_environment( `string` )	Returns the current value of `string` , a UNIX environment variable. The functionality is the same as the C getenv function.
ustring u_get_environment ( `ustring` )	Returns the current value of `utring` , a UNIX environment variable. The functionality is the same as the C getenv function.
int16 get_partition_num()	Returns the current partition number.
int16 get_num_of_partitions()	Returns the number of partitions.
void print_message ( `message_string` )	Prints `message_string` to stdout.
u_print_message ( `message_ustring` )	Prints `message_ustring` to stdout.
uint32 size_of( `value` )	Returns the actual size `value` when it is stored, not the length.

binary operators with decimal and numeric fields

The Transformation Language supports the placement of binary operators between two decimals and between a decimal and one of the numeric field types.

The binary operators are +, -, *, and /.

The numeric field types are int8, uint8, int16, uint16, int32, uint32, int64, uint64, sfloat, and dfloat.

Generally there are no restrictions on the form of an operand. It can be a constant, a variable, a function call that returns a numeric value, a simple expression such as an addition of two variables, or a complicated expression that consists of function calls as well as arithmetic operations.

By default, the transform operator sets the precision and scale of any temporary internal decimal variable created during arithmetic operations to:

[TRX_DEFAULT_MAX_PRECISION=38,
TRX_DEFAULT_MAX_SCALE=10]
with RoundMode equal to eRoundInf.

You can override the default values using these environment variables:

APT_DECIMAL_INTERM_PRECISION value
APT_DECIMAL_INTERM_SCALE value
APT_DECIMAL_INTERM_ROUNDMODE ceil | floor | round_inf | trunc_zero

Fatal errors might occur at runtime if the precision of the destination decimal is smaller than that of the source decimal

This topic is also in the IBM InfoSphere DataStage and QualityStage Parallel Job Advanced Developer's Guide.

Update timestamp

Last updated: 2012-10-8