Parallel Engine Transformer NUM() and IsValid() functions have different behavior on Windows and UNIX for certain numeric strings containing d or D.

Troubleshooting

Problem

Symptom

A customer recently discovered a difference in the behavior of the transformer NUM() and IsValid() functions on Windows versus UNIX. On Windows these functions can treat some strings containing the characters d or D as decimal floating point numbers. Here is an explanation using the NUM() function:

BEHAVIOR

The customer’s job was using the NUM() function to test that sequential file input data was numeric. Here is the description of the NUM() function in the Knowledge Center:

https://www-01.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/r_deeref_String_Functions.html

Num
Returns 1 if the string can be converted to a number, or returns 0 otherwise.

· Input: string (string)

· Output: result (int32)

· Examples. If mylink.mystring1 contains the string "22", then the following function returns the value 1.

Num(mylink.mystring1)

If mylink.mystring1 contains the string "twenty two", then the following function returns the value 0.
Num(mylink.mystring1)

The customer found this surprising result when running a job on Windows:

NUM('7100D027') returns 1.

However, on UNIX, NUM('7100D027') returns 0.

EXPLANATION

The PX implementation of the NUM function uses the C library function strtod().
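A strtod()-based numeric test generally parses the longest valid numeric prefix and then checks whether the whole string was consumed. The following Python sketch illustrates that pattern by calling the platform C library through ctypes. The helper name num_like is hypothetical, loading the library with ctypes.CDLL(None) assumes a UNIX-like system, and this is not the parallel engine's actual code.

```python
import ctypes

# Assumption: a UNIX-like system, where CDLL(None) exposes the C library
# that is already loaded into the Python process.
_libc = ctypes.CDLL(None)
_libc.strtod.restype = ctypes.c_double
_libc.strtod.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p)]


def num_like(text: str) -> int:
    """Return 1 if strtod() consumes the whole string, otherwise 0.

    Hypothetical helper that mimics the documented Num() result; it is
    not the parallel engine's implementation.
    """
    raw = text.encode("ascii", errors="replace")
    if not raw:
        return 0
    end = ctypes.c_char_p()
    _libc.strtod(raw, ctypes.byref(end))
    # end.value holds the unparsed remainder; empty means fully consumed.
    return 1 if end.value == b"" else 0
```

With this sketch, num_like("22") returns 1 and num_like("twenty two") returns 0, matching the documented examples. The result for a string such as "7100D027" depends on the strtod() of the platform where the code runs.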

The strtod() function defines a valid number differently on Windows than on UNIX. Here is a link to documentation of the UNIX strtod():

http://www.cplusplus.com/reference/cstdlib/strtod/

and here is a link to the Windows strtod() :

https://msdn.microsoft.com/en-us/library/kxsfc1ab%28v=vs.100%29.aspx

On UNIX, a decimal floating point number is defined as follows:

· decimal floating point expression. It consists of the following parts:

· (optional) plus or minus sign

· nonempty sequence of decimal digits optionally containing decimal point character (as determined by the current C locale) (defines significand)

· (optional) e or E followed with optional minus or plus sign and nonempty sequence of decimal digits (defines exponent)

On Windows, a decimal floating point number can have d or D in the middle of the string, in addition to e or E, to introduce the exponent.

So, on Windows, “7100D027” is a legal floating point number (significand 7100, exponent 027), and NUM returns 1.

On UNIX, “7100D027” is a string of digits with a D in the middle, and NUM returns 0.
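The two definitions can be compared side by side with a pair of regular expressions. This is only an approximation of the decimal forms described above: the pattern names and the matches helper are hypothetical, the decimal point is assumed to be ".", and leading whitespace and the other forms that strtod() accepts are ignored.

```python
import re

# Decimal form described for UNIX strtod(): optional sign, digits with an
# optional decimal point, then an optional e/E exponent.
UNIX_DECIMAL = re.compile(
    r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)

# Same form, except that the exponent marker may also be d or D, as
# described for Windows.
WINDOWS_DECIMAL = re.compile(
    r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[dDeE][+-]?[0-9]+)?"
)


def matches(pattern: re.Pattern, text: str) -> int:
    """Return 1 if the whole string fits the pattern, otherwise 0."""
    return 1 if pattern.fullmatch(text) else 0


for sample in ("22", "7100E027", "7100D027", "twenty two"):
    print(sample, matches(UNIX_DECIMAL, sample), matches(WINDOWS_DECIMAL, sample))
```

Only "7100D027" gives different results under the two patterns: 0 for the UNIX form and 1 for the Windows form.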

This difference in behavior has been in the parallel engine since the Windows version was released in 2004, and no customer reported it as an issue until 2015.

[{"Product":{"code":"SSVSEF","label":"IBM InfoSphere DataStage"},"Business Unit":{"code":"BU048","label":"IBM Software"},"Component":"Not Applicable","Platform":[{"code":"PF033","label":"Windows"}],"Version":"9.1;8.5;8.1;8.0;7.5.3;11.5;11.3","Edition":"All Editions","Line of Business":{"code":"LOB76","label":"Data Platform"}}]
