Troubleshoot IBM DataStage

Here are the answers to common troubleshooting questions about using IBM® DataStage®.

Getting help and support for DataStage

If you have problems or questions when using DataStage, you can get help by searching for information or by asking questions through a forum. You can also open a support ticket.

When using the forums to ask a question, tag your question so that it is seen by the DataStage development teams.

If you have technical questions about DataStage, post your question on Stack Overflow and tag your question with “ibm-bluemix” and “datastage”.

For questions about the service and getting started instructions, use the IBM developerWorks dW Answers forum. Include the “datastage” and “bluemix” tags. See Getting help for more details about using the forums.

Contents
  • Issues browsing database tables with columns that contain special characters
  • Incorrect inferences assigned to a schema read by the Asset Browser
  • Using sequential files as a source
  • Error running jobs with a parquet file format

Issues browsing database tables with columns that contain special characters

You might have issues using the Asset Browser to browse database tables if the selected table contains a column with special characters such as ., $, or #, and you add that table into a DataStage flow. DataStage does not support column names that contain special characters. DataStage flows that reference columns with names that include these special characters will not work.

To work around this problem, create a view on top of the database table and redefine the column name in the view. For example:

create view view1 as select column1$ as column1, column2# as column2 ... from table

Then when using the Asset Browser, find the view and add it to the DataStage flow.

Incorrect inferences assigned to a schema read by the Asset Browser

The Asset Browser reads the first 1000 records of the files in IBM Cloud Object Storage, Amazon S3, Google Cloud Storage, Azure File Storage, Azure Blob Storage, or the Azure Data Lake service, and infers the schema (column name, length, data type, nullability, and so on) from those records alone. For instance, the Asset Browser might identify a column as an integer based on what is detected in the first 1000 records, but later records in the file might show that this column should be treated as the varchar data type. Similarly, the Asset Browser might infer a column as varchar(20) even though later records show that the column should be varchar(100).

To resolve this issue:
  • Profile the source data to generate better metadata.
  • Change all columns to be varchar(1024) and gradually narrow down the data type.
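The sampling drift described above can be reproduced with a short sketch. The helper and data below are hypothetical illustrations, not part of DataStage; they show how inferring a varchar length from only the first 1000 records under-estimates the real maximum:

```python
def infer_max_varchar(rows, column, sample_size=None):
    """Infer a varchar length for `column` from at most `sample_size` rows.

    Mirrors a sample-based inference (like the Asset Browser's first
    1000 records): values beyond the sample are never examined.
    """
    # rows[:None] slices the whole list, so omitting sample_size scans everything.
    return max(len(row[column]) for row in rows[:sample_size])

# Hypothetical data: the long value appears only after the sampled records.
data = [{"name": "a" * 10} for _ in range(1000)] + [{"name": "b" * 80}]

sampled = infer_max_varchar(data, "name", sample_size=1000)  # sees only short values
actual = infer_max_varchar(data, "name")                     # scans every record
```

Here `sampled` comes out as 10 while `actual` is 80, which is why profiling the full source data, or starting wide with varchar(1024) and narrowing down, gives safer metadata.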

Using sequential files as a source

To use sequential files as a source, you must load files into a project bucket in a specific location. To determine the project bucket location:
  1. Find the project COS instance.
  2. In the project instance, find the bucket corresponding to the current project. The location is usually: <lowercase-project-name>-donotdelete-<random-string>

    For example: project2021mar01-donotdelete-pr-ifpkjcbk71s36j

    Then, upload the files by specifying DataStage/files/ in the Prefix for object field.
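The steps above can be sketched in Python. The bucket name and file path below are hypothetical, and the actual upload call (using the IBM COS SDK, ibm_boto3) is shown as a comment so that the key construction stands alone:

```python
import posixpath

# Sequential source files must land under this prefix in the project bucket.
PREFIX = "DataStage/files/"

def object_key(filename):
    """Build the object key DataStage expects for a sequential source file."""
    return posixpath.join(PREFIX, posixpath.basename(filename))

key = object_key("/local/data/customers.csv")

# Hypothetical upload with the IBM COS SDK (credentials and endpoint omitted):
# cos = ibm_boto3.client("s3", ...)
# cos.upload_file("/local/data/customers.csv",
#                 "project2021mar01-donotdelete-pr-ifpkjcbk71s36j", key)
```

With these names, `key` resolves to DataStage/files/customers.csv, matching the Prefix for object field described above.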

Error running jobs with a parquet file format

You might receive the following error when trying to run a job with a parquet file format:
Error: CDICO9999E: Internal error occurred: Illegal state error: INTEGER(32,false) can only annotate INT32.
The unsigned 32-bit integer (uint32) and unsigned 64-bit integer (uint64) data types are not supported in the Parquet format that DataStage uses for all the file connectors. You must use supported data types.
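One way to avoid the error is to cast unsigned integer columns to a supported signed type before writing the file. The sketch below uses pandas with a hypothetical DataFrame; note that uint64 values larger than the int64 range would overflow, so validate the data range first:

```python
import numpy as np
import pandas as pd

def cast_unsigned(df):
    """Cast uint32/uint64 columns to int64, which Parquet connectors accept."""
    out = df.copy()
    for col in out.select_dtypes(include=["uint32", "uint64"]).columns:
        # uint64 values above the int64 maximum would overflow here.
        out[col] = out[col].astype("int64")
    return out

# Hypothetical data with an unsigned column that would trigger CDICO9999E.
df = pd.DataFrame({"id": np.array([1, 2, 3], dtype="uint32"),
                   "value": [1.5, 2.5, 3.5]})
safe = cast_unsigned(df)
# safe.to_parquet("data.parquet")  # write with signed types only
```

After the cast, the `id` column is int64 while the original DataFrame is untouched.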