Troubleshoot IBM DataStage

Use these solutions to help resolve problems that you might encounter with IBM® DataStage®.

Getting help and support for DataStage

If you have problems or questions when you use DataStage, you can get help by searching for information or by asking questions through a forum. You can also open a support ticket.

When you ask a question on the forums, tag your question so that it is seen by the DataStage development teams.

For questions about the service and getting started instructions, use the IBM developerWorks dW Answers or Stack Overflow forums. Include the “datastage” and “bluemix” tags. See Getting help for more details about using the forums.

DataStage sessions lock after you save changes

Note: This issue applies only to DataStage Version 4.0.1.
If you save changes to a DataStage job 30 seconds or more after you open it, the DataStage session locks, and a message similar to the following example is shown:
User "admin" locked job "jobzero" on 3/1/22 7:30 PM

If you want to continue working on the job, you must close the editor and reopen the job.

To avoid this issue, you can take one of the following steps:
  • Migrate your version of DataStage to version 4.0.2 or later.
  • Ensure that IBM Cloud Pak® for Data is updated to version 4.0.7, then apply the following workaround.

Workaround:

Patch the configmap ds-route by completing the following steps:
  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
    oc login OpenShift_URL:port
  2. Save the current configmap ds-route:
    oc -n <cpd-namespace> get cm ds-route -o yaml > ds-route.yaml
  3. Edit the ds-route.yaml file to add proxy_set_header Referer ""; and proxy_hide_header WWW-Authenticate; to the location section for /ibm/iis/api/dscdesignerapi:
    location /ibm/iis/api/dscdesignerapi {
        proxy_set_header Host $http_host;
        proxy_pass https://is-servicesdocker:9446/ibm/iis/api/dscdesignerapi;
        proxy_read_timeout 30m;
        proxy_set_header Referer "";
        proxy_hide_header WWW-Authenticate;
    }
  4. Delete and re-create the configmap ds-route:
    oc -n <cpd-namespace> delete cm ds-route
    oc -n <cpd-namespace> apply -f ds-route.yaml
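  5. Optional: Verify that the re-created configmap contains the new directives. A minimal check, assuming the location block was added as shown in step 3:
    oc -n <cpd-namespace> get cm ds-route -o yaml | grep -A 6 "location /ibm/iis/api/dscdesignerapi"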

Issues browsing database tables with columns that contain special characters

You might have issues when you use the Asset Browser to browse a database table if the table contains a column whose name includes special characters such as ., $, or #, and you then add that table to a DataStage flow. DataStage does not support column names that contain these special characters, and DataStage flows that reference such columns do not work.

To work around this problem, create a view over the database table and rename the affected columns in the view. For example:

create view view1 as select column1$ as column1, column2# as column2 ... from table
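
As a concrete sketch, with a Db2 source you could create the view from the Db2 command line processor. The database, table, and column names here are hypothetical; the delimited (quoted) identifiers preserve the special characters of the original column names:

db2 connect to SALESDB
db2 'CREATE VIEW sales_clean AS SELECT "REVENUE$" AS revenue, "REGION#" AS region FROM sales'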

Then, when you use the Asset Browser, find the view and add it to the DataStage flow.

Incorrect inferences assigned to a schema read by the Asset Browser

The Asset Browser reads the first 1000 records of the files in IBM Cloud Object Storage, Amazon S3, Google Cloud Storage, Azure File Storage, Azure Blob Storage, or the Azure Data Lake service, and infers the schema (column name, length, data type, and nullability) from those records only. For instance, the Asset Browser might identify a column as an integer based on the first 1000 records, but later records in the file might show that the column ought to be treated as the varchar data type. Similarly, the Asset Browser might infer a column as varchar(20) even though later records show that the column ought to be varchar(100).

To resolve this issue:
  • Profile the source data to generate better metadata.
  • Change all columns to varchar(1024) and gradually narrow down the data type (a quick profiling sketch follows this list).
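
To see how wide a column actually gets across the whole file, rather than only the first 1000 records, you can profile it with a quick command. A minimal sketch, assuming a comma-separated file named data.csv with a header row and the column of interest in field 2:

awk -F, 'NR > 1 && length($2) > max { max = length($2) } END { print max }' data.csv

The printed value suggests a safe varchar length to define for that column before you narrow it further.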

Using sequential files as a source

To use sequential files as a source, you must load files into a project bucket in a specific location. To determine the project bucket location:
  1. Find the project Cloud Object Storage instance.
  2. In the project instance, find the bucket that corresponds to the current project. The bucket name usually has the form: <lowercase-project-name>-donotdelete-<random-string>

    For example: project2021mar01-donotdelete-pr-ifpkjcbk71s36j

    Then, upload the files by specifying DataStage/files/ in the Prefix for object field. (A command-line alternative follows these steps.)
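
Because the project bucket is S3-compatible, one way to upload the files from the command line is with an S3 client. This is a minimal sketch, assuming the AWS CLI is configured with HMAC credentials for the Cloud Object Storage instance; the endpoint URL and bucket name are illustrative:

aws --endpoint-url https://s3.us.cloud-object-storage.appdomain.cloud \
    s3 cp myfile.csv s3://project2021mar01-donotdelete-pr-ifpkjcbk71s36j/DataStage/files/myfile.csv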

Error running jobs with a Parquet file format

You might receive the following error when you try to run a job with a Parquet file format:
Error: CDICO9999E: Internal error occurred: Illegal state error: INTEGER(32,false) can only annotate INT32.
The unsigned 32-bit integer (uint32) and unsigned 64-bit integer (uint64) data types are not supported in the Parquet format that DataStage uses for all of the file connectors. You must use supported data types instead.
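
To check whether a Parquet file contains the unsupported unsigned types before you run the job, you can inspect its schema with a utility such as parquet-tools (assumed here to be installed separately; it is not part of DataStage):

parquet-tools schema myfile.parquet | grep -i uint

Any column that carries an unsigned logical type annotation must be written with a supported signed or wider data type before DataStage can read the file.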