Known issues and limitations for DataStage

The following known issues and limitations apply to DataStage.

Known issues
General
Stages
Connectors
Pipelines
Limitations

Known issues

Known issues for general areas:

Connection import through .zip files may be skipped

Applies to: 4.7.3 and later

During a .zip file import, a connection that has been renamed and whose contents have changed may still be skipped if it previously had the same name and contents as an existing connection.

REST API calls to DataStage services may fail during upgrade

Applies to: 4.7.0 and later

Jobs may fail during upgrades when the PX Runtime pods processing the jobs are replaced. REST API calls made to DataStage services during upgrade may need to be retried.
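For example, a client can retry transient failures automatically. The route, token, and the /data_intg/v3/data_intg_flows endpoint below are placeholders for whatever request your automation makes; the retry options are the point of the sketch.
# Illustrative only: retry a DataStage REST call up to 5 times, 30 seconds apart
curl --retry 5 --retry-delay 30 -H "Authorization: Bearer ${CPD_TOKEN}" "https://${CPD_ROUTE}/data_intg/v3/data_intg_flows?project_id=${PROJECT_ID}"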

Scale info cannot be added to timestamp data

Applies to: 4.7.0 to 4.7.3

Fixed in: 4.7.4

The scale value cannot be added to data of type timestamp and is removed from migrated data.

Canvas fails to open after upgrading to 4.7.x from 4.5.x or 4.6.x

Applies to: 4.7.0 and later

Canvas fails to open with a 404 error after upgrading from 4.5.x or 4.6.x to 4.7.x. Workaround: Update the DataStage CR to change the value of the shutdown flag from "false" to false.
oc -n ${PROJECT_CPD_INST_OPERANDS} patch datastage datastage --type merge -p '{"spec":{"shutdown": false}}'
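To confirm the change took effect before reopening the canvas, you can read the flag back (an optional check, not part of the documented workaround):
oc -n ${PROJECT_CPD_INST_OPERANDS} get datastage datastage -o jsonpath='{.spec.shutdown}'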
Data asset displays created time instead of last modified time

Applies to: 4.7.3

Data assets created with the Sequential file stage may not update the Last modified field when their metadata is modified. Note that the value of Last modified represents the time when the metadata of the asset was last modified, not the last modified time of the physical file.

Items in the selection dropdown for mapping input columns do not get selected in Firefox

Applies to: 4.7.1 and 4.7.2

Fixed in: 4.7.3

In Firefox, the selection dropdown for mapping input columns does not select items consistently; you might have to select an item more than once. Workaround: Use Chrome.

Files fail if connected by two or more headers to deleted files in file sets

Applies to: 4.7.1 and later

If a file in a file set is deleted, any files connected to the deleted file by two or more headers fail.

Migrating IADataRule: No more than two output links supported

Applies to: 4.7.1 and later

Migrated flows with the Data Rule stage fail to run if the stage has more than two output links. Workaround: Remove all but two output links.

DataStage CR reports Failed status after upgrade from 4.5.0-4.5.2 to 4.7.x

Applies to: 4.7.1 and later

If you upgrade from version 4.5.0, 4.5.1, or 4.5.2 to 4.7.x, the 4.5.x DataStage operator remains active, causing the CR to report a Failed status with an error message. Workaround: Uninstall the 4.5.x operator.
# set the component - datastage_ent or datastage_ent_plus
export COMPONENTS=datastage_ent
# set the old operator namespace - ibm-common-services for express installs
export OLD_OPERATOR_NS=ibm-common-services
./cpd-cli manage delete-olm-artifacts --cpd_operator_ns=${OLD_OPERATOR_NS} --components=${COMPONENTS}
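After the old operator is removed, you can check that the CR no longer reports Failed. The dsStatus field name below is an assumption based on the DataStage CR; verify it against your installed version.
oc -n ${PROJECT_CPD_INST_OPERANDS} get datastage datastage -o jsonpath='{.status.dsStatus}'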
Five or more PXRuntime instances cannot be created without increasing operator memory

Applies to: 4.7.0 and later

To create five or more PXRuntime instances, you must update the cluster service version (CSV) and increase the memory limit of the operator pod. Workaround: Get the DataStage CSV in the operator namespace.
oc -n ${PROJECT_CPD_OPS} get csv | grep ibm-cpd-datastage-operator
Patch the CSV to increase operator pod memory to 2Gi.
oc -n ${PROJECT_CPD_OPS} patch csv <datastage-csv-name> --type='json' -p='[{"path": "/spec/install/spec/deployments/0/spec/template/spec/containers/0/resources/limits/memory", "value": "2Gi", "op": "replace"}]'
Migrating Azure storage connector: Property read_mode for stage has an invalid value

Applies to: 4.7.0 and later

Migrated flows with the Azure storage connector fail to compile if the usage property read_mode is set to Download, List containers/fileshares, or List files. The selected read mode will be unavailable.

Known issues for stages:

Transformer stage fails to compile with large amounts of generated code

Applies to: 4.7.2

Flows with large amounts of generated and nested code in the Transformer stage fail to compile due to resource limits. Workaround: Increase the PX runtime resource limits.
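As one possible sketch, the PXRuntime instance can be moved to a larger predefined scale configuration; the instance name ds-px-default and the size value below are assumptions, so confirm the scale options and instance name for your deployment first.
# Example only: scale up the PX runtime instance (name and size are placeholders)
oc -n ${PROJECT_CPD_INST_OPERANDS} patch pxruntime ds-px-default --type merge -p '{"spec":{"scaleConfig":"medium"}}'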

Known issues for connectors:

See also Known issues for Common core services for the issues that affect connectors that are used for other services in Cloud Pak for Data.


Some connectors do not support flow connections

Applies to: 4.7.0 and later

The following connectors do NOT support the <Flow connection> option:

  • IBM Cognos® Analytics
  • IBM Data Virtualization Manager for z/OS®
  • IBM Db2® on Cloud
  • IBM Match 360
  • IBM Watson® Query
  • Microsoft Azure Data Lake Storage
  • Storage volume
Teradata (optimized) connection form shows SSL certificate field for all SSL modes

Applies to: 4.7.2 and later

The SSL certificate field applies only for these SSL modes:
  • Verify-CA (encrypted - verify CA)
  • Verify-Full (encrypted - verify CA and hostname)
When you select the Input method Paste certificate text, the SSL certificate field shows for all SSL modes. If you select a different SSL mode, do not enter an SSL certificate.
Teradata (optimized) connection form shows "Certificate 1" section

Applies to: 4.7.2 and later

The Teradata (optimized) connection form shows a Certificate 1 section with a Secret field. This section is visible regardless of whether you use secrets from a vault or paste in the certificate. You can ignore this section.
Test connection successful with invalid SSL certificate for the Teradata (optimized) connection

Applies to: 4.7.2 and later

For the Teradata (optimized) connection, if you use an invalid SSL certificate that is signed by a public authority, the Test connection action is successful. However, the job will fail and the log will show an SSL certificate error.
Stored procedures inside a package are not supported in the Oracle connector

Applies to: 4.7.0 and 4.7.1

Fixed in: 4.7.2

Calling stored procedures that are inside a package in an Oracle database is not supported.

Stored procedures inside a package require that the schema name be included in the Procedure name field in the Oracle connector

Applies to: 4.7.2

Fixed in: 4.7.3

If you are calling a stored procedure that is inside a package from an Oracle database, the schema name must be included in the Procedure name field of the stage. For example, SCHEMA-NAME.PROCEDURE-NAME.
Jobs with Exasol data fail after you upgrade

Applies to: 4.7.0

Fixed in: 4.7.1

If you upgrade Cloud Pak for Data from version 4.6.x or earlier to 4.7.0, the execution of any existing jobs that use an Exasol connection will fail.

Workaround: Recompile the job.
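Recompilation can be done from the flow canvas, or with the dsjob command-line plug-in; the flags below are illustrative, so confirm them with cpdctl dsjob compile --help for your version.
cpdctl dsjob compile --project <project-name> --name <flow-name>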

Cannot export a relationship data asset in a job that has multiple IBM Match 360 source connectors

Applies to: 4.7.0

Fixed in: 4.7.1

If you have multiple Match 360 connectors for the source data in a DataStage job and any of those connectors includes a relationship data asset, the record and entity data are exported instead of the relationship asset.

Salesforce.com (optimized) connection does not support vaults

Applies to: 4.7.0 and later

The Input method Use secrets from a vault is not supported for the Salesforce.com (optimized) connection.

SCRAM-SHA-256 authentication method is not supported for the ODBC MongoDB data source

Applies to: 4.7.0 and later

If you create an ODBC connection for a MongoDB data source that uses the SCRAM-SHA-256 authentication method, the job will fail.

Workaround: Change the server-side authentication to SCRAM-SHA-1. Alternatively, use the MongoDB connection or the Generic JDBC connection.
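For example, on a stand-alone mongod the accepted mechanisms can be restricted at startup; how you apply this in your deployment (configuration file, operator, or managed service) will differ.
# Example only: restrict the server to SCRAM-SHA-1
mongod --auth --setParameter authenticationMechanisms=SCRAM-SHA-1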

Known issues for DataStage in Pipelines

These known issues are DataStage-specific. For known issues in Pipelines not listed here, see Known issues and limitations for Watson Pipelines. For DataStage-specific limitations, see Migrating and constructing pipeline flows for DataStage.

Existing jobs containing Run Bash script may break

Applies to: 4.7.1 and later

Due to a fix made in 4.7.1, trailing \n characters are no longer removed from the output of Run Bash script. Jobs that were created in 4.7.0 might break as a result. Workaround: See Known issues and limitations for Watson Pipelines.
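If a pipeline built on 4.7.0 depended on the old behavior, one option is to trim the newline inside the script before emitting the output; a generic bash sketch with a placeholder command:
# Example: emit the result without a trailing newline
result=$(my_command)    # command substitution drops trailing newlines
printf '%s' "$result"   # printf '%s' does not add one back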

Redundant environments created on import

Applies to: 4.7.0 and later

Migration adds environment variables both as local parameters and as environments with a $ sign prefix on the Run DataStage job or Run Pipeline job node. Workaround: Remove the redundant environments.

Custom job run names are not supported in Pipelines

Applies to: 4.7.0 and later

Specific pipeline job runs cannot be given names because DSJobInvocationID is not supported in pipeline jobs.

The Encrypted data type is not supported in pipeline jobs

Applies to: 4.7.0 and later

Local parameters of the Encrypted type are not supported in Pipelines.

Sequence job fails to import with error: Input name too long

Applies to: 4.7.0

Fixed in: 4.7.1

Sequence jobs fail to import if any user variable names or parameter names are longer than 36 bytes. For successfully migrated or new pipeline jobs, names longer than 50 bytes result in an error.

Non-ASCII characters are not recognized in several Watson Pipelines fields

Applies to: 4.7.0 and later

In the following fields in Watson Pipelines, non-ASCII characters cannot be used:
  • Pipeline/job parameter name
  • User variable name
  • Environment variable name
  • Output variable name
  • Email address
No more than 250 nodes can be used in an orchestration flow

Applies to: 4.7.0 and later

An orchestration flow that contains more than 250 nodes will not work. 250 is also the maximum number of loop iterations that a pipeline will run.

Limitations

Limitations for general areas:

Project import and export does not retain the data in file sets and data sets

Applies to: 4.7.1 and later

Project-level import and export does not package file set and data set data into the .zip file, so flows that use data sets and file sets will fail to run after they are imported.
File sets with data over 500 MB are not exported

Applies to: 4.7.1 and later

If the size of the actual files in a file set is more than 500 MB, no data will be stored in the exported .zip file.
Function libraries do not support the const char* return type

Applies to: 4.7.1 and later

User-defined functions with the const char* return type are not supported.
Status updates are delayed in completed job instances

Applies to: 4.7.0 and later

When multiple instances of the same job are run, some instances continue to display a "Running" status for 8-10 minutes after they have completed. Workaround: Use the dsjob jobrunclean command and specify a job name and run-id to delete an active job.
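For example, with placeholder project, job, and run ID values (confirm the options with cpdctl dsjob jobrunclean --help):
cpdctl dsjob jobrunclean --project <project-name> --name <job-name> --run-id <run-id>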
Node pool constraint is not supported

Applies to: 4.7.0 and later

The node pool constraint property is not supported.
Reading FIFOs on persistent volumes across pods causes stages to hang

Applies to: 4.7.0 and later

Reading FIFOs on persistent volumes across pods is not supported and causes the stage reading the FIFO to hang. Workaround: Constrain the job to a single pod by setting APT_WLM_COMPUTE_PODS=1.
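The variable can be set in the job's environment variables before the run; if your dsjob plug-in supports passing environment values on the command line, a run might look like the following (the --env option is an assumption, so verify it with cpdctl dsjob run --help).
# Assumption: your cpdctl dsjob version accepts environment values at run time
cpdctl dsjob run --project <project-name> --job <job-name> --env APT_WLM_COMPUTE_PODS=1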
Unassigned environment variables and parameter sets are not migrated

Applies to: 4.7.0 and later

Environment variables and parameter sets that have not been assigned a value will be skipped during export. When jobs are migrated, they contain only those environment variables and parameter sets that have been assigned a value for that job.
Limitations for connectors:

Previewing data and using the asset browser to browse metadata do not work for these connections:

Applies to: 4.7.0 and later

  • Apache Cassandra (optimized)
  • Apache HBase
  • IBM MQ
"Test connection" does not work for these connections:

Applies to: 4.7.0 and later

  • Apache Cassandra (optimized)