Known issues and limitations for DataStage

The following known issues and limitations apply to DataStage.

Known issues
General
Stages
Connectors
Pipelines
Limitations

Known issues

Known issues for general areas:

Connection import through .zip files may be skipped

Applies to: 4.7.3 and later

During a .zip file import, a connection that has been renamed and whose contents have changed may still be skipped if it previously had the same name and contents as an existing connection.

REST API calls to DataStage services may fail during upgrade

Applies to: 4.7.0 and later

Jobs may fail during upgrades when the PX Runtime pods processing the jobs are replaced. REST API calls made to DataStage services during upgrade may need to be retried.
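For example, a client can retry transient failures automatically. The route, token, and the /data_intg/v3/data_intg_flows endpoint below are placeholders for whatever request your automation makes; the retry options are the point of the sketch.
# Illustrative only: retry a DataStage REST call up to 5 times, 30 seconds apart
curl --retry 5 --retry-delay 30 -H "Authorization: Bearer ${CPD_TOKEN}" "https://${CPD_ROUTE}/data_intg/v3/data_intg_flows?project_id=${PROJECT_ID}"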

Scale info cannot be added to timestamp data

Applies to: 4.7.0 to 4.7.3

Fixed in: 4.7.4

The scale value cannot be added to data of type timestamp and is removed from migrated data.

Canvas fails to open after upgrading to 4.7.x from 4.5.x or 4.6.x

Applies to: 4.7.0 and later

Canvas fails to open with a 404 error after upgrading from 4.5.x or 4.6.x to 4.7.x. Workaround: Update the DataStage CR to change the value of the shutdown flag from "false" to false.
oc -n ${PROJECT_CPD_INST_OPERANDS} patch datastage datastage --type merge -p '{"spec":{"shutdown": false}}'
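To confirm the change took effect before reopening the canvas, you can read the flag back (an optional check, not part of the documented workaround):
oc -n ${PROJECT_CPD_INST_OPERANDS} get datastage datastage -o jsonpath='{.spec.shutdown}'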
Data asset displays created time instead of last modified time

Applies to: 4.7.3

Data assets created with the Sequential file stage may not update the Last modified field when their metadata is modified. Note that the value of Last modified represents the time when the metadata of the asset was last modified, not the last modified time of the physical file.

Items in the selection dropdown for mapping input columns do not get selected in Firefox

Applies to: 4.7.1 and 4.7.2

Fixed in: 4.7.3

In Firefox, the selection dropdown for mapping input columns does not select items consistently; you might have to select an item more than once. Workaround: Use Chrome.

Files fail if connected by two or more headers to deleted files in file sets

Applies to: 4.7.1 and later

If a file in a file set is deleted, any files connected to the deleted file by two or more headers fail.

Migrating IADataRule: No more than two output links supported

Applies to: 4.7.1 and later

Migrated flows with the Data Rule stage fail to run if the stage has more than two output links. Workaround: Remove all but two output links.

DataStage CR reports Failed status after upgrade from 4.5.0-4.5.2 to 4.7.x

Applies to: 4.7.1 and later

If you upgrade from version 4.5.0, 4.5.1, or 4.5.2 to 4.7.x, the 4.5.x DataStage operator remains active, causing the CR to report a Failed status with an error message. Workaround: Uninstall the 4.5.x operator.
# set the component - datastage_ent or datastage_ent_plus
export COMPONENTS=datastage_ent
# set the old operator namespace - ibm-common-services for express installs
export OLD_OPERATOR_NS=ibm-common-services
./cpd-cli manage delete-olm-artifacts --cpd_operator_ns=${OLD_OPERATOR_NS} --components=${COMPONENTS}
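After the old operator is removed, you can check that the CR no longer reports Failed. The dsStatus field name below is an assumption based on the DataStage CR; verify it against your installed version.
oc -n ${PROJECT_CPD_INST_OPERANDS} get datastage datastage -o jsonpath='{.status.dsStatus}'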
Five or more PXRuntime instances cannot be created without increasing operator memory

Applies to: 4.7.0 and later

To create five or more PXRuntime instances, you must update the cluster service version (CSV) and increase the memory limit of the operator pod. Workaround: Get the DataStage CSV in the operator namespace.
oc -n ${PROJECT_CPD_OPS} get csv | grep ibm-cpd-datastage-operator
Patch the CSV to increase operator pod memory to 2Gi.
oc -n ${PROJECT_CPD_OPS} patch csv <datastage-csv-name> --type='json' -p='[{"path": "/spec/install/spec/deployments/0/spec/template/spec/containers/0/resources/limits/memory", "value": "2Gi", "op": "replace"}]'
Migrating Azure storage connector: Property read_mode for stage has an invalid value

Applies to: 4.7.0 and later

Migrated flows with the Azure storage connector fail to compile if the usage property read_mode is set to Download, List containers/fileshares, or List files. The selected read mode will be unavailable.

Known issues for stages:

Transformer stage fails to compile with large amounts of generated code

Applies to: 4.7.2

Flows with large amounts of generated and nested code in the Transformer stage fail to compile due to resource limits. Workaround: Increase the PX runtime resource limits.
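As one possible sketch, the PXRuntime instance can be moved to a larger predefined scale configuration; the instance name ds-px-default and the size value below are assumptions, so confirm the scale options and instance name for your deployment first.
# Example only: scale up the PX runtime instance (name and size are placeholders)
oc -n ${PROJECT_CPD_INST_OPERANDS} patch pxruntime ds-px-default --type merge -p '{"spec":{"scaleConfig":"medium"}}'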

Known issues for connectors:

See also Known issues for Common core services for the issues that affect connectors that are used for other services in Cloud Pak for Data.


Some connectors do not support flow connections

Applies to: 4.7.0 and later

The following connectors do NOT support the <Flow connection> option:

  • IBM Cognos® Analytics
  • IBM Data Virtualization Manager for z/OS®
  • IBM Db2® on Cloud
  • IBM Match 360
  • IBM Watson® Query
  • Microsoft Azure Data Lake Storage
  • Storage volume
Teradata (optimized) connection form shows SSL certificate field for all SSL modes

Applies to: 4.7.2 and later

The SSL certificate field applies only for these SSL modes:
  • Verify-CA (encrypted - verify CA)
  • Verify-Full (encrypted - verify CA and hostname)
When you select the Input method Paste certificate text, the SSL certificate field shows for all SSL modes. If you select a different SSL mode, do not enter an SSL certificate.
Teradata (optimized) connection form shows "Certificate 1" section

Applies to: 4.7.2 and later

The Teradata (optimized) connection form shows a Certificate 1 section with a Secret field. This section is visible regardless of whether you use secrets from a vault or paste in the certificate. You can ignore this section.
Test connection successful with invalid SSL certificate for the Teradata (optimized) connection

Applies to: 4.7.2 and later

For the Teradata (optimized) connection, if you use an invalid SSL certificate that is signed by a public authority, the Test connection action is successful. However, the job will fail and the log will show an SSL certificate error.
Stored procedures inside a package are not supported in the Oracle connector

Applies to: 4.7.0 and 4.7.1

Fixed in: 4.7.2

Calling stored procedures that are inside a package in an Oracle database is not supported.

Stored procedures inside a package require that the schema name be included in the Procedure name field in the Oracle connector

Applies to: 4.7.2

Fixed in: 4.7.3

If you are calling a stored procedure that is inside a package from an Oracle database, the schema name must be included in the Procedure name field of the stage. For example, SCHEMA-NAME.PROCEDURE-NAME.
Jobs with Exasol data fail after you upgrade

Applies to: 4.7.0

Fixed in: 4.7.1

If you upgrade Cloud Pak for Data from version 4.6.x or earlier to 4.7.0, the execution of any existing jobs that use an Exasol connection will fail.

Workaround: Recompile the job.
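Recompilation can be done from the flow canvas, or with the dsjob command-line plug-in; the flags below are illustrative, so confirm them with cpdctl dsjob compile --help for your version.
cpdctl dsjob compile --project <project-name> --name <flow-name>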

Cannot export a relationship data asset in a job that has multiple IBM Match 360 source connectors

Applies to: 4.7.0

Fixed in: 4.7.1

If you have multiple Match 360 connectors for the source data in a DataStage job and any of those connectors includes a relationship data asset, the record and entity data are exported instead of the relationship asset.

Salesforce.com (optimized) connection does not support vaults

Applies to: 4.7.0 and later

The Input method Use secrets from a vault is not supported for the Salesforce.com (optimized) connection.

SCRAM-SHA-256 authentication method is not supported for the ODBC MongoDB data source

Applies to: 4.7.0 and later

If you create an ODBC connection for a MongoDB data source that uses the SCRAM-SHA-256 authentication method, the job will fail.

Workaround: Change the server-side authentication to SCRAM-SHA-1. Alternatively, use the MongoDB connection or the Generic JDBC connection.
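For example, on a stand-alone mongod the accepted mechanisms can be restricted at startup; how you apply this in your deployment (configuration file, operator, or managed service) will differ.
# Example only: restrict the server to SCRAM-SHA-1
mongod --auth --setParameter authenticationMechanisms=SCRAM-SHA-1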

Known issues for DataStage in Pipelines

These known issues are DataStage-specific. For known issues in Pipelines not listed here, see Known issues and limitations for Watson Pipelines. For DataStage-specific limitations, see Migrating and constructing pipeline flows for DataStage.

Existing jobs containing Run Bash script may break

Applies to: 4.7.1 and later

Due to a fix made in 4.7.1, trailing \n characters are no longer removed from the output of Run Bash script. Jobs that were created in 4.7.0 might break as a result. Workaround: See Known issues and limitations for Watson Pipelines.
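If a pipeline built on 4.7.0 depended on the old behavior, one option is to trim the newline inside the script before emitting the output; a generic bash sketch with a placeholder command:
# Example: emit the result without a trailing newline
result=$(my_command)    # command substitution drops trailing newlines
printf '%s' "$result"   # printf '%s' does not add one back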

Redundant environments created on import

Applies to: 4.7.0 and later

Migration adds environment variables both as local parameters and as environments with a $ sign prefix on the Run DataStage job or Run Pipeline job node. Workaround: Remove the redundant environments.

Custom job run names are not supported in Pipelines

Applies to: 4.7.0 and later

Specific pipeline job runs cannot be given names because DSJobInvocationID is not supported in pipeline jobs.

The Encrypted data type is not supported in pipeline jobs

Applies to: 4.7.0 and later

Local parameters of the Encrypted type are not supported in Pipelines.

Sequence job fails to import with error: Input name too long

Applies to: 4.7.0

Fixed in: 4.7.1

Sequence jobs fail to import if any user variable names or parameter names are longer than 36 bytes. For successfully migrated or new pipeline jobs, names longer than 50 bytes result in an error.

Non-ASCII characters are not recognized in several Watson Pipelines fields

Applies to: 4.7.0 and later

In the following fields in Watson Pipelines, non-ASCII characters cannot be used:
  • Pipeline/job parameter name
  • User variable name
  • Environment variable name
  • Output variable name
  • Email address
No more than 250 nodes can be used in an orchestration flow

Applies to: 4.7.0 and later

An orchestration flow that contains more than 250 nodes will not work. 250 is also the maximum number of loop iterations that a pipeline will run.

Limitations

Limitations for general areas:

Project import and export does not retain the data in file sets and data sets

Applies to: 4.7.1 and later

Project-level import and export does not package file set and data set data into the .zip file, so flows that use data sets and file sets will fail to run after they are imported.
File sets with data over 500 MB are not exported

Applies to: 4.7.1 and later

If the size of the actual files in a file set is more than 500 MB, no data will be stored in the exported .zip file.
Function libraries do not support the const char* return type

Applies to: 4.7.1 and later

User-defined functions with the const char* return type are not supported.
Status updates are delayed in completed job instances

Applies to: 4.7.0 and later

When multiple instances of the same job are run, some instances continue to display a "Running" status for 8-10 minutes after they have completed. Workaround: Use the dsjob jobrunclean command and specify a job name and run-id to delete an active job.
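For example, with placeholder project, job, and run ID values (confirm the options with cpdctl dsjob jobrunclean --help):
cpdctl dsjob jobrunclean --project <project-name> --name <job-name> --run-id <run-id>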
Node pool constraint is not supported

Applies to: 4.7.0 and later

The node pool constraint property is not supported.
Reading FIFOs on persistent volumes across pods causes stages to hang

Applies to: 4.7.0 and later

Reading FIFOs on persistent volumes across pods is not supported and causes the stage reading the FIFO to hang. Workaround: Constrain the job to a single pod by setting APT_WLM_COMPUTE_PODS=1.
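The variable can be set in the job's environment variables before the run; if your dsjob plug-in supports passing environment values on the command line, a run might look like the following (the --env option is an assumption, so verify it with cpdctl dsjob run --help).
# Assumption: your cpdctl dsjob version accepts environment values at run time
cpdctl dsjob run --project <project-name> --job <job-name> --env APT_WLM_COMPUTE_PODS=1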
Unassigned environment variables and parameter sets are not migrated

Applies to: 4.7.0 and later

Environment variables and parameter sets that have not been assigned a value will be skipped during export. When jobs are migrated, they contain only those environment variables and parameter sets that have been assigned a value for that job.
Limitations for connectors:

Previewing data and using the asset browser to browse metadata do not work for these connections:

Applies to: 4.7.0 and later

  • Apache Cassandra (optimized)
  • Apache HBase
  • IBM MQ
"Test connection" does not work for these connections:

Applies to: 4.7.0 and later

  • Apache Cassandra (optimized)