Known issues and limitations for DataStage
The following known issues and limitations apply to DataStage.
- Known issues
-
- General
-
- Connection import through .zip files may be skipped
- REST API calls to DataStage services may fail during upgrades when PX Runtime pods are replaced
- Scale info cannot be added to timestamp data
- Canvas fails to open after upgrading to 4.7.x from 4.5.x or 4.6.x
- Data asset displays created time instead of last modified time
- Items in the selection dropdown for mapping input columns do not get selected in Firefox
- Files fail if connected by two or more headers to deleted files in file sets
- Migrating IADataRule: No more than two output links supported
- DataStage CR reports Failed status after upgrade from 4.5.0-4.5.2 to 4.7.x
- Five or more PXRuntime instances cannot be created without increasing operator memory
- Migrating Azure storage connector: Property read_mode for stage has an invalid value
- Stages
- Connectors
-
- Some connectors do not support flow connections
- Teradata (optimized) connection form shows SSL certificate field for all SSL modes
- Teradata (optimized) connection form shows "Certificate 1" section
- Test connection successful with invalid SSL certificate for the Teradata (optimized) connection
- Stored procedures inside a package are not supported in the Oracle connector
- Stored procedures inside a package require that the schema name be included in the Procedure name field in the Oracle connector
- Jobs with Exasol data fail after you upgrade
- Cannot export a relationship data asset in a job that has multiple IBM® Match 360 source connectors
- Salesforce.com (optimized) connection does not support vaults
- SCRAM-SHA-256 authentication method is not supported for the ODBC MongoDB data source
- Pipelines
-
- Existing jobs containing Run Bash script may break
- Redundant environments created on import
- Custom job run names are not supported in Pipelines
- The Encrypted data type is not supported in pipeline jobs
- Sequence job fails to import with error: Input name too long
- Non-ASCII characters are not recognized in several Watson Pipeline fields
- No more than 250 nodes can be used in an orchestration flow
- Limitations
Known issues
- Known issues for general areas:
-
- Connection import through .zip files may be skipped
-
Applies to: 4.7.3 and later
During a .zip file import, a connection whose name and contents have been changed might still be skipped if it previously had the same name and contents as an existing connection.
- REST API calls to DataStage services may fail during upgrade
-
Applies to: 4.7.0 and later
Jobs may fail during upgrades when the PX Runtime pods processing the jobs are replaced. REST API calls made to DataStage services during upgrade may need to be retried.
- Scale info cannot be added to timestamp data
-
Applies to: 4.7.0 to 4.7.3
Fixed in: 4.7.4
The scale value cannot be added to data of type timestamp and is removed from migrated data.
- Canvas fails to open after upgrading to 4.7.x from 4.5.x or 4.6.x
-
Applies to: 4.7.0 and later
Canvas fails to open with a 404 error after upgrading from 4.5.x or 4.6.x to 4.7.x. Workaround: Update the DataStage CR to change the value of the shutdown flag from the string "false" to the boolean false:
oc -n ${PROJECT_CPD_INST_OPERANDS} patch datastage datastage --type=merge -p '{"spec":{"shutdown": false }}'
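The distinction matters because JSON treats the quoted "false" as a non-empty string rather than a boolean. A minimal local sketch of the difference (no cluster required; how the operator parses the field is an assumption here):

```shell
# Illustration only: a JSON string "false" is not the boolean false.
# A consumer that checks the field's truthiness (as the operator
# presumably does) treats the quoted form as "shutdown enabled".
python3 - <<'EOF'
import json
quoted  = json.loads('{"shutdown": "false"}')["shutdown"]
boolean = json.loads('{"shutdown": false}')["shutdown"]
print(type(quoted).__name__, bool(quoted))    # str True   -> misread as enabled
print(type(boolean).__name__, bool(boolean))  # bool False -> correct
EOF
```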
- Data asset displays created time instead of last modified time
-
Applies to: 4.7.3
Data assets created with sequential file may not update the Last modified field when their metadata is modified. Note that the value of Last modified represents the time when the metadata of the asset was last modified, not the last modified time of the physical file.
- Items in the selection dropdown for mapping input columns do not get selected in Firefox
-
Applies to: 4.7.1 and 4.7.2
Fixed in: 4.7.3
In Firefox, the selection dropdown for mapping input columns does not select consistently. You may have to select more than once. Workaround: use Chrome.
- Files fail if connected by two or more headers to deleted files in file sets
-
Applies to: 4.7.1 and later
If a file in a file set is deleted, any files connected to the deleted file by two or more headers fail.
- Migrating IADataRule: No more than two output links supported
-
Applies to: 4.7.1 and later
Migrated flows with the Data Rule stage fail to run if the stage has more than two output links. Workaround: Remove all but two output links.
- DataStage CR reports Failed status after upgrade from 4.5.0-4.5.2 to 4.7.x
-
Applies to: 4.7.1 and later
The 4.5.x DataStage operator remains active after upgrading from 4.5.0, 4.5.1, or 4.5.2 to 4.7.x, causing the CR to report a Failed status with an error message. Workaround: Uninstall the 4.5.x operator.
# set the component - datastage_ent or datastage_ent_plus
export COMPONENTS=datastage_ent
# set the old operator namespace - ibm-common-services for express installs
export OLD_OPERATOR_NS=ibm-common-services
./cpd-cli manage delete-olm-artifacts --cpd_operator_ns=${OLD_OPERATOR_NS} --components=${COMPONENTS}
- Five or more PXRuntime instances cannot be created without increasing operator memory
-
Applies to: 4.7.0 and later
To create five or more PXRuntime instances, you must update the CSV to increase the memory limit of the operator pod. Workaround: Get the DataStage cluster service version (CSV) in the operator namespace, then patch the CSV to increase the operator pod memory limit to 2Gi.
oc -n ${PROJECT_CPD_OPS} get csv | grep ibm-cpd-datastage-operator
oc -n ${PROJECT_CPD_OPS} patch csv <datastage-csv-name> --type='json' -p='[{"path": "/spec/install/spec/deployments/0/spec/template/spec/containers/0/resources/limits/memory", "value": "2Gi", "op": "replace"}]'
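The patch is a JSON Patch replace operation addressed by a path. A local sketch of how that path resolves inside the CSV document (toy data, no cluster required):

```shell
# Illustration only: apply the same JSON Patch path by hand to a toy
# fragment shaped like the CSV, to show which field the patch rewrites.
python3 - <<'EOF'
csv = {"spec": {"install": {"spec": {"deployments": [
    {"spec": {"template": {"spec": {"containers": [
        {"resources": {"limits": {"memory": "1Gi"}}}
    ]}}}}
]}}}}

path = ("/spec/install/spec/deployments/0/spec/template"
        "/spec/containers/0/resources/limits/memory")
node = csv
keys = path.strip("/").split("/")
for k in keys[:-1]:
    # numeric path segments index into lists, others into objects
    node = node[int(k)] if isinstance(node, list) else node[k]
node[keys[-1]] = "2Gi"  # the "replace" value

limits = (csv["spec"]["install"]["spec"]["deployments"][0]["spec"]
          ["template"]["spec"]["containers"][0]["resources"]["limits"])
print(limits["memory"])  # 2Gi
EOF
```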
- Migrating Azure storage connector: Property read_mode for stage has an invalid value
-
Applies to: 4.7.0 and later
Migrated flows with the Azure storage connector fail to compile if the usage property read_mode is set to Download, List containers/fileshares, or List files. The selected read mode is unavailable after migration.
- Known issues for stages:
-
- Transformer stage fails to compile with large amounts of generated code
-
Applies to: 4.7.2
Flows with large amounts of generated and nested code in the Transformer stage fail to compile due to resource limits. Workaround: Increase the PX-runtime resource limits.
- Known issues for connectors:
-
See also Known issues for Common core services for the issues that affect connectors that are used for other services in Cloud Pak for Data.
- Some connectors do not support flow connections
-
Applies to: 4.7.0 and later
The following connectors do not support the Flow connection option:
- IBM Cognos® Analytics
- IBM Data Virtualization Manager for z/OS®
- IBM Db2® on Cloud
- IBM Match 360
- IBM Watson® Query
- Microsoft Azure Data Lake Storage
- Storage volume
- Teradata (optimized) connection form shows SSL certificate field for all SSL modes
Applies to: 4.7.2 and later
The SSL certificate field applies only for these SSL modes:
- Verify-CA (encrypted - verify CA)
- Verify-Full (encrypted - verify CA and hostname)
- Teradata (optimized) connection form shows "Certificate 1" section
Applies to: 4.7.2 and later
The Teradata (optimized) connection form shows a Certificate 1 section with a Secret field. This section is visible regardless of whether you use secrets from a vault or paste in the certificate. You can ignore this section.
- Test connection successful with invalid SSL certificate for the Teradata (optimized) connection
Applies to: 4.7.2 and later
For the Teradata (optimized) connection, if you use an invalid SSL certificate that is signed by a public authority, the Test connection action is successful. However, the job will fail and the log will show an SSL certificate error.
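Because Test connection does not catch the problem, it can help to verify the certificate locally with openssl before configuring the connection. This is a generic sketch, not DataStage-specific; a throwaway self-signed certificate stands in for your real server certificate and CA bundle so the example is runnable:

```shell
# Generate a throwaway self-signed certificate so the check is runnable;
# with a real connection, verify the server cert against your CA bundle.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=example" -keyout key.pem -out cert.pem 2>/dev/null
# "cert.pem: OK" means the cert chains to the given CA bundle; a bad or
# mismatched cert reports an error here rather than at job run time.
openssl verify -CAfile cert.pem cert.pem
rm -f key.pem cert.pem
```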
- Stored procedures inside a package are not supported in the Oracle connector
Applies to: 4.7.0 and 4.7.1
Fixed in: 4.7.2
Calling stored procedures that are inside a package in an Oracle database is not supported.
- Stored procedures inside a package require that the schema name be included in the Procedure name field in the Oracle connector
Applies to: 4.7.2
Fixed in: 4.7.3
If you are calling a stored procedure that is inside a package from an Oracle database, the schema name must be included in the Procedure name field of the stage. For example, SCHEMA-NAME.PROCEDURE-NAME.
- Jobs with Exasol data fail after you upgrade
Applies to: 4.7.0
Fixed in: 4.7.1
If you upgrade Cloud Pak for Data from version 4.6.x or earlier to 4.7.0, the execution of any existing jobs that use an Exasol connection will fail. Workaround: Recompile the job.
- Cannot export a relationship data asset in a job that has multiple IBM Match 360 source connectors
Applies to: 4.7.0
Fixed in: 4.7.1
If you have multiple Match 360 connectors for the source data in a DataStage job and any of those Match 360 connectors includes a relationship data asset, then instead of a relationship asset, the record and entity data are exported.
- Salesforce.com (optimized) connection does not support vaults
-
Applies to: 4.7.0 and later
The Input method Use secrets from a vault is not supported for the Salesforce.com (optimized) connection.
- SCRAM-SHA-256 authentication method is not supported for the ODBC MongoDB data source
-
Applies to: 4.7.0 and later
If you create an ODBC connection for a MongoDB data source that uses the SCRAM-SHA-256 authentication method, the job will fail.
Workaround: Change the server-side authentication to SCRAM-SHA-1. Alternatively, use the MongoDB connection or the Generic JDBC connection.
- Known issues for DataStage in Pipelines
-
These known issues are DataStage-specific. For known issues in Pipelines not listed here, see Known issues and limitations for Watson Pipelines. For DataStage-specific limitations, see Migrating and constructing pipeline flows for DataStage.
Limitations
- Limitations for general areas:
-
- Project import and export does not retain the data in file sets and data sets
Applies to: 4.7.1 and later
Project-level imports and exports do not package file set and data set data into the .zip file, so flows that use data sets and file sets will fail to run after export.
- File sets for data over 500 MB are not exported
Applies to: 4.7.1 and later
If the total size of the files in a file set is more than 500 MB, no data is stored in the exported .zip file.
- Function libraries do not support the const char* return type
Applies to: 4.7.1 and later
User-defined functions with the const char* return type are not supported.
- Status updates are delayed in completed job instances
Applies to: 4.7.0 and later
When multiple instances of the same job are run, some instances continue to display a "Running" status for 8-10 minutes after they have completed. Workaround: Use the dsjob jobrunclean command and specify a job name and run ID to delete an active job.
- Node pool constraint is not supported
Applies to: 4.7.0 and later
The node pool constraint property is not supported.
- Reading FIFOs on persistent volumes across pods causes stages to hang
Applies to: 4.7.0 and later
Reading FIFOs on persistent volumes across pods is not supported and causes the stage reading the FIFO to hang. Workaround: Constrain the job to a single pod by setting APT_WLM_COMPUTE_PODS=1.
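The hang follows from FIFO semantics: opening a FIFO for reading blocks until some process opens it for writing on the same kernel, which never happens when the writer runs in another pod. A minimal local illustration (not cluster-specific):

```shell
# A FIFO read blocks in open() until a writer appears on the same host.
tmp=$(mktemp -d)
mkfifo "$tmp/pipe"

# No writer: the read would block forever, so we bound it with timeout.
if timeout 1 cat "$tmp/pipe" >/dev/null 2>&1; then
  echo "read completed"
else
  echo "read blocked"            # this is the cross-pod behavior
fi

# Writer on the same host: the read completes immediately.
echo "data" > "$tmp/pipe" &
timeout 5 cat "$tmp/pipe"        # prints "data"
wait
rm -rf "$tmp"
```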
- Unassigned environment variables and parameter sets are not migrated
Applies to: 4.7.0 and later
Environment variables and parameter sets that have not been assigned a value will be skipped during export. When jobs are migrated, they contain only those environment variables and parameter sets that have been assigned a value for that job.
- Limitations for connectors:
-
- Previewing data and using the asset browser to browse metadata do not work for these connections:
-
Applies to: 4.7.0 and later
- Apache Cassandra (optimized)
- Apache HBase
- IBM MQ
- "Test connection" does not work for these connections:
-
Applies to: 4.7.0 and later
- Apache Cassandra (optimized)