Troubleshooting resource issues

Troubleshoot and resolve resource issues when running analysis jobs.

The issues apply to releases 2.5.0 and 3.0.1 unless stated otherwise.

An administrator must complete the steps to resolve any of the described issues.

Column analysis or data quality analysis jobs fail with a block size error.

If column analysis or data quality analysis jobs fail with the following error, the default transport block size might be too small:
Entry 130: main_program: Step execution finished with status = FAILED.
Entry 225: pxbridge(0),1: Fatal Error: Virtual data set.; output of "pxbridge(0)": the record is too big to fit in a block; the length requested is: 410400, the max block length is: 131072.
Entry 226: node_node2: Player 1 terminated unexpectedly.
Entry 227: main_program: APT_PMsectionLeader(2, node2), player 1 - Unexpected exit status
To resolve the issue, increase the default transport block size (APT_DEFAULT_TRANSPORT_BLOCK_SIZE) by running the following commands in the conductor pod:
. /opt/IBM/InformationServer/Server/DSEngine/dsenv;
/opt/IBM/InformationServer/Server/DSEngine/bin/dsadmin -envset APT_DEFAULT_TRANSPORT_BLOCK_SIZE -value "3073896" ANALYZERPROJECT;
/opt/IBM/InformationServer/Server/DSEngine/bin/dsadmin -listenv ANALYZERPROJECT | grep DEFAULT_TRANSPORT_BLOCK
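The exact pod name depends on your deployment. For example, if the conductor pod is named is-en-conductor-0 (an assumed name; verify it with oc get pods), open a shell in the pod before running the commands:
oc rsh is-en-conductor-0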

Data quality analysis jobs fail with a row or column size error.

When you start a data quality analysis on a data asset with 1,000 or more columns, the job fails with a row or column size error while trying to create the PBROWS table.

To resolve the issue, lower the column limit at which the DQA output table is split into multiple tables. Run the following command in the iis-services pod:
/opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -s -k com.ibm.iis.ia.max.columns.inDQAOutputTable -value 500
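To confirm that the new limit took effect, you can display the property afterwards with the -display (-d) option:
/opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -d -k com.ibm.iis.ia.max.columns.inDQAOutputTable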

Analysis jobs fail with a post-processing timeout.

Analysis jobs fail with an error similar to the following one:
Post processing for request odf-pc-request-b5701251-6c9d-4b76-a85c-84a753d14186_1555492068657 took too long.
To resolve the issue, increase the timeout value for the ODF engine. Run the following command in the iis-services pod:
/opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -s -k com.ibm.iis.ia.server.jobs.postprocessing.timeout -value 1800
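If you prefer not to open an interactive shell, you can run the same command non-interactively with oc exec. The pod name iis-services-0 is an assumption; look up the actual name with oc get pods:
oc exec iis-services-0 -- /opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -s -k com.ibm.iis.ia.server.jobs.postprocessing.timeout -value 1800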

Column analysis jobs fail with a CDIIA0237E error.

Column analysis jobs fail, and error messages similar to the following ones are written to the SystemOut.log file:
[2/29/20 8:50:17:909 UTC] 00000697 .ascential.investigate.shared.logging.SorcererServiceLogging E CDIIA0237E: Failed to create FreqDist Summary for
com.ascential.investigate.exception.CreateFrequencyDistributionException: Failed to create FreqDist Summary for ec1481df.c862f974.13in0qesq.bdo8ama.8rfbhf.j6f1vh4ig58bk80t9bkfc
at com.ascential.investigate.ca.job.ProfileSummaryBuilder2.createColumnAnalysisSummary(ProfileSummaryBuilder2.java:399)
at com.ascential.investigate.ca.job.BaseProfileJob2.doPostProcess(BaseProfileJob2.java:452)
at com.ascential.investigate.utils.jobs.JobProcessor$PostProcessor.run(JobProcessor.java:628)
at com.ibm.iis.isf.j2ee.impl.lwas.thread.LWASScheduledExecutorService$ContextualExecutorImpl.run(LWASScheduledExecutorService.java:91) 
at sun.reflect.GeneratedMethodAccessor240.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:508)
at com.ibm.ws.context.service.serializable.ContextualInvocationHandler.invoke(ContextualInvocationHandler.java:77)
at com.ibm.ws.context.service.serializable.ContextualInvocationHandler.invoke(ContextualInvocationHandler.java:98)
at com.sun.proxy.$Proxy17.run(Unknown Source)
at com.ibm.iis.isf.j2ee.impl.lwas.thread.LWASScheduledExecutorService$ContextualRunnable.run(LWASScheduledExecutorService.java:144)
at com.ibm.ws.concurrent.internal.SubmittedTask.run(SubmittedTask.java:276)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
at java.util.concurrent.FutureTask.run(FutureTask.java:277) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:811)
Caused by: java.lang.NullPointerException: str cannot be null
at org.apache.wink.json4j.JSON.parse(JSON.java:244) 
at org.apache.wink.json4j.JSON.parse(JSON.java:259)
at com.ascential.investigate.ca.job.ProfileSummaryBuilder2.loadCAResults(ProfileSummaryBuilder2.java:1485)
at com.ascential.investigate.ca.job.ProfileSummaryBuilder2.createColumnAnalysisSummary(ProfileSummaryBuilder2.java:374)
... 16 more
To resolve the issue, change the timeout value for the Kafka consumer. Run the following command in the iis-services pod:
/opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -set -k com.ibm.iis.events.kafkaEventConsumer.timeout -value 10000
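After you change the timeout, you can check whether the CDIIA0237E message recurs. The location of SystemOut.log varies by installation, so locate it first:
find /opt/IBM -name SystemOut.log 2>/dev/null
grep CDIIA0237E <path-to-SystemOut.log>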

Analysis or discovery jobs fail with an out-of-memory error.

Analysis or discovery jobs fail with the following error:
DataStage job log:
Entry 131: pxbridge(6),0: The JVM could not be created. Error code:-4 (::createJVM, file CC_JNICommon.cpp, line 652)
To resolve the issue, change heap size settings in the iis-services pod as follows:
/opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -set -key com.ibm.iis.ia.jdbc.connector.heapSize -value 2048;
/opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -set -key com.ibm.iis.ia.engine.javaStage.heapSize -value 1024
2.5.0 only: In addition, you can update the following configuration settings:
  • Increase the Job Count setting in your workload management system from 4 to 10.
  • Change the maximum number of concurrent jobs in the ODF configuration. Complete the following steps in the conductor pod:
    1. Edit the /opt/IBM/InformationServer/ASBNode/conf/odf.properties file and add the following line:
      com.ibm.iis.odf.datastage.max.concurrent.requests=4
    2. Save the file.
    3. Restart the ODF engine by running the following commands:
      service ODFEngine stop
      service ODFEngine start
    The changes are not persisted when the conductor pod is restarted.
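    Because the setting is lost on restart, you might keep the change as a small script that can be reapplied. A minimal sketch that avoids adding the property twice:
      grep -q 'com.ibm.iis.odf.datastage.max.concurrent.requests' /opt/IBM/InformationServer/ASBNode/conf/odf.properties || echo 'com.ibm.iis.odf.datastage.max.concurrent.requests=4' >> /opt/IBM/InformationServer/ASBNode/conf/odf.properties
      service ODFEngine stop
      service ODFEngine start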

Health check of the ODF tracker shows an error.

When you continuously run analysis jobs, the ODF tracker health check fails with the following error:
Tracker could not be restored in 30sec and no new request is seen in ODF
To resolve the issue, complete the following steps:
  1. Stop the ISFServer (WebSphere® Application Server) and ODFEngine services.
  2. Move the /var/lib/ibm/ugdata/kafka/data folder to a backup location, then create a new, empty data folder with the same permissions as the original one.
  3. Stop the Kafka server by running the following command:
    oc scale sts kafka --replicas=0
  4. Restart the Kafka server by running the following command:
    oc scale sts kafka --replicas=1
  5. Restart the ISFServer and ODFEngine services.
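The scale commands in steps 3 and 4 can be combined with a wait so that Kafka is not restarted before the old pod is fully terminated. A sketch, assuming the StatefulSet runs a single pod named kafka-0:
oc scale sts kafka --replicas=0
oc wait --for=delete pod/kafka-0 --timeout=120s
oc scale sts kafka --replicas=1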

Discovery or data quality analysis jobs fail with an out-of-space error.

Discovery or data quality analysis jobs fail with the following error:
buffer(5),0: Fatal Error: Unable to close file: No space left on device
This error can occur when the /tmp directory or the scratch disk fills up during a job run. Check the available space in both locations and increase it if it is not sufficient; an example check follows the configuration snippet. To find the location of the scratch disk, check the /opt/IBM/InformationServer/Server/Configurations/default.apt file for a section similar to this one:
{
   node "node1"
   {
      fastname "dol113pa2wis"
      pools ""
      resource disk "/usr/IBM/InformationServer/Server/Datasets" {pools ""}
      resource scratchdisk "/usr/IBM/InformationServer/Server/Scratch" {pools ""}
   }
}
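With this example configuration, you could check the free space in both locations as follows; substitute the paths from your own default.apt file:
df -h /tmp /usr/IBM/InformationServer/Server/Scratch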