Heap allocation errors with InfoSphere DataStage Parallel Jobs on AIX

The way to correct heap allocation errors depends on which version of InfoSphere® DataStage® you are running.

Symptoms

InfoSphere DataStage parallel jobs end with the following error message:
APT_BadAlloc: Heap allocation failed.

Causes

AIX® divides memory address space into segments. If an InfoSphere DataStage job must allocate more memory than the available segments can hold, the job ends with a heap allocation error or a failure to allocate memory.

Resolving the problem

Verify that ulimit is not restricting the memory allocation of the process. Check the ulimit from within InfoSphere DataStage itself, because the value can differ from the value that an interactive shell reports. To capture ulimit for all nodes, use the following process:

  1. Create a new Parallel job.
  2. Add an External Source stage (under File on the palette) and connect it to a Peek stage (under Development/Debug on the palette).
  3. Open the advanced properties of the External Source stage, and make sure that it is running in Parallel mode.
  4. In the External Source stage, enter ulimit -a; ulimit -aH (without quotation marks) in the Source Program property, and define a single column of type VarChar with a length of 255.
  5. Use a configuration file that includes at least one node for each fast name (host) in your cluster or grid.
  6. Compile the job, run it, and look in the Director log.
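The Source Program command from step 4 can also be tried by hand to see what kind of output to expect in the log. The following is a minimal sketch, not part of the product: the function name report_limits and the labels it prints are illustrative assumptions. Output from an interactive shell can differ from what the job processes report, so treat it only as a preview of the format.

```shell
# Hypothetical helper: prints the soft and hard limits with labels,
# using the same two ulimit calls as the Source Program property.
report_limits() {
    echo "node: $(uname -n)"      # host name, to tell nodes apart
    echo "--- soft limits ---"
    ulimit -a                     # soft limits (the default view)
    echo "--- hard limits ---"
    ulimit -aH                    # hard limits
}

report_limits
```

In the job itself, the plain ulimit -a; ulimit -aH command is sufficient, because the Director log already identifies the node that produced each row.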

The Director log contains the soft limits and the hard limits for each node in the configuration file. If the hard limit for data is too low, contact your AIX administrator to increase that value, which is set in the /etc/security/limits file. After the hard limit is increased, you can set the ulimit settings for the user in the ds.rc file that is located under $DSHOME/sample. For example, add a line like ulimit -d unlimited at the beginning of the file, after the umask settings.
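A line such as ulimit -d unlimited fails if the hard limit is still lower than the requested value. As an illustration only, the following sketch raises the soft data limit to whatever the hard limit currently allows. The function name raise_data_limit is an assumption made for this example and is not part of the shipped ds.rc file.

```shell
# Hypothetical helper: raise the soft data-segment limit up to the
# current hard limit. The hard limit itself is left unchanged.
raise_data_limit() {
    hard=$(ulimit -H -d)          # a number, or "unlimited"
    ulimit -S -d "$hard"          # raising soft up to hard is always permitted
    echo "data limit: soft=$(ulimit -S -d) hard=$hard"
}

raise_data_limit
```

If the reported hard value is still too small for your jobs, the fix belongs in /etc/security/limits, as described above.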

The ds.rc file is owned by root and is writable only by root, so your system administrator must make the change. For security reasons, do not change the owner or grant write permission to any non-root user.

Important: Do not set the number of file descriptors to unlimited with the ulimit -n command. That setting causes a problem with InfoSphere DataStage. Instead, ensure that the value for this limit is set sufficiently high. A value of 100000 is safe in nearly all situations.
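The file descriptor guidance can be expressed as a quick check. This is a sketch under stated assumptions: the function name check_nofiles and its three result strings are invented for illustration, and the threshold simply reuses the value of 100000 mentioned above.

```shell
# Hypothetical check: classify a file-descriptor limit value.
# "unlimited" (or -1, as written in /etc/security/limits) is flagged.
check_nofiles() {
    case "$1" in
        unlimited|-1)
            echo "unlimited" ;;
        *)
            if [ "$1" -ge 100000 ]; then
                echo "ok"
            else
                echo "review"     # lower than the suggested safe value
            fi ;;
    esac
}

check_nofiles "$(ulimit -n)"
```

A result of review does not by itself indicate a problem. It only means that the limit is lower than the suggested safe value.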

InfoSphere DataStage Version 7.5.x

InfoSphere DataStage is a 32-bit application in all 7.5.x releases, even when it is installed on an AIX server with a 64-bit kernel. To obtain the maximum amount of process address space for your parallel job processes, set the LDR_CNTRL variable to MAXDATA=0x80000000@DSA, either as the default value at the project level (for all jobs in a project) or within specific jobs.

Important: Do not add LDR_CNTRL to your dsenv file. That setting might interfere with the memory model used by the Server Engine.

InfoSphere DataStage Version 8.0.x

InfoSphere DataStage is a 32-bit application in all 8.0.x releases, even when it is installed on an AIX server with a 64-bit kernel. Starting with the Information Server 8.0 GA release, InfoSphere DataStage starts Java™ components to integrate with the services tier. For these Java components to function properly, the LDR_CNTRL=MAXDATA=0x60000000@USERREGS environment variable is added to the dsenv file. To ensure the correct operation of the Java components, do not remove or modify this variable.

For parallel jobs that require more than 1.5 GB of memory per process, the LDR_CNTRL variable can be set to a larger value. To apply the value to all jobs in a project, give the variable a default value at the project level. To apply it to specific jobs only, leave the project default value blank and assign a value in those jobs. As stated previously, do not alter LDR_CNTRL within the dsenv file.

To obtain the maximum amount of process address space for your job processes, set the LDR_CNTRL variable with a value of MAXDATA=0x80000000@DSA in your job or as a project default.

InfoSphere DataStage Version 8.1.x

Starting with the 8.1 GA release, InfoSphere DataStage is a 64-bit application and requires a 64-bit AIX kernel. The osh executable is compiled with the MAXDATA=0x80000000 property, so the amount of memory address space available to the parallel job process is limited to 2 GB in the default configuration. Because it is a 64-bit application, more segments and a larger private memory address space can be allocated. For situations where large amounts of heap memory are required for each process, set LDR_CNTRL to MAXDATA=0x0000001000000000. This value allocates up to 64 GB for private data for each process.

Set this large value at the job level rather than at the project level to avoid large consumption of memory by jobs where you did not intentionally want this behavior.
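The MAXDATA values in this document are byte counts written in hexadecimal. As a worked check of the sizes quoted above, the following sketch converts each value to megabytes. The function name maxdata_mb is an assumption made for this example.

```shell
# Hypothetical helper: convert a hexadecimal MAXDATA byte count to MB.
maxdata_mb() {
    echo $(( $1 / 1048576 ))      # 1048576 bytes = 1 MB
}

for v in 0x60000000 0x80000000 0x0000001000000000; do
    echo "MAXDATA=$v is $(maxdata_mb "$v") MB"
done
# MAXDATA=0x60000000 is 1536 MB           (1.5 GB)
# MAXDATA=0x80000000 is 2048 MB           (2 GB)
# MAXDATA=0x0000001000000000 is 65536 MB  (64 GB)
```

The last value needs 64-bit shell arithmetic, which current ksh and bash versions provide.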

InfoSphere DataStage Version 8.5

As in release 8.1, InfoSphere DataStage is a 64-bit application and requires a 64-bit AIX kernel. A significant improvement in this release is that the MAXDATA parameter is removed from the executable. With this change, InfoSphere DataStage can access all of the available memory address segments in the default configuration. After you upgrade to 8.5, modify any jobs or projects that specify LDR_CNTRL with the MAXDATA parameter to remove this parameter, so that they can access all of the segments.
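When you review job and project settings after the upgrade, the test is simply whether an LDR_CNTRL value still contains a MAXDATA entry. A minimal sketch of that test follows. The function name check_ldr_cntrl and its result strings are illustrative assumptions, not product features.

```shell
# Hypothetical check: flag LDR_CNTRL values that still carry MAXDATA.
check_ldr_cntrl() {
    case "$1" in
        *MAXDATA=*) echo "remove MAXDATA" ;;
        *)          echo "ok" ;;
    esac
}

check_ldr_cntrl "MAXDATA=0x80000000@DSA"   # prints "remove MAXDATA"
check_ldr_cntrl "USERREGS"                 # prints "ok"
```

The check applies to the values that you set for jobs and projects. It does not decide which values belong in the dsenv file.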

Important: The LDR_CNTRL=USERREGS environment variable must not be removed from the dsenv file. The variable is required for correct operation of Java components that are loaded by the InfoSphere DataStage processes. The USERREGS property does not affect the memory usage of InfoSphere DataStage jobs.