IBM Support

InfoSphere DataStage: Big Data File stage - potential issues sharing libhdfs via NFS mount

Troubleshooting


Problem

When the Hadoop Distributed File System (HDFS) NameNode and the InfoSphere DataStage parallel engine are located on different systems, you must ensure that the Big Data File stage can still access the HDFS JAR files, the HDFS configuration directory, and libhdfs. One method for providing access to these HDFS components is to NFS mount directories from the HDFS server on the parallel engine server. However, special care must be taken when accessing libhdfs by this method.

Symptom

Some HDFS distributions install libhdfs in a system directory (for example, /usr/lib64). Such directories contain other libraries that are important to system operation. Therefore, NFS mounting this directory from a remote system can cause system-wide issues when the library versions on the remote and local systems are mismatched. It is especially problematic if the remote directory is mounted over the existing local /usr/lib64.

If the remote /usr/lib64 is mounted over the local /usr/lib64, even small library version mismatches between the systems can cause system-wide program failures when programs try to load the shared libraries they need from /usr/lib64. Programs such as vi, ls, rpm, and shutdown can all fail with "Failed to load shared library" errors in this configuration. Mounting over the local /usr/lib64 is not recommended.
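To confirm whether anything has been mounted over the local /usr/lib64, you can inspect the mount table. The following commands are a minimal check using standard Linux utilities:

    # List any file system mounted at /usr/lib64 (no output means nothing is mounted over it)
    mount | grep ' /usr/lib64 '

    # Show which file system currently provides /usr/lib64
    df -h /usr/lib64

If these commands show an NFS file system from the HDFS server mounted at /usr/lib64, unmount it and use a separate mount point as described in the resolution below.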

Even if a new mount point is created on the server where the parallel engine is installed (for example, /usr/lib64_hdfs), and the new directory is added only to the LD_LIBRARY_PATH that is used by InfoSphere DataStage (in the dsenv file), version mismatches between the systems can still cause jobs to fail. In this configuration, the failures are limited to InfoSphere DataStage and do not affect other system programs.
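When jobs fail in this configuration, you can check whether the remotely mounted libhdfs resolves cleanly against the local system libraries. The following check is a sketch; the mount point /usr/lib64_hdfs matches the example above, and the library file name libhdfs.so might differ in your HDFS distribution:

    # List the shared libraries that the mounted libhdfs depends on;
    # entries marked "not found", or resolved to unexpected versions,
    # indicate a mismatch between the two systems
    ldd /usr/lib64_hdfs/libhdfs.so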

Resolving The Problem

If the HDFS distribution being used provides libhdfs in a separately installable package, install the package on the server where the parallel engine is installed rather than trying to access libhdfs from the remote system. When the library is installed locally on the server where the parallel engine is installed, mounting /usr/lib64 from the HDFS system is no longer required.
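For example, on RPM-based systems the local installation might look like the following. The package name hadoop-libhdfs is an assumption; check your Hadoop distribution's documentation for the exact package name:

    # Install the libhdfs package locally (package name varies by Hadoop distribution)
    yum install hadoop-libhdfs

    # Verify that the library files are now present on the local system
    rpm -ql hadoop-libhdfs | grep libhdfs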

If libhdfs must be accessed remotely, complete the following steps:

- Ensure that both systems are of the same distribution, release, and patch level (for example, SUSE Linux Enterprise 11 Service Pack 1 installed on both computers, not SUSE Linux Enterprise 11 Service Pack 1 on one computer and Red Hat Enterprise Linux 6 Update 3 on the other computer).

- Create a new mount point directory on the computer where the parallel engine is installed (for example, /usr/lib64_hdfs), and NFS mount the remote library directory over that new mount point.

- Add the new mount point directory only to the LD_LIBRARY_PATH setting in the dsenv file. Do not add the new mount point directory to the system-wide LD_LIBRARY_PATH. You want to confine any possible version mismatch side effects to InfoSphere DataStage. A combined example of these steps follows this list.
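The following commands are a minimal sketch of these steps, run as root on the server where the parallel engine is installed. The host name hdfs-host, the remote path /usr/lib64, and the dsenv location /opt/IBM/InformationServer/Server/DSEngine/dsenv are illustrative; substitute the values for your environment:

    # Confirm that both systems report the same distribution, release, and patch level
    cat /etc/*-release

    # Create a dedicated mount point and NFS mount the remote library directory over it
    mkdir /usr/lib64_hdfs
    mount -t nfs hdfs-host:/usr/lib64 /usr/lib64_hdfs

    # In the dsenv file (for example, /opt/IBM/InformationServer/Server/DSEngine/dsenv),
    # append the new mount point to the LD_LIBRARY_PATH used by InfoSphere DataStage only
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64_hdfs
    export LD_LIBRARY_PATH

Because the change is made only in dsenv, any remaining version mismatch affects InfoSphere DataStage jobs alone and not other programs on the system.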

[{"Product":{"code":"SSVSEF","label":"IBM InfoSphere DataStage"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Not Applicable","Platform":[{"code":"PF016","label":"Linux"}],"Version":"9.1.2.0;9.1.0.1;9.1;11.5;11.3.1.2;11.3.1.1;11.3.1.0;11.3","Edition":"All Editions","Line of Business":{"code":"LOB10","label":"Data and AI"}},{"Product":{"code":"SSZJPZ","label":"IBM InfoSphere Information Server"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Not Applicable","Platform":[{"code":"PF016","label":"Linux"}],"Version":"9.1","Edition":"All Editions","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg21614720