Greenplum parallel file distribution program (gpfdist)

The Greenplum Connector stage exchanges data with the Greenplum server by using the Greenplum file distribution program, which is called gpfdist.

The gpfdist program runs on the database client, and it must be installed on the InfoSphere Information Server engine tier computer. To transfer data by using the gpfdist protocol, a network route must allow bidirectional access by IP address between the engine tier computer and the Greenplum server. A DNS server can optionally be used for name resolution.

The connector invokes a gpfdist process on every physical computer node and creates the external table. The host of the external table data is identified by the fastname entry in the parallel engine configuration file ($APT_CONFIG_FILE).

For the connector to invoke gpfdist on each engine tier computer, the location of gpfdist (%GPHOME_LOADERS%\bin) must be in the system path. In addition, the location of the libraries that gpfdist depends on (%GPHOME_LOADERS%\lib) must be in the system library path. On Windows, the PATH system environment variable is updated in the Advanced system settings. On Linux, the PATH environment variable is updated in the dsenv script.
Note: On Windows, the Greenplum installer adds %GPHOME_LOADERS%\bin and %GPHOME_LOADERS%\lib to the PATH system environment variable. Verify that these directories are in the PATH.
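
One way to verify these settings is with a short script that is run in the same environment as the engine. The following Python sketch is illustrative only and is not part of the product. It assumes that the GPHOME_LOADERS environment variable is set, and that on Linux the library path is held in LD_LIBRARY_PATH. The function names find_gpfdist and missing_path_entries are hypothetical helpers.

```python
import os
import shutil


def find_gpfdist(path_value=None):
    """Return the full path of gpfdist if it is found on the search path, else None.

    When path_value is None, the PATH of the current process is searched.
    """
    return shutil.which("gpfdist", path=path_value)


def missing_path_entries(required_dirs, path_value):
    """Return the directories in required_dirs that are absent from a PATH-style string."""
    entries = {
        os.path.normcase(os.path.normpath(p))
        for p in path_value.split(os.pathsep)
        if p
    }
    return [
        d for d in required_dirs
        if os.path.normcase(os.path.normpath(d)) not in entries
    ]


if __name__ == "__main__":
    loaders_home = os.environ.get("GPHOME_LOADERS")
    if loaders_home is None:
        print("GPHOME_LOADERS is not set in this environment")
    else:
        bin_dir = os.path.join(loaders_home, "bin")
        lib_dir = os.path.join(loaders_home, "lib")
        if os.name == "nt":
            # On Windows, both directories are expected in PATH.
            path_value = os.environ.get("PATH", "")
            print("Missing from PATH:",
                  missing_path_entries([bin_dir, lib_dir], path_value))
        else:
            # Assumption: LD_LIBRARY_PATH is the library path variable on Linux.
            print("Missing from PATH:",
                  missing_path_entries([bin_dir], os.environ.get("PATH", "")))
            print("Missing from LD_LIBRARY_PATH:",
                  missing_path_entries([lib_dir],
                                       os.environ.get("LD_LIBRARY_PATH", "")))
        print("gpfdist resolved to:", find_gpfdist())
```

On Linux, the script reflects the engine environment only if it is started from a shell in which the dsenv script has already been sourced.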

To add %GPHOME_LOADERS%\bin and %GPHOME_LOADERS%\lib to the PATH system environment variable manually, see the topic on setting the library path environment variable.
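
Because the connector starts gpfdist on every physical computer node, it can be useful to list the distinct hosts that the fastname entries in the parallel engine configuration file refer to. The following Python sketch is illustrative only. The sample configuration text and host names are invented, the function name fastname_hosts is hypothetical, and the simple pattern match does not account for comments in the configuration file.

```python
import re

# Matches entries of the form: fastname "hostname"
FASTNAME_RE = re.compile(r'fastname\s+"([^"]+)"')


def fastname_hosts(config_text):
    """Return the distinct fastname hosts in order of first appearance."""
    hosts = []
    for host in FASTNAME_RE.findall(config_text):
        if host not in hosts:
            hosts.append(host)
    return hosts


# Invented sample: three logical nodes on two physical computers.
SAMPLE_CONFIG = '''
{
  node "node1" { fastname "etl-host-a" pools "" }
  node "node2" { fastname "etl-host-a" pools "" }
  node "node3" { fastname "etl-host-b" pools "" }
}
'''

print(fastname_hosts(SAMPLE_CONFIG))  # ['etl-host-a', 'etl-host-b']
```

To apply the same check to a real configuration, read the file that $APT_CONFIG_FILE points to and pass its contents to fastname_hosts. Each host that is returned must have gpfdist in its system path.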