Greenplum parallel file distribution program (gpfdist)
The Greenplum Connector stage exchanges data with the Greenplum server by using the Greenplum file distribution program, which is called gpfdist.
The gpfdist program runs on the database client, and it must be
installed on the InfoSphere Information Server engine tier computer.
To transfer data by using the gpfdist protocol, a network route must
exist that allows bidirectional access by IP address. Optionally, a
DNS server can be used for name resolution. The connector invokes a
gpfdist process on every physical computer node and creates the
external table. The host that serves the external table data is
identified by the fastname entry in the parallel engine configuration
file ($APT_CONFIG_FILE).
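To see which hosts the connector would use, the fastname entries can be listed from the configuration file. The following Python sketch is illustrative only and is not part of the connector. The sample configuration text, the host names, and the helper names fastnames and gpfdist_locations are hypothetical. The port and file name passed to gpfdist_locations are placeholders, and the gpfdist://host:port/path form is the general Greenplum gpfdist location syntax, not necessarily the exact location that the connector generates.

```python
import re

# Hypothetical sample of a parallel engine configuration file. The node
# names, host names, and directories are made up for illustration.
SAMPLE_CONFIG = '''
{
    node "node1"
    {
        fastname "etl-host-1"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node2"
    {
        fastname "etl-host-2"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
}
'''

# Matches: fastname "some-host"
FASTNAME_RE = re.compile(r'\bfastname\s+"([^"]+)"')


def fastnames(config_text):
    """Return the distinct fastname values, in order of first appearance."""
    hosts = []
    for host in FASTNAME_RE.findall(config_text):
        if host not in hosts:
            hosts.append(host)
    return hosts


def gpfdist_locations(hosts, port, file_name):
    """Build one gpfdist location string per host (port and file are assumptions)."""
    return [f"gpfdist://{host}:{port}/{file_name}" for host in hosts]
```

To inspect a real configuration, read the file that $APT_CONFIG_FILE points to and pass its text to fastnames. Several logical nodes on the same computer share one fastname, which is why duplicates are dropped.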
For the connector to invoke gpfdist on each engine tier computer,
the location of gpfdist (%GPHOME_LOADERS%\bin) must be in the system
path. In addition, the location of the libraries that gpfdist depends
on (%GPHOME_LOADERS%\lib) must be in the system library path. On
Windows, update the PATH system environment variable in the Advanced
system settings. On Linux, update the PATH environment variable in
the dsenv script.
Note: On Windows, the Greenplum installer adds %GPHOME_LOADERS%\bin
and %GPHOME_LOADERS%\lib to the PATH system environment variable.
Verify that these directories are in the PATH.
To add %GPHOME_LOADERS%\bin and %GPHOME_LOADERS%\lib to the PATH system environment variable manually, see the topic on setting the library path environment variable.
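The Windows PATH verification that the note asks for can be scripted. The following Python sketch is a minimal illustration, not an IBM or Greenplum utility. The function name missing_loader_dirs is hypothetical, and the sketch assumes Windows-style paths separated by semicolons. It compares directories without regard to case or trailing backslashes and reports which of the two loader directories are absent.

```python
import ntpath


def _norm(directory):
    # Windows paths are case-insensitive. Drop surrounding quotes and
    # trailing separators so that equivalent spellings compare equal.
    return ntpath.normcase(ntpath.normpath(directory.strip().strip('"')))


def missing_loader_dirs(path_value, gphome_loaders):
    """Return the loader directories (bin, lib) that are not in path_value.

    path_value is the text of the PATH variable and gphome_loaders is the
    value of GPHOME_LOADERS. Both are passed in explicitly so that the
    check can be tried out on any computer.
    """
    required = [
        ntpath.join(gphome_loaders, "bin"),
        ntpath.join(gphome_loaders, "lib"),
    ]
    present = {_norm(entry) for entry in path_value.split(";") if entry.strip()}
    return [d for d in required if _norm(d) not in present]
```

On the engine tier computer, the function could be called with os.environ["PATH"] and os.environ["GPHOME_LOADERS"]. An empty list means that both directories are already in the PATH.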