Content Manager OnDemand supports data storage in an Apache Hadoop Distributed File System (HDFS).
The Apache® Hadoop® project develops a variety of open-source software for reliable, scalable, distributed computing. The project includes Apache HDFS, which is a distributed file system that provides high-throughput access to application data. More information on Apache HDFS can be found at: https://hadoop.apache.org/
Perform these steps to configure Apache HDFS on an AIX, Linux, or Linux on System z server.
Add two new entries to the ARS.CFG file. The directory name differs in case between platforms.
On AIX:
ARS_HDFS_CONFIG_FILE=/opt/IBM/ondemand/V10.1/config/ars.hdfs
ARS_HDFS_CONFIG_DIR=/opt/IBM/ondemand/V10.1/config
On Linux and Linux on System z:
ARS_HDFS_CONFIG_FILE=/opt/ibm/ondemand/V10.1/config/ars.hdfs
ARS_HDFS_CONFIG_DIR=/opt/ibm/ondemand/V10.1/config
The ARS_HDFS_CONFIG_FILE entry specifies an existing Apache HDFS configuration file which the server uses by default.
The ARS_HDFS_CONFIG_DIR entry specifies the directory in which any alternate configuration files are kept. This directory is used if additional Apache HDFS configuration files are defined. The names of these additional configuration files can be specified when defining storage nodes in Content Manager OnDemand. If no configuration file is specified in the storage node, the default configuration file is used.
The configuration file name and directory path shown in the examples are the recommended values for these entries.
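The two entries above can be checked with a short script before the server is restarted. The following is a minimal sketch, not part of Content Manager OnDemand: it assumes ARS.CFG consists of KEY=VALUE lines and that lines starting with # are comments, and the function names parse_ars_cfg and missing_hdfs_entries are hypothetical.

```python
# Sketch only: assumes ARS.CFG is a plain KEY=VALUE file and that
# lines starting with '#' are comments (an assumption, not confirmed here).

REQUIRED_HDFS_ENTRIES = ("ARS_HDFS_CONFIG_FILE", "ARS_HDFS_CONFIG_DIR")


def parse_ars_cfg(text):
    """Return a dict of the KEY=VALUE entries found in the given text."""
    entries = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comment lines, and lines without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def missing_hdfs_entries(entries):
    """List the HDFS-related entries that are absent or empty."""
    return [name for name in REQUIRED_HDFS_ENTRIES if not entries.get(name)]


# Sample content using the recommended Linux values from this topic.
sample = """\
ARS_HDFS_CONFIG_FILE=/opt/ibm/ondemand/V10.1/config/ars.hdfs
ARS_HDFS_CONFIG_DIR=/opt/ibm/ondemand/V10.1/config
"""

cfg = parse_ars_cfg(sample)
print(missing_hdfs_entries(cfg))
```

An empty list means both entries are present. In practice the text would be read from the ARS.CFG file of the instance rather than from an inline string.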
The ARS_STORAGE_MANAGER entry in the ARS.CFG file might also need to be changed. Specifying ARS_STORAGE_MANAGER=CACHE_ONLY disables all storage managers supported by Content Manager OnDemand. Two other settings are relevant:
ARS_STORAGE_MANAGER=TSM - ... ARS_STORAGE_MANAGER value is set to TSM.
ARS_STORAGE_MANAGER=NO_TSM - ... Apache HDFS as an external storage manager.
On Windows, the configuration directory and configuration file paths are:
C:\Program Files\IBM\OnDemand\V10.1\config
C:\Program Files\IBM\OnDemand\V10.1\config\ars.hdfs
A sample configuration file is included as part of the installation of Content Manager OnDemand.
An Apache HDFS configuration file for Content Manager OnDemand contains entries specific to your Apache HDFS implementation. You specify the location and name of the default configuration file with the ARS_HDFS_CONFIG_FILE entry in the ARS.CFG file or through the OnDemand Configurator. Required entries must be specified. Optional entries can be omitted from the configuration file unless their values need to be changed.
The following list describes the entries that can be specified in an Apache HDFS configuration file.
ARS_HDFS_SERVER
The Apache HDFS server name. Do not include http:// or https:// in the name. This entry is required.
SSL setting
0 - SSL will not be used
1 - SSL will be used
Authentication setting
NONE - Open system
KNOX - Access and authenticate through Apache Knox
... 60. This entry is optional. Warning: Setting this value too low might cause connection failures.
... 440. This entry is optional.
For example, for the URL http://hdfs.example.com/webhdfs/v1, the Apache HDFS configuration file contains:
ARS_HDFS_SERVER=hdfs.example.com
ARS_HDFS_TLD=/webhdfs/v1
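The split of a URL into the ARS_HDFS_SERVER and ARS_HDFS_TLD values in the example above can be sketched with the Python standard library. This is an illustration only: the function name split_webhdfs_url is hypothetical, and keeping a port number (if one is present) as part of the server value is an assumption that this topic does not confirm.

```python
from urllib.parse import urlsplit


def split_webhdfs_url(url):
    """Split a full URL into the two configuration file entries.

    The scheme (http:// or https://) is dropped, because the server
    name entry must not contain it.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("expected a URL that starts with http:// or https://")
    return {
        # Assumption: a port, if present, stays with the server value.
        "ARS_HDFS_SERVER": parts.netloc,
        "ARS_HDFS_TLD": parts.path,
    }


entries = split_webhdfs_url("http://hdfs.example.com/webhdfs/v1")
for name, value in entries.items():
    print(f"{name}={value}")
```

The printed lines match the two entries shown in the example. Whether SSL is used is controlled by a separate entry, so the scheme itself is not written to the file.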
You can define the settings for using the Apache HDFS access method on the Add a Primary Node dialog of the OnDemand Administrator client.
The Storage Node field is not used for communication with the Apache HDFS server and can be set to any name you choose.
The Logon field is the user name from the Apache HDFS system that Content Manager OnDemand uses to store and retrieve data. A password might not be required for open Apache HDFS systems, so the password is optional.
The Access Method radio button is set to Apache HDFS.
For Content Manager OnDemand servers running on all platforms except Windows, the Configuration File Name defaults to the value specified by the ARS_HDFS_CONFIG_FILE parameter in the ARS.CFG file if no value is entered. If a value is entered, Content Manager OnDemand looks for that configuration file in the directory defined by the ARS_HDFS_CONFIG_DIR parameter in the ARS.CFG file.
For Content Manager OnDemand servers running on Windows, the server uses the Configuration File Name field and the Configuration Directory field that are specified in the OnDemand Configurator instead of the ARS.CFG file parameters.
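The lookup rule for servers that are not running on Windows can be summarized in a few lines. This is a minimal sketch of the rule as described in this topic, not the server's actual implementation; the function name resolve_hdfs_config and the file name alt.hdfs are hypothetical.

```python
import posixpath


def resolve_hdfs_config(ars_cfg, node_config_name=""):
    """Return the configuration file path for a storage node (non-Windows).

    ars_cfg is a dict that holds the ARS.CFG entries.
    node_config_name is the Configuration File Name entered for the
    storage node; it may be empty.
    """
    name = node_config_name.strip()
    if not name:
        # No name entered: fall back to the default configuration file.
        return ars_cfg["ARS_HDFS_CONFIG_FILE"]
    # A name was entered: look for it in the configuration directory.
    return posixpath.join(ars_cfg["ARS_HDFS_CONFIG_DIR"], name)


ars_cfg = {
    "ARS_HDFS_CONFIG_FILE": "/opt/ibm/ondemand/V10.1/config/ars.hdfs",
    "ARS_HDFS_CONFIG_DIR": "/opt/ibm/ondemand/V10.1/config",
}

print(resolve_hdfs_config(ars_cfg))              # default file
print(resolve_hdfs_config(ars_cfg, "alt.hdfs"))  # alternate file (hypothetical name)
```

On Windows the same decision is made from the values set in the OnDemand Configurator, so this sketch does not apply there.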