Configuring an Apache HDFS external storage manager
Content Manager OnDemand supports data storage in an Apache Hadoop Distributed File System (HDFS).
The Apache® Hadoop® project develops a variety of open-source software for reliable, scalable, distributed computing. The project includes Apache HDFS, which is a distributed file system that provides high-throughput access to application data. More information on Apache HDFS can be found at: https://hadoop.apache.org/
Updating the ARS.CFG file
Perform these steps to configure Apache HDFS on a z/OS server.
Add the following two entries to the ARS.CFG file:
ARS_HDFS_CONFIG_FILE=/usr/lpp/ars/V10R5M0/config/ars.hdfs
ARS_HDFS_CONFIG_DIR=/usr/lpp/ars/V10R5M0/config
The ARS_HDFS_CONFIG_FILE entry specifies an existing Apache HDFS configuration file which the server uses by default.
The ARS_HDFS_CONFIG_DIR entry specifies the directory in which any alternate configuration files are kept. This directory is used if additional Apache HDFS configuration files are defined. The names of these additional configuration files can be specified when defining storage nodes in Content Manager OnDemand. If no configuration file is specified in the storage node, the default configuration file is used.
The configuration file name and directory path shown in the examples are the recommended values for these entries.
The ARS_STORAGE_MANAGER entry in the ARS.CFG file might also need to be changed. If you specify ARS_STORAGE_MANAGER=CACHE_ONLY, this disables all storage managers supported by Content Manager OnDemand. To configure the Content Manager OnDemand server to use Apache HDFS as a storage manager, the value must be set to the following:
ARS_STORAGE_MANAGER=NO_TSM
- This setting will enable all external storage managers supported by Content Manager OnDemand except Tivoli® Storage Manager, which is not supported on z/OS. This setting is used when the additional software to support Tivoli Storage Manager is not installed and Tivoli Storage Manager is not required as an external storage manager.
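The ARS.CFG requirements described above can be checked before the server is restarted. The following Python sketch is an illustration only and is not part of Content Manager OnDemand. The function names parse_cfg and check_ars_cfg are hypothetical, and the sketch assumes that ARS.CFG holds one KEY=value entry per line and that lines starting with # are comments.

```python
def parse_cfg(text):
    """Parse KEY=value lines into a dict.

    Assumption: blank lines and lines starting with '#' carry no entries.
    """
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def check_ars_cfg(entries):
    """Return a list of problems that would prevent the use of Apache HDFS."""
    problems = []
    # Both HDFS entries must be present in ARS.CFG.
    for key in ("ARS_HDFS_CONFIG_FILE", "ARS_HDFS_CONFIG_DIR"):
        if not entries.get(key):
            problems.append(f"{key} is missing")
    # CACHE_ONLY disables every storage manager, including Apache HDFS.
    if entries.get("ARS_STORAGE_MANAGER") == "CACHE_ONLY":
        problems.append("ARS_STORAGE_MANAGER=CACHE_ONLY disables all storage managers")
    return problems


sample = """\
ARS_HDFS_CONFIG_FILE=/usr/lpp/ars/V10R5M0/config/ars.hdfs
ARS_HDFS_CONFIG_DIR=/usr/lpp/ars/V10R5M0/config
ARS_STORAGE_MANAGER=NO_TSM
"""
print(check_ars_cfg(parse_cfg(sample)))  # prints [] because nothing is wrong
```

An empty list means that the three entries discussed in this section look consistent. The check does not verify that the configuration file or directory actually exists on the server.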
Creating an Apache HDFS configuration file
An Apache HDFS configuration file for Content Manager OnDemand contains entries specific to your Apache HDFS implementation. You specify the location and name of the default configuration file in the ARS_HDFS_CONFIG_FILE entry of the ARS.CFG file. Required entries must be specified. Optional entries are needed in the configuration file only if their default values need to be changed.
The following list describes the entries that can be specified in an Apache HDFS configuration file.
- ARS_HDFS_SERVER
- Specifies the Apache HDFS server name. Do not include http:// or https:// in the name. This entry is required.
- ARS_HDFS_PORT
- Specifies the Apache HDFS server port number. This entry is optional if using a standard port. Content Manager OnDemand assumes port 80 for HTTP or port 443 for HTTPS communications.
- ARS_HDFS_TLD
- Specifies the Apache HDFS top-level directory name. This is any additional path information after the server name and port in the URL. This entry is optional.
- ARS_HDFS_USE_SSL
- Indicates whether or not to use SSL in server communications.
The possible values are:
0
- SSL will not be used
1
- SSL will be used
- ARS_HDFS_AUTH_TYPE
- Specifies the user authentication type. The possible values are:
NONE
- Open system
KNOX
- Access and authenticate through Apache Knox
- ARS_HDFS_CONNECT_TIMEOUT
- Specifies the maximum number of seconds that Content Manager OnDemand waits for a response from the storage manager. The default is 60. This entry is optional. Warning: Setting this value too low might cause connection failures.
- ARS_HDFS_FILE_PERMS
- Specifies the permissions for new files. The default is 440. This entry is optional.
- ARS_HDFS_HLD
- Specifies the high-level directory name. This attribute is available to group sets of Content Manager OnDemand data together which might be needed if sharing external storage among multiple Content Manager OnDemand servers. Warning: Once this value is set, it must not be changed. If it is changed, any data that is already stored will not be retrievable. There is no default value. This entry is optional.
For example, if the URL of the Apache HDFS server is http://hdfs.example.com/webhdfs/v1, the Apache HDFS configuration file contains:
ARS_HDFS_SERVER=hdfs.example.com
ARS_HDFS_TLD=/webhdfs/v1
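To show how the server, port, top-level directory, and SSL entries relate to the URL in this example, the following Python sketch reassembles a base URL from a configuration file. It is an illustration under stated assumptions, not a description of how Content Manager OnDemand builds its requests internally. The function names parse_hdfs_config and base_url are hypothetical, the treatment of # lines as comments is assumed, and the sketch treats a missing ARS_HDFS_USE_SSL entry as 0 because the example uses an http URL without that entry.

```python
def parse_hdfs_config(text):
    """Parse KEY=value lines; '#' comment lines are an assumption."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            entries[key.strip()] = value.strip()
    return entries


def base_url(entries):
    """Compose scheme://server[:port][/top-level-directory] from the entries."""
    server = entries.get("ARS_HDFS_SERVER", "")
    if not server:
        raise ValueError("ARS_HDFS_SERVER is required")
    if server.startswith(("http://", "https://")):
        raise ValueError("ARS_HDFS_SERVER must not include http:// or https://")

    # Assumption: a missing ARS_HDFS_USE_SSL entry means SSL is not used.
    use_ssl = entries.get("ARS_HDFS_USE_SSL", "0") == "1"
    scheme = "https" if use_ssl else "http"

    # When no port is given, the standard port (80 or 443) is implied by
    # the scheme, so it is left out of the URL.
    port = entries.get("ARS_HDFS_PORT", "")
    host = f"{server}:{port}" if port else server

    tld = entries.get("ARS_HDFS_TLD", "")
    if tld and not tld.startswith("/"):
        tld = "/" + tld
    return f"{scheme}://{host}{tld}"


config_text = """\
ARS_HDFS_SERVER=hdfs.example.com
ARS_HDFS_TLD=/webhdfs/v1
"""
print(base_url(parse_hdfs_config(config_text)))  # http://hdfs.example.com/webhdfs/v1
```

Running the same function against a planned configuration file is a quick way to confirm that the entries describe the URL you intend before you define a storage node.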
Defining an Apache HDFS storage node with the Administrator client
You can define the settings for using the Apache HDFS access method on the Add a Primary Node dialog of the OnDemand Administrator client.
The Storage Node field is not used for communication with the Apache HDFS server and can be set to any name you choose.
The Logon field is the user name from the Apache HDFS system which Content Manager OnDemand uses to store and retrieve data. A password might not be required for open Apache HDFS systems, so this field is optional.
The Access Method radio button is set to Apache HDFS.
The Configuration File Name defaults to the value specified by the ARS_HDFS_CONFIG_FILE parameter in the ARS.CFG file if no value is entered. Otherwise, Content Manager OnDemand looks for the configuration file in the directory defined by the ARS_HDFS_CONFIG_DIR parameter specified in the ARS.CFG file.
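The lookup rule for the Configuration File Name field can be summarized in a few lines of Python. This is a minimal sketch of the rule as described in this section, not the server's actual implementation. The function name resolve_hdfs_config and the alternate file name ars_archive.hdfs are hypothetical.

```python
import posixpath  # z/OS UNIX paths use forward slashes


def resolve_hdfs_config(ars_cfg, node_config_name=""):
    """Return the path of the Apache HDFS configuration file for a storage node.

    ars_cfg is a dict holding the ARS.CFG entries. node_config_name is the
    value of the Configuration File Name field, or "" if it was left empty.
    """
    if not node_config_name:
        # No name in the storage node: use the default configuration file.
        return ars_cfg["ARS_HDFS_CONFIG_FILE"]
    # Otherwise look for the named file in the alternate directory.
    return posixpath.join(ars_cfg["ARS_HDFS_CONFIG_DIR"], node_config_name)


ars_cfg = {
    "ARS_HDFS_CONFIG_FILE": "/usr/lpp/ars/V10R5M0/config/ars.hdfs",
    "ARS_HDFS_CONFIG_DIR": "/usr/lpp/ars/V10R5M0/config",
}
print(resolve_hdfs_config(ars_cfg))                      # default file
print(resolve_hdfs_config(ars_cfg, "ars_archive.hdfs"))  # hypothetical alternate file
```

With the recommended ARS.CFG values, the first call returns /usr/lpp/ars/V10R5M0/config/ars.hdfs and the second returns /usr/lpp/ars/V10R5M0/config/ars_archive.hdfs.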