Learn how to enable the Apache Ranger plug-in for HDFS Transparency.
Make sure to meet the following prerequisites before you enable the Apache Ranger plug-in for HDFS Transparency:
- Set up Apache Ranger according to its installation
instructions.
- Install a relational database management system (RDBMS) supported by Apache Ranger, such as MySQL or MariaDB.
- Verify that Ranger Admin, Ranger Usersync, and Ranger TagSync are installed and running
without errors.
- Although Kerberos is not mandatory for installing and using Apache Ranger, it is strongly recommended
that you enable Kerberos in your Hadoop cluster. Kerberos ensures that all requests are authenticated,
which is essential for authorization and auditing. Without Kerberos, users can impersonate
other users and work around any authorization policies.
- Make sure that Apache Solr is configured and running correctly for Apache Ranger.
Apache Ranger uses Apache Solr to store audit logs, and Apache Solr also makes those
logs searchable through the Ranger Admin GUI.
- Stop HDFS Transparency by using the following command:
- To enable Apache Ranger for HDFS Transparency,
log in to one of the HDFS Transparency nodes and change the configuration as described in this
step.
For hadoop-env.sh, set the following configuration:
Note: Based on your environment, substitute the correct path to the Apache Ranger
ranger-hdfs-plugin library.
for f in <ranger_hdfs_plugin_directory>/lib/*; do
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
done
for f in /usr/share/java/mysql-connector-java.jar; do
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
done
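The classpath loops can also be written as one helper that skips paths that do not exist. This is a minimal sketch, not part of the product procedure: `add_to_classpath` is a hypothetical function name, and the directory in the example call is the same placeholder used in this step.

```shell
#!/bin/sh
# Sketch: append every given path to HADOOP_CLASSPATH, skipping paths that
# do not exist. add_to_classpath is a hypothetical helper, not part of
# Hadoop, Apache Ranger, or HDFS Transparency.
add_to_classpath() {
    for f in "$@"; do
        # An unmatched glob stays literal, so test for existence first.
        [ -e "$f" ] || { echo "skipping missing path: $f" >&2; continue; }
        # Add a separating colon only when the variable is already non-empty.
        HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$f"
    done
    export HADOOP_CLASSPATH
}

# Example call (the directory is a placeholder, substitute your own):
# add_to_classpath <ranger_hdfs_plugin_directory>/lib/* /usr/share/java/mysql-connector-java.jar
```

Reporting a missing path on stderr makes a wrong plug-in directory or a missing JDBC driver visible when the environment file is sourced, instead of leaving a silent gap in the classpath.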
For core-site.xml, set the following configuration:
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[2:$1@$0](rangeradmin@<REALM_NAME>)s/(.*)@<REALM_NAME>/ranger/
RULE:[2:$1@$0](rangertagsync@<REALM_NAME>)s/(.*)@<REALM_NAME>/rangertagsync/
RULE:[2:$1@$0](rangerusersync@<REALM_NAME>)s/(.*)@<REALM_NAME>/rangerusersync/
……
DEFAULT
</value>
<final>false</final>
</property>
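To see what one of these rules does, the following sketch emulates a single rule in plain shell. In a rule of the form `RULE:[2:$1@$0](filter)s/pattern/replacement/`, the `[2:$1@$0]` part applies to principals with two components and builds `primary@REALM`, the parenthesized filter selects which of those strings the rule handles, and the substitution produces the local user name. `map_principal`, `EXAMPLE.COM`, and `host.example.com` are illustrative assumptions; Hadoop evaluates the filter as a regular expression, while this sketch uses a literal comparison for simplicity.

```shell
#!/bin/sh
# Emulates one auth_to_local rule of the form
#   RULE:[2:$1@$0](<short>@<REALM>)s/(.*)@<REALM>/<target>/
# map_principal is a hypothetical helper for illustration only.
map_principal() {
    principal=$1; short=$2; realm=$3; target=$4
    # Only two-component principals (primary/instance@REALM) qualify for [2:...].
    case $principal in
        */*@*) ;;
        *) return 1 ;;
    esac
    primary=${principal%%/*}   # $1 in the rule: the first component
    prealm=${principal##*@}    # $0 in the rule: the realm
    base="$primary@$prealm"
    # Filter step (literal comparison here; Hadoop uses a regular expression).
    [ "$base" = "$short@$realm" ] || return 1
    # Substitution step, written as in the rule.
    printf '%s\n' "$base" | sed -E "s/(.*)@$realm/$target/"
}

# Placeholder principal and realm:
map_principal "rangeradmin/host.example.com@EXAMPLE.COM" rangeradmin EXAMPLE.COM ranger
# prints: ranger
```

On a node with the Hadoop client installed and the rules in place, the actual mapping can be checked with `hadoop org.apache.hadoop.security.HadoopKerberosName <principal>`.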
For hdfs-site.xml, set the following configuration:
<property>
<name>dfs.namenode.inode.attributes.provider.class</name>
<value>org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer</value>
<final>false</final>
</property>
- Copy the following configuration files from the Apache Ranger installation directory to the
configuration directory of an HDFS Transparency node (/var/mmfs/hadoop/etc/hadoop). The
enable-hdfs-plugin.sh script generates these files when the Apache Ranger plug-in is enabled.
- ranger-hdfs-audit.xml
- ranger-hdfs-security.xml
- ranger-policymgr-ssl.xml
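The copy can be scripted so that it stops as soon as one of the three files is missing. This is a minimal sketch: `copy_ranger_conf` is a hypothetical helper, and the source directory is an assumption that depends on where enable-hdfs-plugin.sh wrote the files in your environment.

```shell
#!/bin/sh
# Sketch: copy the three generated Ranger plug-in files into a target
# configuration directory. copy_ranger_conf is a hypothetical helper.
copy_ranger_conf() {
    src=$1
    dest=$2
    for name in ranger-hdfs-audit.xml ranger-hdfs-security.xml ranger-policymgr-ssl.xml; do
        # Fail early instead of leaving a partial configuration behind.
        if [ ! -f "$src/$name" ]; then
            echo "missing: $src/$name" >&2
            return 1
        fi
        # -p preserves the mode and timestamps of the generated files.
        cp -p "$src/$name" "$dest/" || return 1
    done
}

# Example call; the first argument is a placeholder:
# copy_ranger_conf <ranger_conf_source_directory> /var/mmfs/hadoop/etc/hadoop
```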
- To synchronize the configuration in all the HDFS Transparency nodes, issue the following
command:
- Create the ranger, rangertagsync, and rangerusersync users and groups by using the
gpfs_create_hadoop_users_dirs.py script.
Log in to a CES HDFS NameNode and run the following commands:
# /usr/lpp/mmfs/hadoop/scripts/gpfs_create_hadoop_users_dirs.py --create-custom-hadoop-user-group ranger
# /usr/lpp/mmfs/hadoop/scripts/gpfs_create_hadoop_users_dirs.py --create-custom-hadoop-user-group rangertagsync
# /usr/lpp/mmfs/hadoop/scripts/gpfs_create_hadoop_users_dirs.py --create-custom-hadoop-user-group rangerusersync
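The three invocations differ only in the user name, so they can be run in a loop that stops at the first failure. This is a minimal sketch: `create_ranger_users` is a hypothetical wrapper, and it assumes that the script returns a nonzero exit status on failure. The script path and the `--create-custom-hadoop-user-group` option are the ones used in the commands above.

```shell
#!/bin/sh
# Sketch: run the user-creation script once per Ranger service account.
# create_ranger_users is a hypothetical wrapper; an optional first argument
# overrides the script path.
create_ranger_users() {
    script=${1:-/usr/lpp/mmfs/hadoop/scripts/gpfs_create_hadoop_users_dirs.py}
    for user in ranger rangertagsync rangerusersync; do
        # Assumes a nonzero exit status signals failure.
        "$script" --create-custom-hadoop-user-group "$user" || {
            echo "failed to create $user" >&2
            return 1
        }
    done
}

# Example call on a CES HDFS NameNode, as root:
# create_ranger_users
```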
- For the changes to take effect, start HDFS Transparency by using the following
command: