HDFS encryption

IBM Storage Scale already offers built-in encryption support. HDFS level encryption is also supported for the IBM Storage Scale HDFS Transparency connector.

It is important to understand the difference between HDFS level encryption and the built-in encryption of IBM Storage Scale. HDFS level encryption is applied per user, whereas built-in encryption is applied per node. Therefore, if the use case demands finer-grained control at the user level, use HDFS level encryption. However, if you enable HDFS level encryption, you lose in-place analytics benefits such as accessing the same data through both HDFS and POSIX/NFS.

HDFS level encryption is supported from HDFS Transparency 3.0.0-0 and 2.7.3-4. It requires Ranger and Ranger KMS, and it has been tested only on the Hortonworks stack. If you plan to enable it on open source Apache Hadoop, enable it on native HDFS first and confirm that it works before you switch from native HDFS to HDFS Transparency.

To enable native HDFS encryption, set the following properties in gpfs-site.xml from the Ambari GUI:

Configuration              Value (default)
gpfs.encryption.enabled    true (false)
gpfs.ranger.enabled        true (true)
Note: From HDFS Transparency 3.1.0-6 and 3.1.1-3, ensure that the gpfs.ranger.enabled field is set to scale. The scale option replaces the original true/false values.
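
For illustration, the two settings might appear in gpfs-site.xml roughly as follows. This is a minimal sketch that assumes gpfs-site.xml uses the usual Hadoop-style property layout. On an Ambari-managed cluster, change the values through the Ambari GUI rather than editing the file by hand.

```xml
<!-- Sketch only: assumes the standard Hadoop property layout for gpfs-site.xml -->
<configuration>
  <property>
    <name>gpfs.encryption.enabled</name>
    <!-- default is false -->
    <value>true</value>
  </property>
  <property>
    <name>gpfs.ranger.enabled</name>
    <!-- from HDFS Transparency 3.1.0-6 and 3.1.1-3, use "scale" instead of "true" -->
    <value>true</value>
  </property>
</configuration>
```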

Known limits