Buffered logging and filtering

Learn to configure the buffering and filtering features of the Apache Log4j utility. These features have been shown to reduce the load on the logging subsystem, which indirectly increases HDFS Transparency throughput.

Enabling HDFS Transparency log buffering improves I/O performance because HDFS workloads are no longer blocked while log messages are flushed to disk. Without buffering, flushing log messages to disk can adversely affect the performance of workloads in a busy cluster.

To mitigate these effects, buffered logging based on Apache Log4j was introduced in HDFS Transparency versions 3.2.2-6 and 3.1.1-15. If logging must remain enabled, for example to debug an issue, buffered logging provides better HDFS Transparency throughput than unbuffered logging.

Steps to configure buffered logging

  1. In the /var/mmfs/hadoop/etc/hadoop/log4j.properties configuration file, add or update the following lines:
    log4j.appender.RFA=org.apache.hadoop.hdfs.server.namenode.GPFSRollingFileAppender
    log4j.appender.RFA.bufferedIO=true
    log4j.appender.RFA.bufferSize=8192
    
    Note: Comment out the earlier Apache Log4j appender.
  2. Upload the configuration to IBM Storage Scale CCR by issuing the following command:
    # mmhdfs config upload
  3. If Cloudera is integrated, restart HDFS Transparency from Cloudera Manager.

    Otherwise, restart HDFS Transparency by issuing the following command:

    # mmhdfs hdfs restart
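The edit in step 1 can also be scripted. The following is a minimal sketch, not an IBM-provided tool: the function names set_prop and enable_buffered_logging are hypothetical, and the sketch assumes that each property is written as key=value with no leading whitespace and no spaces around the equals sign. It comments out any earlier definition of a key, as the note in step 1 asks, and appends the new value only if it is not already present, so it is safe to run more than once.

```shell
#!/bin/sh
# Hypothetical helper (not part of HDFS Transparency).
# set_prop FILE KEY VALUE
# Ensures FILE contains exactly one active line "KEY=VALUE".
# Any other active definition of KEY is commented out, not deleted.
set_prop() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v key="$key" -v want="$key=$value" '
        # Keep the first line that already has the desired value.
        $0 == want { if (!seen) { print; seen = 1 }; next }
        # Comment out any other active definition of the same key.
        index($0, key "=") == 1 { print "#" $0; next }
        # Pass every other line through unchanged.
        { print }
        # Append the desired line if it was not found.
        END { if (!seen) print want }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Overwrite in place so that ownership and permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Hypothetical wrapper that applies the three properties from step 1.
# enable_buffered_logging FILE
enable_buffered_logging() {
    set_prop "$1" log4j.appender.RFA \
        org.apache.hadoop.hdfs.server.namenode.GPFSRollingFileAppender &&
    set_prop "$1" log4j.appender.RFA.bufferedIO true &&
    set_prop "$1" log4j.appender.RFA.bufferSize 8192
}
```

After sourcing these functions, calling enable_buffered_logging /var/mmfs/hadoop/etc/hadoop/log4j.properties would apply step 1. Steps 2 and 3 are still required for the change to take effect. Back up the file before trying this on a production cluster.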

Steps to configure filtering

You can selectively filter out messages so that they are not written to the log file. The filtering feature lets you suppress messages that are usually not needed, for example:

log4j.appender.RFA.filter.1=org.apache.log4j.varia.StringMatchFilter
log4j.appender.RFA.filter.1.StringToMatch=Removing expired token
log4j.appender.RFA.filter.1.AcceptOnMatch=false

With this configuration, the log messages that include the text "Removing expired token" are not logged. For information about other filters, see the Apache Log4j template file (/usr/lpp/mmfs/hadoop/template/log4j.properties.template).
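Before adding a filter, it can help to estimate how many messages it would suppress. The following is a rough sketch that uses grep on an existing log file. The function names count_filtered and preview_kept are hypothetical, and the log file path is whatever your deployment uses. In Log4j 1.x, StringMatchFilter tests whether the message contains the string, whereas grep tests the whole line, including the timestamp and logger name, so the result is only an approximation.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Log4j or HDFS Transparency).

# count_filtered FILE STRING
# Prints the number of lines in FILE that contain STRING as a
# literal substring (-F disables regular expression matching).
count_filtered() {
    # grep exits with status 1 when the count is 0; ignore that.
    grep -c -F -- "$2" "$1" || true
}

# preview_kept FILE STRING
# Prints the lines of FILE that do not contain STRING, which
# approximates what would remain with AcceptOnMatch=false.
preview_kept() {
    grep -v -F -- "$2" "$1" || true
}
```

For example, count_filtered with a log file and the string "Removing expired token" reports how many existing lines the filter shown above would have dropped.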