HDFS name node events
The following table lists the events that are created for the HDFS name node component.
Event | Event Type | Severity | Call Home | Details |
---|---|---|---|---|
hdfs_namenode_active | STATE_CHANGE | INFO | no | Message: HDFS NameNode service state for HDFS cluster {0} is ACTIVE. |
Description: The HDFS NameNode service is in ACTIVE state, as expected. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
hdfs_namenode_config_missing | STATE_CHANGE | WARNING | no | Message: HDFS NameNode configuration for cluster {0} is missing. |
Description: The HDFS NameNode configuration for the HDFS cluster is missing on this node. | ||||
Cause: The /usr/lpp/mmfs/hadoop/sbin/mmhdfs config get core-site.xml -k dfs.nameservices command did not report a valid HDFS cluster name. | ||||
User Action: Ensure that the configuration is uploaded by using the command. | ||||
hdfs_namenode_error | STATE_CHANGE | ERROR | no | Message: HDFS NameNode health for HDFS cluster {0} is invalid. |
Description: The HDFS NameNode service has an invalid health state. | ||||
Cause: The /usr/lpp/mmfs/hadoop/sbin/mmhdfs monitor checkHealth -Y command returned with an error. | ||||
User Action: Validate that the HDFS configuration is valid and try to start the NameNode service manually. | ||||
hdfs_namenode_failed | STATE_CHANGE | ERROR | no | Message: HDFS NameNode health for HDFS cluster {0} failed. |
Description: The HDFS NameNode service has failed. | ||||
Cause: The /usr/lpp/mmfs/hadoop/sbin/mmhdfs monitor checkHealth -Y command returned a FAILED healthState. | ||||
User Action: Start the Hadoop NameNode service. | ||||
hdfs_namenode_initializing | STATE_CHANGE | INFO | no | Message: HDFS NameNode service state for HDFS cluster {0} is INITIALIZING. |
Description: The HDFS NameNode service is in INITIALIZING state. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
hdfs_namenode_krb_auth_failed | STATE_CHANGE | WARNING | no | Message: HDFS NameNode check health state failed with kinit error for cluster {0}. |
Description: The Kerberos authentication that is required to query the health state failed. | ||||
Cause: The /usr/lpp/mmfs/hadoop/sbin/mmhdfs monitor checkHealth -Y command failed with rc=2 (kinit error). | ||||
User Action: Ensure that the 'KINIT_KEYTAB' and 'KINIT_PRINCIPAL' Hadoop environment variables are correctly configured in the 'hadoop-env.sh' file. | ||||
hdfs_namenode_ok | STATE_CHANGE | INFO | no | Message: HDFS NameNode health for HDFS cluster {0} is OK. |
Description: The HDFS NameNode service is running. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
hdfs_namenode_process_down | STATE_CHANGE | ERROR | no | Message: HDFS NameNode process for HDFS cluster {0} is down. |
Description: The HDFS NameNode process is down. | ||||
Cause: The /usr/lpp/mmfs/hadoop/bin/hdfs --daemon status namenode command reported that the process is dead. | ||||
User Action: Start the Hadoop NameNode process by using the mmces service start hdfs command. | ||||
hdfs_namenode_process_unknown | STATE_CHANGE | WARNING | no | Message: HDFS NameNode process for HDFS cluster {0} is unknown. |
Description: The HDFS NameNode process is unknown. | ||||
Cause: The /usr/lpp/mmfs/hadoop/bin/hdfs --daemon status namenode command reported unexpected results. | ||||
User Action: Check the HDFS NameNode service and, if needed, restart it by using the mmces service start hdfs command. | ||||
hdfs_namenode_process_up | STATE_CHANGE | INFO | no | Message: HDFS NameNode process for HDFS cluster {0} is OK. |
Description: The HDFS NameNode process is running. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
hdfs_namenode_standby | STATE_CHANGE | INFO | no | Message: HDFS NameNode service state for HDFS cluster {0} is in STANDBY. |
Description: The HDFS NameNode service is in STANDBY state, as expected. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
hdfs_namenode_stopping | STATE_CHANGE | INFO | no | Message: HDFS NameNode service state for HDFS cluster {0} is STOPPING. |
Description: The HDFS NameNode service is in STOPPING state. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
hdfs_namenode_unauthorized | STATE_CHANGE | WARNING | no | Message: HDFS NameNode check health state failed for cluster {0}. |
Description: The health state query failed because of a missing or wrong authentication token. | ||||
Cause: The /usr/lpp/mmfs/hadoop/sbin/mmhdfs monitor checkHealth -Y command failed with rc=3 (permission error). | ||||
User Action: Ensure that the 'KINIT_KEYTAB' and 'KINIT_PRINCIPAL' Hadoop environment variables are configured in the 'hadoop-env.sh' file. | ||||
hdfs_namenode_unknown_state | STATE_CHANGE | WARNING | no | Message: HDFS NameNode service state for HDFS cluster {0} is UNKNOWN. |
Description: The HDFS NameNode service is in UNKNOWN state. | ||||
Cause: The /usr/lpp/mmfs/hadoop/sbin/mmhdfs monitor checkHealth -Y command returned an UNKNOWN serviceState. | ||||
User Action: N/A | ||||
hdfs_namenode_wrong_state | STATE_CHANGE | WARNING | no | Message: HDFS NameNode service state for HDFS cluster {0} is unexpected {1}. |
Description: The HDFS NameNode service state is not as expected. For example, the service is in STANDBY but is supposed to be ACTIVE, or vice versa. | ||||
Cause: The /usr/lpp/mmfs/hadoop/sbin/mmhdfs monitor checkHealth -Y command returned a serviceState that does not match the expected state based on the assigned CES IP attributes. | ||||
User Action: N/A |
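
The Cause entries in the table tie several events to the result of the mmhdfs monitor checkHealth -Y command. The following Python sketch shows one way to picture that mapping when triaging events. It is an illustration only, not the health monitor's implementation. The function name `classify_check_health`, its parameters, the treatment of return code 0 as success, the treatment of any healthState other than FAILED as OK, and the order in which the checks are applied are all assumptions. The return codes 2 and 3, the FAILED healthState, and the serviceState names come from the table. The sketch takes already-parsed values and does not assume any particular format for the -Y output.

```python
from typing import List, Optional

# Return codes named in the Cause entries of the table.
RC_KINIT_ERROR = 2       # hdfs_namenode_krb_auth_failed
RC_PERMISSION_ERROR = 3  # hdfs_namenode_unauthorized

# serviceState values named in the table and their events.
SERVICE_STATE_EVENTS = {
    "ACTIVE": "hdfs_namenode_active",
    "STANDBY": "hdfs_namenode_standby",
    "INITIALIZING": "hdfs_namenode_initializing",
    "STOPPING": "hdfs_namenode_stopping",
    "UNKNOWN": "hdfs_namenode_unknown_state",
}


def classify_check_health(
    rc: int,
    health_state: Optional[str] = None,
    service_state: Optional[str] = None,
    expected_state: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: map a checkHealth result to event names.

    Assumption: rc == 0 means the command succeeded, and any other
    return code besides 2 and 3 maps to hdfs_namenode_error.
    """
    if rc == RC_KINIT_ERROR:
        return ["hdfs_namenode_krb_auth_failed"]
    if rc == RC_PERMISSION_ERROR:
        return ["hdfs_namenode_unauthorized"]
    if rc != 0:
        return ["hdfs_namenode_error"]

    events = []
    # Assumption: every healthState other than FAILED counts as OK.
    if health_state == "FAILED":
        events.append("hdfs_namenode_failed")
    else:
        events.append("hdfs_namenode_ok")

    # Assumption: a mismatch is only reported between ACTIVE and STANDBY,
    # and only when the caller supplies the expected state.
    if (
        expected_state is not None
        and service_state in ("ACTIVE", "STANDBY")
        and service_state != expected_state
    ):
        events.append("hdfs_namenode_wrong_state")
    elif service_state in SERVICE_STATE_EVENTS:
        events.append(SERVICE_STATE_EVENTS[service_state])
    return events
```

For example, `classify_check_health(3)` yields the unauthorized event, which points to the 'KINIT_KEYTAB' and 'KINIT_PRINCIPAL' settings described in the corresponding User Action entry.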