Watchdog timer
The IBM® MQ Appliance has a baseboard management controller (BMC) that provides a watchdog timer.
The watchdog timer allows you to detect and recover from a serious malfunction on the appliance, even if the appliance is at a remote location. When the appliance is running normally, the appliance firmware informs the BMC that all is well every few seconds. If the BMC receives no such notification for a specified time (by default, twenty minutes), it restarts the appliance.
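The expiry rule described above can be modeled in a few lines of shell. This is only a simplified sketch of the decision, not the BMC firmware's actual logic. The function name watchdog_expired and its arguments are hypothetical.

```shell
# Simplified model of the watchdog decision; not the BMC firmware's actual logic.
# Arguments: time of last heartbeat, current time, timeout (all in seconds).
watchdog_expired() {
    last_heartbeat=$1
    now=$2
    timeout=${3:-1200}   # 1200 seconds is the twenty-minute default
    [ $(( now - last_heartbeat )) -ge "$timeout" ]
}

# One second before the default timeout elapses, then exactly at the timeout.
if watchdog_expired 0 1199; then echo "restart"; else echo "keep running"; fi
if watchdog_expired 0 1200; then echo "restart"; else echo "keep running"; fi
```

In this model, each notification from the firmware would update the last heartbeat time, so the timeout is only reached when notifications stop.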
If you want to change the default behavior of the watchdog timer, or implement some of the other available features, you can configure the BMC.
You use the Intelligent Platform Management Interface (IPMI) to configure the BMC. Commands that are sent over IPMI are independent of the appliance CPU, firmware, and operating system. The BMC can still be accessed when the appliance is powered off (provided that it is plugged into power).
You must meet the following requirements before you can configure the BMC:
- You must use the mgt0 interface on the appliance for your IPMI connection. See IPMI LAN channel commands.
- You must create a special IPMI user. See IPMI user commands.
- You require a remote system, for example a Linux® host, that runs a suitable IPMI client. The examples use a Linux command-line IPMI client called ipmitool; see https://linux.die.net/man/1/ipmitool.
Examples
The following examples show basic watchdog timer configuration, by using ipmitool commands.
The following command queries the current state of the watchdog timer:
ipmitool -L operator -I lanplus -H ipmi_channel_IP -U ipmi_user \
-P ipmi_password mc watchdog get
In this command:
- ipmi_channel_IP is the IP address that you allocated to the appliance when you configured the IPMI interface on mgt0.
- ipmi_user is the name of the IPMI user that you configured on the appliance.
- ipmi_password is the password for the IPMI user.
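The ipmitool man page advises against passing the password on the command line with -P. As a sketch, the connection options from the examples can be wrapped in a small shell function that uses the ipmitool -E option, which reads the password from the IPMI_PASSWORD environment variable. The function name bmc and the variables BMC_HOST and BMC_USER are hypothetical names chosen for this illustration.

```shell
# Hypothetical wrapper around the connection options used in the examples.
# BMC_HOST and BMC_USER are assumed variable names, not defined by the appliance.
# With -E, ipmitool reads the password from the IPMI_PASSWORD environment variable.
bmc() {
    ipmitool -L operator -I lanplus \
        -H "${BMC_HOST:?set BMC_HOST to the IPMI channel IP address}" \
        -U "${BMC_USER:?set BMC_USER to the IPMI user name}" \
        -E "$@"
}

# Example usage (requires ipmitool and a reachable BMC):
#   export BMC_HOST=ipmi_channel_IP BMC_USER=ipmi_user IPMI_PASSWORD=ipmi_password
#   bmc mc watchdog get
```

With such a wrapper, each example command shortens to the part after the connection options, for example bmc mc watchdog get.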
The command returns output similar to the following example:
Watchdog Timer Use: SMS/OS (0x44)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: Hard Reset (0x01)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x00
Initial Countdown: 1200 sec
Present Countdown: 1199 sec
The key fields in the output are:
- Watchdog Timer Is: Reports whether the watchdog timer is currently running.
- Watchdog Timer Actions: Describes the action that is taken when the timer reaches 0. The default is to restart the appliance (Hard Reset).
- Initial Countdown: The total time, in seconds, that the timer waits before it expires.
- Present Countdown: The time, in seconds, that remains on the timer.
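For scripted monitoring, individual fields can be extracted from this output. The following is a minimal sketch that assumes the "name: value" layout shown in the example output. The helper name watchdog_field is hypothetical and not part of ipmitool, and the sample text is copied from the example rather than read from a live BMC.

```shell
# Print the value of one field from `mc watchdog get` output read on stdin.
# watchdog_field is a hypothetical helper name, not part of ipmitool.
watchdog_field() {
    awk -F': *' -v key="$1" '
        {
            name = $1
            sub(/^[ \t]+/, "", name)   # trim leading whitespace
            sub(/[ \t]+$/, "", name)   # trim trailing whitespace
        }
        name == key { print $2; exit }
    '
}

# Sample text copied from the example output; a live script would pipe
# the output of the ipmitool command into watchdog_field instead.
sample='Watchdog Timer Use: SMS/OS (0x44)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: Hard Reset (0x01)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x00
Initial Countdown: 1200 sec
Present Countdown: 1199 sec'

remaining=$(printf '%s\n' "$sample" | watchdog_field 'Present Countdown')
echo "${remaining%% *}"   # strip the unit, leaving the number of seconds
```

Matching on the full field name keeps the lookup exact, so "Initial Countdown" and "Present Countdown" are never confused.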
The following command stops the watchdog timer:
ipmitool -L operator -I lanplus -H ipmi_channel_IP -U ipmi_user \
-P ipmi_password mc watchdog off
The command returns the following message:
Watchdog Timer Shutoff successful -- timer stopped
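The following command sends a warm reset request to the BMC:
ipmitool -L operator -I lanplus -H ipmi_channel_IP -U ipmi_user \
-P ipmi_password mc reset warm
The command returns the following message:
Sent warm reset command to MC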