IBM Support

N6200 series issue with NVMEM (nonvolatile memory) battery capacity caused by a FCMTO (Fast Charge Mode Time Out) error condition

Troubleshooting


Problem

On N6200 series systems, when the NVMEM battery reaches the critical low capacity level of 72-hr run-time, the system initiates a 24-hr shutdown sequence. This technote describes how to identify the Fast Charge Mode Time Out (FCMTO) problem and how to download and install the stand-alone battery firmware that fixes it. Service Processor firmware version 1.3 now includes the battery firmware, and installing it is the preferred method.

Symptom

There are several ways this issue can manifest, including warning and critical system notifications, depending on the battery charge level.
  • Warning low limit exceeded 80-hr run-time: BATT_CAP_LOW
    Sun Nov 20 00:00:11 CET [befiler04:nvmem.battery.capacity.low.warn:info]: The NVMEM battery capacity is below normal.
  • Critical low limit exceeded 72-hr run-time: LOW_BATT       
    Mon Nov 21 20:00:00 CET [befiler04: monitor.nvramLowBattery:CRITICAL]: NVRAM battery is dangerously low.
    Mon Nov 21 20:00:00 CET [befiler04: monitor.nvramLowBattery.notice:notice]: If the NVRAM battery is dangerously low, the system shuts down automatically every 24 hours to encourage you to replace it. If you reboot the system it will run for another 24 hours before shutting down. (The 24 hour timeout may be increased by altering the "raid.timeout" value using the "options" command.)
    Mon Nov 21 20:00:00 CET [befiler04: monitor.shutdown.nvramLowBattery.pending:warning]: NVRAM battery is dangerously low. Halting system in 24 hours. Replace the battery immediately!
    Mon Nov 21 20:00:00 CET [befiler04: asup.throttle.drop:info]: Too many autosupport messages in too short a time, throttling autosupport: BATTERY_LOW
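
The notifications above can be triaged automatically. The following Python sketch is an illustration only, not an IBM-supplied tool: it scans syslog lines for the event names shown in the sample messages and reports the worst battery condition found. The function name and the severity labels are assumptions made for this example.

```python
import re

# Event names copied from the sample messages above.
# The severity labels on the right are this sketch's own convention.
EVENTS = {
    "nvmem.battery.capacity.low.warn": "warning",
    "monitor.nvramLowBattery": "critical",
    "monitor.shutdown.nvramLowBattery.pending": "critical",
}

# Matches "[host:event.name:severity]" with an optional space after the host.
TAG = re.compile(r"\[[^:\]]+:\s*([A-Za-z0-9_.]+):[A-Za-z]+\]")


def classify(lines):
    """Return 'critical', 'warning', or 'none' for a list of syslog lines."""
    worst = "none"
    for line in lines:
        match = TAG.search(line)
        if not match:
            continue
        level = EVENTS.get(match.group(1))
        if level == "critical":
            return "critical"
        if level == "warning":
            worst = "warning"
    return worst
```

A result of 'critical' corresponds to the LOW_BATT condition, in which the 24-hr shutdown sequence has already started.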

Cause

This message is sent as long as the battery is below the minimum safe voltage required to protect the data in the event of a power failure or unexpected shutdown.

Environment

Scope of Affected Systems: Deliveries for all N6200 series (N6210, N6240 and N6270) filer and gateway systems prior to March 2012 contain battery firmware which is susceptible to the FCMTO issue.

Diagnosing The Problem

AutoSupport Verification of the Issue

AutoSupport (ASUP) messages from a system can be reviewed to determine the status of the NVMEM battery related to this issue:

  • If the system has detected a ‘warning low’ or ‘critical low battery’ condition, the ENVIRONMENT section of the system ASUP will contain battery sensor details.
  • One symptom of this issue is a run-time below the ‘80-hr Warning’ threshold or below the ‘72-hr Critical’ threshold. Systems above 80 hrs run-time do not produce a message.
  • Another symptom is “Bat Curr” = 0 mA. Given the low capacity level of the battery, the charger should be on, and this sensor confirms that no current is going into the battery.
  • The last symptom is that “Charger Volt” displays 0 mV. In normal operation, it should always be reported as 8200 mV.

Charger Volt    normal   0 mV        --       --       8900 mV     9000 mV
Charger Cycles  normal   0 cycles    --       --       250 cycles  251 cycles
Charger Curr    normal   0 mA        --       --       2200 mA     2300 mA
Bat Temp        normal   20 C        -40 C    1 C      70 C        75 C
Bat Run Time    warnlow  80 hr       72 hr    80 hr    --          --
Bat Capacity    normal   3072 mA*hr  --       --       --          --
Bat Curr        normal   0 mA        --       --       2048 mA     2304 mA
Bat 8.0V        normal   7400 mV     --       --       8900 mV     9000 mV
Bat 1.8V        normal   1818 mV     1612 mV  1625 mV  1973 mV     1999 mV
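
The three symptoms can be checked against ASUP sensor lines with a short script. The Python sketch below is illustrative only. It assumes each line has the layout shown above (sensor name, state, reading, unit, then thresholds) and that the state is a lowercase word such as normal or warnlow. The function names are hypothetical.

```python
import re

# Assumed layout: <sensor name> <state> <reading> <unit> <thresholds...>
# Sensor names start each word with an uppercase letter, states are lowercase.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<state>[a-z]+)\s+(?P<value>-?\d+)\s*(?P<unit>\S+)"
)


def parse_asup_sensors(text):
    """Map sensor name -> (state, reading, unit) for each recognised line."""
    sensors = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            sensors[match.group("name")] = (
                match.group("state"),
                int(match.group("value")),
                match.group("unit"),
            )
    return sensors


def fcmto_symptoms(sensors):
    """Evaluate the three symptoms described in this section."""
    state, run_time, _unit = sensors["Bat Run Time"]
    return {
        # Assumption: any non-normal state counts as a low run-time.
        "run_time_low": state != "normal" or run_time < 80,
        "no_battery_current": sensors["Bat Curr"][1] == 0,
        "charger_off": sensors["Charger Volt"][1] == 0,
    }


sample = """\
Charger Volt   normal   0    mV     --     --      8900 mV     9000 mV
Bat Run Time   warnlow  80   hr     72 hr  80 hr   --          --
Bat Curr       normal   0 mA        --     --      2048 mA     2304 mA
"""
symptoms = fcmto_symptoms(parse_asup_sensors(sample))
```

If all three values in symptoms are true, the system matches the FCMTO pattern described above and should be checked with the procedure in this technote.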

Sometimes, the only information provided in the ENVIRONMENT section of the ASUP is:

Battery: Sensor Bat_Run_Time warning low: current custom is 74 , critical low is 72 , normal low is 80

Systems not on AutoSupport can be interrogated using the SP command system sensors. The details can be found in the procedure provided. 

Resolving The Problem

Solution

This technote describes how to obtain and apply the stand-alone battery firmware. However, the battery firmware is also available as part of Service Processor firmware version 1.3, which is the preferred installation method. See the Service Processor (SP) Firmware for N series Publication Matrix (S7003683) technote for more information on how to obtain and install Service Processor firmware.

Identify the system battery firmware revision to confirm whether a firmware update is required. To do so, follow steps 1-5 of the section How to flash the battery Firmware.

There are two parts to this procedure: Flashing the Battery Firmware and Resetting the FCMTO flag. The FCMTO flag issue can be confirmed by following steps 1-5 of the Resetting the FCMTO flag procedure. For systems which have not encountered the FCMTO flag issue, there is no need to perform the Resetting the FCMTO flag procedure. There are two scenarios:

  • Systems which have the FCMTO flag set:
    Actions: Flash the battery firmware and reset the FCMTO flag
  • Systems which do not have the FCMTO flag issue:
    Actions: Verify that the FCMTO flag issue is not present in the system and flash the battery firmware

Note: Flashing the battery firmware does not clear the FCMTO flag.

Notes:
  • The battery firmware update must be performed prior to resetting the FCMTO flag.
  • This is a disruptive procedure and will require system downtime.
  • For HA systems, a takeover and giveback should be managed in conjunction with the system halt required for this procedure, and this might reduce the operational impact of this procedure for some users.
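
The scenarios and ordering rule above can be summarised as a small decision function. This Python sketch is illustrative only. The function name and action strings are hypothetical. The case of a battery that already has current firmware but still has the flag set is an inference from the note that flashing does not clear the flag.

```python
def plan_actions(fcmto_flag_set, firmware_current):
    """Return the ordered list of actions for one system.

    fcmto_flag_set:   True if the FCMTO symptoms are present.
    firmware_current: True if rev_hardware is already 0x00d3 or higher.
    """
    actions = []
    # The firmware update must come before the flag reset.
    if not firmware_current:
        actions.append("flash battery firmware")
    # Flashing alone does not clear the flag, so the reset is a separate step.
    if fcmto_flag_set:
        actions.append("reset FCMTO flag")
    return actions
```

For example, a system with old firmware and the flag set yields both actions in the required order, while a system with old firmware and no flag yields only the firmware flash.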

How to obtain the Battery Firmware

The Battery Firmware is installed on your N series hardware and is available for download using the "Software Packages ...." link from the Download page of the N series support website. Refer to the important information for N series support for step-by-step instructions explaining how to access software packages as part of the complete instructions for entitlement and registration. When following the step-by-step instructions, select the Service Processor (SP) Firmware software package from the N series and related Host Software Downloads - Pick page. Download the Stand-alone Battery Firmware for N6200 series file (battery-27100027-ifiles.zip) and use the instructions below to save the two Battery Firmware Files, ‘battery-27100027-nexergy.i’ and ‘battery-27100027-energysales.i’, to the <Battery_Firmware_File_Location> on the user’s Web server.

Important: An N6200 series system with network connectivity through the SP port is required. A Web server is also required to apply the battery firmware update.

Download the updated firmware, extract the files from the archive, and place the files on a Web server in a subnet that is accessible by the Service Processor. We will refer to this location as <Battery_Firmware_File_Location>. This URL location must be less than 100 characters.
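
Because of the 100-character limit, it can be useful to check the location before starting the maintenance window. The Python sketch below is illustrative only. It assumes that <Battery_Firmware_File_Location> is given without the http:// prefix and that the limit applies to the location string itself. The function name and the example address are hypothetical.

```python
# File names as given in this technote.
FIRMWARE_FILES = {
    "energysales": "battery-27100027-energysales.i",
    "nexergy": "battery-27100027-nexergy.i",
}
MAX_LOCATION_LEN = 100  # the location must be shorter than this


def firmware_url(location, vendor):
    """Build the URL passed to 'system battery flash' for one vendor."""
    if not location.endswith("/"):
        location += "/"
    if len(location) >= MAX_LOCATION_LEN:
        raise ValueError(
            "location is %d characters; it must be under %d"
            % (len(location), MAX_LOCATION_LEN)
        )
    return "http://" + location + FIRMWARE_FILES[vendor.lower()]


# Hypothetical Web server address, for illustration only.
url = firmware_url("192.168.30.10/fw", "EnergySales")
```

The vendor argument is matched without regard to case, so the manufacturer string reported by the SP can be passed in directly.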

How to enable the SP port (if not currently set up)

Perform the following steps:

1. Connect a network cable to the N6200 system SP port.

2. Set up the SP with a static IP address:
  • LOAD-A> sp setup
    The Service Processor (SP) provides remote management capabilities including console redirection, logging and power control.
    It also extends autosupport by sending  additional system event alerts. Your autosupport settings are used for sending these alerts via email over the SP LAN interface.
    Would you like to configure the SP? yes
    Would you like to enable DHCP on the SP LAN interface? no
    Please enter the IP address for the SP [ ]: 192.168.30.243
    Please enter the netmask for the SP [ ]:255.255.254.0
    Please enter the IP address for the SP gateway []:192.168.30.1
    Do you want to enable IPv6 on the SP ? no
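
Before entering the static values, it can help to confirm that the gateway lies inside the SP subnet. The following Python sketch is illustrative only and uses the standard ipaddress module with the example values from the setup dialog above. The function name is hypothetical.

```python
import ipaddress


def gateway_in_sp_subnet(sp_ip, netmask, gateway):
    """Return True if the gateway address belongs to the SP's subnet."""
    interface = ipaddress.ip_interface("%s/%s" % (sp_ip, netmask))
    return ipaddress.ip_address(gateway) in interface.network


# Values from the example 'sp setup' dialog.
ok = gateway_in_sp_subnet("192.168.30.243", "255.255.254.0", "192.168.30.1")
```

The same check can be repeated with the Web server address to see whether the firmware files are on the SP's own subnet or must be reached through the gateway.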

How to flash the battery Firmware

1. Use the following command to halt the system.
  • ontap> halt

    1a. Alternatively, a 'cf takeover' can be executed from the partner node in a clustered configuration.
    • Before running the takeover, check that autogiveback is not enabled (that is, not ON):

      ontap> options cf.giveback.auto.enable

      If enabled:
      ontap> options cf.giveback.auto.enable off

      Then, from the partner controller, issue the following command:
      partner_ontap> cf takeover

    1b. If the AUTOBOOT environment variable is set to TRUE, the local node boots to the 'Waiting for giveback...(Press Ctrl-C to abort wait)' prompt.
    • Press control+c to abort waiting and halt the node. Enter 'y' when the system prompts:
      Do you wish to halt this node rather than wait [y/n]?

2. Log in to the SP CLI by using control+G:
  • LOADER> CTRL+G

3. Log in as naroot. Enter the password if required.
  • Switching console to Service Processor
    Service Processor Login: naroot

4. Enable diagnostic privileges:
  • SP>priv set diag

5. Display the battery information, confirm the battery vendor (EnergySales or Nexergy), and check the rev_hardware value. If rev_hardware is 0x00d3 or higher, the battery firmware already has the FCMTO-related change. In this case, do not upgrade the battery firmware, because the battery is already updated (if you attempt the update, you are prompted with 'are you sure you want to update?').
  • SP*>system battery show
        
    chemistry         : LION
    device-name       : bq20z80
    expected-load-mw  : 90
    id                : 27100027
    manufacturer      : EnergySales
    manufacturer-date : 4/4/2010
    rev_cell          : 0x0000
    rev_firmware      : 0x0100
    rev_hardware      : 0x00d1
    TI_fw_version     : 0x0102
    TI_hw_version     : 0xa2
    serial            : 0x0001
    status            : ready
      

6. Flash the battery firmware on to the battery. Use the correct battery firmware file: ‘battery-27100027-nexergy.i’ or ‘battery-27100027-energysales.i’ depending on the manufacturer identified in step 5. A confirmation 'Battery firmware update completed.' will be displayed on successful completion of the operation.
  • For Energy Sales Batteries:
    SP*> system battery flash http://<Battery_Firmware_File_Location>battery-27100027-energysales.i

    -OR-

    For Nexergy Batteries:
    SP*> system battery flash http://<Battery_Firmware_File_Location>battery-27100027-nexergy.i   
       
    Downloading battery firmware image ... Successful
    Accessing battery ...
    Flashing 27100027-EnergySales battery FW revision from 'd1' to 'd3' ... done
    Disabled battery auto update after manual firmware flash
    Battery firmware update completed.

7. Perform a battery verify. A 'Verify test passed' message will be displayed on successful completion of the operation.

  • For Energy Sales Batteries:   
    SP*> system battery verify http://<Battery_Firmware_File_Location>battery-27100027-energysales.i

    -OR-

    For Nexergy Batteries:   
    SP*> system battery verify http://<Battery_Firmware_File_Location>battery-27100027-nexergy.i   

    Downloading battery firmware image ... Successful
    Verify test passed

8. Verify the rev_hardware has changed to 0x00d3.
  • SP*> system battery show
    chemistry         : LION
    device-name       : bq20z80
    expected-load-mw  : 90
    id                : 27100027
    manufacturer      : EnergySales
    manufacturer-date : 4/4/2010
    rev_cell          : 0x0000
    rev_firmware      : 0x0100
    rev_hardware      : 0x00d3
    TI_fw_version     : 0x0102
    TI_hw_version     : 0xa2
    serial            : 0x0001
    status            : ready

9. Press CTRL-D to exit the SP shell (if SSH was used to access the SP, connect to the system console port).
  • SP*> CTRL-D

10. Boot ONTAP:
  • LOADER> boot_ontap

    10a. On a cluster system, when the node boots to the 'Waiting for giveback...(Press Ctrl-C to abort wait)' prompt, perform a giveback from the partner node:
    Partner_ontap> cf giveback
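
The checks in steps 5, 6, and 8 can be expressed in a few lines. The Python sketch below is illustrative only. It parses captured 'system battery show' output, decides whether rev_hardware is below 0x00d3, and selects the firmware file for the reported manufacturer. The function names are hypothetical and the output format is assumed to match the examples in this procedure.

```python
# File names as given in step 6, keyed by lowercase manufacturer name.
FIRMWARE_FILES = {
    "energysales": "battery-27100027-energysales.i",
    "nexergy": "battery-27100027-nexergy.i",
}
FIXED_REV = 0x00D3  # rev_hardware at or above this already has the fix


def parse_battery_show(text):
    """Turn 'key : value' lines into a dict; other lines are ignored."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def needs_update(info):
    """True if the battery firmware predates the FCMTO-related change."""
    return int(info["rev_hardware"], 16) < FIXED_REV


def firmware_file(info):
    """Pick the firmware file that matches the battery manufacturer."""
    return FIRMWARE_FILES[info["manufacturer"].lower()]


sample = """\
id                : 27100027
manufacturer      : EnergySales
manufacturer-date : 4/4/2010
rev_hardware      : 0x00d1
status            : ready
"""
info = parse_battery_show(sample)
```

Running needs_update on the output captured in step 8 should return False once the flash has succeeded.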

Resetting the FCMTO flag

1. Enter the SP shell:
  • ONTAP> CTRL-G

2. From the system console port, log in as naroot and display the sensor information:
  • Login: naroot
    Enter the password if required.

    SP*> system sensors

    Sensor Name    | Current | Unit        | Status | LCR     | LNC      | UNC     | UCR
    ---------------+---------+-------------+--------+---------+----------+---------+---------
    Bat_1.8V       | 1.819   | Volts       | ok     | 1.613   | 1.625    | 1.974   | 2.000
    Bat_8.0V       | 8.100   | Volts       | ok     | na      | na       | 8.900   | 9.000
    Bat_Curr       | 0.000   | Amps        | ok     | na      | na       | 2.048   | 2.304
    Bat_Capacity   | 3.840   | Amps * hour | ok     | na      | na       | na      | na
    Bat_Run_Time   | 80.00   | hour        | crit   | 72.000  | 80.000   | na      | na
    Bat_Temp       | 21.000  | degrees C   | ok     | -40.000 | 1.000    | 70.000  | 75.000
    Charger_Curr   | 0.000   | Amps        | ok     | na      | na       | 2.200   | 2.300
    Charger_Cycles | 0.000   | cycles      | ok     | na      | na       | 250.000 | 251.000
    Charger_Volt   | 0.000   | Volts       | ok     | na      | na       | 8.900   | 9.000


    2a. Optional method to access the SP shell: Use SSH to the SP interface to get to the SP CLI for issuing commands.

3. Check if “Bat_Run_Time” is below 80-hr Warning or below 72-hr Critical.

4. Confirm that “Bat_Curr” = 0 mA and “Charger_Volt” = 0 mV.
  • For systems below 72-hr Critical Low threshold – LOW_BATT, take immediate action. The shutdown sequence has begun.
    For systems below 80-hr but > 72-hr “run time,” there is some time prior to initiating the shutdown sequence.  Under normal conditions, a typical battery can take up to 14 days to decrease from 80 to 72-hr “run time.”

5. Press CTRL-D to exit the SP shell (if SSH was used to access the SP, connect to the system console port).
  • ***Proceed with the following steps during a scheduled maintenance window***

6. Use the following command to halt the system.
  • ontap> halt
  • 6a. Alternatively, a 'cf takeover' can be executed from the partner node in a clustered configuration.
    Before running the takeover, check that autogiveback is not enabled (that is, not ON):

    ontap> options cf.giveback.auto.enable
    If enabled:
    ontap> options cf.giveback.auto.enable off

    Then, from the partner controller, run the following command:
    partner_ontap> cf takeover

    6b. If the AUTOBOOT environment variable was set to TRUE, the local node boots to  the ' Waiting for giveback...(Press Ctrl-C to abort wait) ' prompt.
    Press control+c to abort waiting and halt the node. Enter 'y' when the system prompts 'Do you wish to halt this node rather than wait [y/n]?'.

7. Enter Maintenance Mode. Boot Data ONTAP and, when the CTL-C message appears, press <CTL-C> to interrupt the boot process:
  • LOADER> boot_ontap
    Select '5' as the Maintenance Mode boot method.

8. Clear the “no charger” condition by running the following command:
  • *> halt


9. Recover the system by executing the following command:
  • LOADER> boot_ontap

    For systems with less than 72 hr “runtime,” the system will not boot Data ONTAP until the battery has charged to a minimum of 72 hr “runtime.” This usually takes a few minutes. Depending on the customer environment, CTRL+G can be used to bypass the wait period.
    Note: If the wait period is bypassed the warning messages will continue until the battery is sufficiently charged beyond the 72 hour threshold.

    9a. If a takeover was performed per step 6a, the controller will boot to 'waiting for giveback'. From the partner controller issue the following command.
    Partner_ontap> cf giveback

10. Enter the SP shell, as indicated in step 2.

11. Confirm the battery status by running the system sensors command, as indicated in step 2.

12. Verify that “Charger_Volt” is 8200 mV (8.200 V) and that “Bat_Curr” is greater than 0 mA (if the battery needs to be charged). If the battery is fully charged, “Bat_Curr” will be 0 mA and “Charger_Volt” will be 8200 mV. Recheck “Bat_Run_Time”; this value should be increasing. For systems which did not reach the 72 hr “runtime” threshold, the battery should recover and the warning messages will stop.

13. Press CTRL-D to exit the SP shell (if SSH was used to access the SP, connect to the system console port).
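
The verification in steps 11 and 12 can be scripted by comparing 'system sensors' output captured before and after the reset. The Python sketch below is illustrative only. It assumes the pipe-separated layout shown in step 2, and the 0.1 V tolerance around 8.2 V is an assumption of this example, not a documented limit. The function names are hypothetical.

```python
def parse_sensors(text):
    """Map sensor name -> current reading from 'system sensors' output."""
    readings = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) < 2:
            continue  # separator rows and blank lines
        try:
            readings[parts[0]] = float(parts[1])
        except ValueError:
            continue  # header row
    return readings


def recovery_status(before, after, nominal_volts=8.2, tolerance=0.1):
    """Compare readings taken before and after the FCMTO flag reset."""
    return {
        # The charger should report about 8.2 V once it is active again.
        "charger_on": abs(after["Charger_Volt"] - nominal_volts) <= tolerance,
        # Current flows only while the battery still needs charging.
        "charging": after["Bat_Curr"] > 0,
        "run_time_rising": after["Bat_Run_Time"] > before["Bat_Run_Time"],
    }
```

A fully charged battery reports charger_on as true and charging as false, which matches the behaviour described in step 12.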


Document Information

Modified date:
10 January 2020

UID

ssg1S1004006