Failures and troubleshooting

Deployment options: Netezza Performance Server for Cloud Pak for Data System

Learn how to proceed if a failure occurs during the expansion process.

If nzredrexpand fails or is interrupted, you can restart it with the --resume argument. The tool attempts to handle many common failure scenarios automatically, but manual intervention might be necessary after a critical failure that puts the system in the Down state. After you resolve the issue, the recovery action depends on the state of Netezza Performance Server and on the stage of the expansion.

Expansion fails while a host backup is being taken
WORKAROUND:
  1. Make sure that there is enough free space at the location where the tool writes the host backup.
  2. When the issue is resolved, run the nzredrexpand --resume command.
    nzredrexpand --resume
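
The free-space check in step 1 can be scripted before you resume. The following is a minimal sketch, not part of the product: has_free_space is a hypothetical helper, and both the backup path and the size threshold are placeholders, because the backup location and the space the backup needs depend on your system.

```shell
#!/usr/bin/env bash
# has_free_space DIR MIN_KB
# Succeeds when the file system that holds DIR has at least MIN_KB
# kilobytes available. DIR and MIN_KB are placeholders that you supply.
has_free_space() {
    local dir=$1 min_kb=$2 avail_kb
    # df -P prints one POSIX-format line per file system;
    # with -k, field 4 is the available space in kilobytes.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

# Example (hypothetical path and threshold):
#   has_free_space /path/to/backup 10485760 && nzredrexpand --resume
```

The helper only reports whether the threshold is met. It does not free any space, so clearing or extending the backup location remains a manual step.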
Expansion fails while a configuration is being generated
If expansion fails while the configuration is being generated, the most likely cause is that nodes are not configured correctly on Cloud Pak for Data System.

WORKAROUND:

  1. Make sure that nodes are configured correctly on Cloud Pak for Data System.
  2. When this issue is resolved, run the nzredrexpand --resume command.
    nzredrexpand --resume
  3. Start Netezza Performance Server if it is stopped.
    nzstart
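
Step 3 applies only when the system is stopped, so a script can check the state first. The following sketch assumes that nzstate prints a line of the form System state is 'Online'. and that a stopped system reports the state name Stopped. Verify both assumptions against your release. The helper names parse_nz_state and needs_start are hypothetical.

```shell
#!/usr/bin/env bash
# parse_nz_state: read nzstate output on stdin and print only the state name.
# Assumes a line of the form: System state is 'Online'.
parse_nz_state() {
    sed -n "s/^System state is '\([^']*\)'.*/\1/p"
}

# needs_start STATE: succeed when the reported state is Stopped (assumed name).
needs_start() {
    [ "$1" = "Stopped" ]
}

# Example:
#   nzredrexpand --resume
#   if needs_start "$(nzstate | parse_nz_state)"; then nzstart; fi
```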
Expansion fails while the database is being locked
If expansion fails while the database is being locked, the failure might be caused by issues with the engine.

WORKAROUND:

  1. Resolve any engine-related issues.
  2. When the issues are resolved, run the nzredrexpand --resume command.
    nzredrexpand --resume
Expansion fails during nzstart -expand, before the system reaches the Pre-Online state
If expansion fails during nzstart -expand, before the system reaches the Pre-Online state, the failure might be caused by problems with the hardware.

WORKAROUND:

  1. Restore the configuration.
  2. Run the nzhostrestore command, or restore from the backup that was taken by the tool.
  3. Resolve any hardware-related issues.
  4. When the issues are resolved, run the nzredrexpand --resume command.
    nzredrexpand --resume
Expansion fails during nzstart -expand after the system reaches the Pre-Online state, but before the Online state
If expansion fails during nzstart -expand after the system reaches the Pre-Online state, but before the Online state, recovery requires developer involvement. The nzhostrestore command does not work at this point because the disk labels are updated during the Pre-Online state, and a host restore depends on those labels.
Redistribution is stopped
If the system does not go Down, the table redistribution phase continues automatically through many types of failures.
If the system goes Down, redistribution stops.

WORKAROUND:

  1. Identify and resolve any hardware-related issues.
  2. Bring the system back Online.
    nzstart
  3. Resume the expansion process by running the nzredrexpand --resume command.
    nzredrexpand --resume
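
If you script steps 2 and 3, it can help to confirm that the system is Online before you resume. The following sketch polls a check command a bounded number of times. The helpers retry_until and system_is_online are hypothetical, and the check assumes that nzstate prints the state name in single quotation marks, as in System state is 'Online'. Matching the quotation marks keeps the check from also matching Pre-Online.

```shell
#!/usr/bin/env bash
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times,
# sleeping DELAY_SECONDS between tries.
retry_until() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Hypothetical check: succeeds when nzstate reports the Online state.
# Assumes output of the form: System state is 'Online'.
system_is_online() {
    nzstate | grep -q "'Online'"
}

# Example (the attempt count and delay are arbitrary):
#   nzstart && retry_until 30 10 system_is_online && nzredrexpand --resume
```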
Node e2n1 is not known to Magneto, or has no BMC
When you are running the nzredrexpand command, you might see the following error message.
Exception: Node e2n1 is not known to Magneto, or has no BMC

WORKAROUND:

  1. Power cycle the e2 enclosure nodes.
    for node in e2n{1..4}bmc; do echo $node; ipmitool -I lanplus -H $node -U USERID -P PASSW0RD power status ; done
  2. Wait for 10 minutes.
  3. Resume the expansion process by running the nzredrexpand --resume command.
    nzredrexpand --resume
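
The loop in step 1 queries the power status of each node. In ipmitool, status only reports the power state, and cycle is the action that power cycles a node, so confirm which action the procedure intends before you run it. The following sketch wraps the same loop in a hypothetical bmc_power helper that takes the action as an argument and can print the commands without running them. The BMC_USER, BMC_PASS, and DRY_RUN variables are assumptions of this sketch. The defaults are the placeholder credentials from the step above.

```shell
#!/usr/bin/env bash
# bmc_power ACTION
# Run "ipmitool ... power ACTION" against the BMC of each e2 enclosure node.
# With DRY_RUN=1, the commands are printed instead of executed.
bmc_power() {
    local action=$1 n node
    for n in 1 2 3 4; do
        node="e2n${n}bmc"
        # Rebuild the positional parameters as the command line to run.
        set -- ipmitool -I lanplus -H "$node" \
            -U "${BMC_USER:-USERID}" -P "${BMC_PASS:-PASSW0RD}" \
            power "$action"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$*"
        else
            echo "$node"
            "$@"
        fi
    done
}

# Example: preview the commands first, then query the status.
#   DRY_RUN=1 bmc_power status
#   bmc_power status
```

Running with DRY_RUN=1 first shows exactly which nodes and which action a real run would use.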