Failures and troubleshooting

Deployment options: Netezza Performance Server for Cloud Pak for Data System

Learn how to proceed if a failure occurs during the expansion process.

If nzredrexpand fails or is interrupted, you can restart it with the --resume argument. Although the tool attempts to handle many common failure scenarios automatically, manual intervention might be necessary after critical failures that cause the system to go to the Down state. After you resolve the issues, the recovery action depends on the state of Netezza Performance Server and the stage of expansion.

Expansion fails while a host backup is being taken

WORKAROUND:

Make sure that there is enough free space at the location where the tool is trying to write the backup.
When the issue is resolved, run the nzredrexpand --resume command.
```
nzredrexpand --resume
```
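
The free-space check can be scripted before you resume. The following is a minimal sketch, not part of the tool: the function name has_free_space, the directory, and the threshold are all hypothetical, because the backup location and the required size are not stated here and must be taken from the tool's own output.

```shell
#!/bin/sh
# Sketch: succeed only if the file system that holds DIR has at least
# MIN_KB kilobytes available. DIR and MIN_KB are placeholders.
has_free_space() {
    dir=$1
    min_kb=$2
    # df -P prints POSIX-format output; on line 2, column 4 is the
    # available space, reported in KB because of -k.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

# Example (hypothetical path and threshold of 10 GB):
# has_free_space /path/to/backup/location 10485760 && nzredrexpand --resume
```

The function only reports whether the threshold is met; freeing space remains a manual step.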

Expansion fails while a configuration is being generated

If expansion fails while configuration is being generated, it is most likely caused by nodes not being configured correctly on Cloud Pak for Data System.

WORKAROUND:

Make sure that nodes are configured correctly on Cloud Pak for Data System.
When this issue is resolved, run the nzredrexpand --resume command.
```
nzredrexpand --resume
```
Start Netezza Performance Server if it is stopped.
```
nzstart
```

Expansion fails while the database is being locked

If expansion fails while the database is being locked, the failure might be caused by issues with the engine.

WORKAROUND:

Resolve any engine-related issues.
When the issues are resolved, run the nzredrexpand --resume command.
```
nzredrexpand --resume
```

Expansion fails during nzstart -expand, before the system reaches the Pre-Online state

If expansion fails during nzstart -expand, before the system reaches the Pre-Online state, the failure might be caused by problems with the hardware.

WORKAROUND:

Restore the configuration.
To do so, run the nzhostrestore command or use the backup that was taken by the tool.
Resolve any hardware-related issues.
When the issues are resolved, run the nzredrexpand --resume command.
```
nzredrexpand --resume
```

Expansion fails during nzstart -expand after the system reaches the Pre-Online state, but before the Online state

If the expansion fails during nzstart -expand after the system reaches the Pre-Online state but before it reaches the Online state, recovery requires developer involvement. The nzhostrestore command does not work at this stage because the disk labels are updated during the Pre-Online state, and host restore depends on those labels.

Redistribution is stopped

If the system does not go Down, the table redistribution phase continues automatically despite various types of failures.

If the system goes Down, redistribution is stopped.

WORKAROUND:

Identify and resolve any hardware-related issues.
Bring the system back Online.
```
nzstart
```
Resume the expansion process by running the nzredrexpand --resume command.
```
nzredrexpand --resume
```
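
If you script these steps, it can help to confirm that the system is actually Online before you resume. The following sketch polls nzstate for that purpose. It assumes that nzstate prints a line of the form System state is 'Online'. The helper names parse_state and wait_for_online, and the retry counts, are illustrative only.

```shell
#!/bin/sh
# Sketch: poll the system state until it is Online.
# Assumption: nzstate prints a line such as: System state is 'Online'.

# Extract the state name from between the single quotes.
parse_state() {
    printf '%s\n' "$1" | sed -n "s/.*'\([^']*\)'.*/\1/p"
}

# Poll up to MAX_TRIES times, sleeping INTERVAL seconds between attempts.
# Returns 0 as soon as the state is Online, 1 if it never gets there.
wait_for_online() {
    max_tries=${1:-30}
    interval=${2:-10}
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        state=$(parse_state "$(nzstate 2>/dev/null)")
        if [ "$state" = "Online" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# Example: wait_for_online 30 10 && nzredrexpand --resume
```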

Node e2n1 is not known to Magneto, or has no BMC

When you are running the nzredrexpand command, you might see the following error message.

Exception: Node e2n1 is not known to Magneto, or has no BMC

WORKAROUND:

Power cycle the e2 enclosure nodes. The following loop queries the power status of each node through its BMC.

for node in e2n{1..4}bmc; do echo $node; ipmitool -I lanplus -H $node -U USERID -P PASSW0RD power status ; done

Wait for 10 minutes.
Resume the expansion process by running the nzredrexpand --resume command.
```
nzredrexpand --resume
```
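
The ipmitool loop in this workaround reports power status only. As a sketch, the same query can be wrapped in a check that every e2 node is powered on before you resume. The function names bmc_power_status and all_e2_nodes_on are hypothetical. The host names and credentials are copied from the loop above, and the check relies on ipmitool printing "Chassis Power is on" for a powered-on node.

```shell
#!/bin/sh
# Sketch: confirm that every e2 enclosure node reports power on via its BMC.

# Query one BMC; ipmitool prints "Chassis Power is on" or
# "Chassis Power is off".
bmc_power_status() {
    ipmitool -I lanplus -H "$1" -U USERID -P PASSW0RD power status
}

# Return 0 only if all four nodes report that chassis power is on.
all_e2_nodes_on() {
    for n in 1 2 3 4; do
        node="e2n${n}bmc"
        status=$(bmc_power_status "$node" 2>/dev/null)
        echo "$node: ${status:-no response}"
        case "$status" in
            *"Chassis Power is on"*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Example: all_e2_nodes_on && nzredrexpand --resume
```

The check stops at the first node that is off or does not respond, so the printed lines show how far it got.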