Operations

Events, incidents, and logs

New Audit Messages

When device replacement with data evacuation starts, the following audit message displays:

Device: slicestor-mvm14.cleversafelab.com
The 'slicestor' 'slicestor-mvm14.cleversafelab.com' was modified. The data 
evacuation was '' and is now 'evacuating'. The data evacuation rate limit was 
'' and is now 'enabled'.
The data evacuation rate limit was '' and is now set to '777 MB/s'.

Account:     Admin
Request IP:  10.10.13.10
Source:      UI
Action code: Edit Storage Entry

When a data evacuation rate limit is set, the following audit message displays:

Storage Pool: vmpool
The storage pool 'vmpool' was modified. The following Slicestor device(s) 
were added to the storage pool: slicestor-mvm19.cleversafelab.com. The 
following Slicestor device(s) were removed from the storage pool: 
slicestor-mvm14.cleversafelab.com.

Account:     Admin
Request IP:  10.10.13.10
Source:      UI
Action code: Edit Storage Entry

When data evacuation is paused, the following audit message displays:

Device: slicestor-mvm10.cleversafelab.com
The slicestor 'slicestor-mvm10.cleversafelab.com' was modified. 
The data evacuation was 'evacuating' and is now 'paused'.

Account:     Admin
Request IP:  10.10.13.10
Source:      Rest API
Action code: Device Data Evacuation

When data evacuation is resumed, the following audit message displays:

Device: slicestor-mvm10.cleversafelab.com
The slicestor 'slicestor-mvm10.cleversafelab.com' was modified. 
The data evacuation was 'paused' and is now 'evacuating'.

Account:     Admin
Request IP:  10.10.13.10
Source:      Rest API
Action code: Device Data Evacuation
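
The audit messages above share a common phrasing for state changes ("was '...' and is now '...'"). As a minimal sketch, assuming audit text has already been exported as plain strings in exactly this form, the transitions could be extracted as follows. The function name `parse_transitions` and the regular expression are illustrative assumptions, not part of the product.

```python
import re

# Assumption: values are wrapped in single quotes, as in the sample messages.
# The optional "set to" covers lines such as:
#   The data evacuation rate limit was '' and is now set to '777 MB/s'.
TRANSITION = re.compile(
    r"The (?P<field>[\w /]+?) was '(?P<old>[^']*)' "
    r"and is now (?:set to )?'(?P<new>[^']*)'"
)

def parse_transitions(message: str) -> list[tuple[str, str, str]]:
    """Return (field, old_value, new_value) for each state change in a message."""
    text = " ".join(message.split())  # undo the line wrapping of the message body
    return [(m["field"], m["old"], m["new"]) for m in TRANSITION.finditer(text)]

sample = """The slicestor 'slicestor-mvm10.cleversafelab.com' was modified.
The data evacuation was 'evacuating' and is now 'paused'."""

print(parse_transitions(sample))
# [('data evacuation', 'evacuating', 'paused')]
```

A script built on this could, for example, flag any device whose most recent transition is to 'paused'.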

New Events

When data evacuation completes, the following event message displays with an Info icon:

Device: slicestor-mvm1.cleversafelab.com
Data evacuation has completed. 15.29 GB out of 15.29 GB evacuated.

When an error occurs during data evacuation, the following event message displays with an Error icon and an open incident is created:

Device: slicestor-mvm10.cleversafelab.com
Data evacuation is not making any progress due to I/O errors. Please make sure both 
the source and destination devices are healthy and accessible.

When an error during data evacuation is resolved, the following event message displays with a check mark icon and the open incident is cleared:

Device: slicestor-mvm10.cleversafelab.com
Data evacuation is now proceeding as expected.
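
The completion event reports the evacuated amount as "X out of Y". A minimal sketch of turning that text into a percentage is shown below. It assumes the event text is available as a plain string and that both amounts use the same unit. Only GB appears in the sample above, so the other unit suffixes accepted here are an assumption, and `evacuation_progress` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumption: both amounts carry the same unit suffix (enforced by the backreference).
PROGRESS = re.compile(
    r"(?P<done>\d+(?:\.\d+)?) (?P<unit>[KMGTP]?B) out of "
    r"(?P<total>\d+(?:\.\d+)?) (?P=unit) evacuated"
)

def evacuation_progress(event: str) -> Optional[float]:
    """Return the evacuated share in percent, or None if the event has no progress text."""
    m = PROGRESS.search(event)
    if m is None:
        return None
    total = float(m["total"])
    if total == 0:
        return None  # nothing to evacuate, so a percentage is undefined
    return float(m["done"]) / total * 100.0

print(evacuation_progress(
    "Data evacuation has completed. 15.29 GB out of 15.29 GB evacuated."
))
# 100.0
```

Events without progress text, such as the error and recovery messages, return None.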

Troubleshooting

Troubleshoot Data Evacuation Incidents

Table 1. Request Parameters for Replace Storage Pool Device (replaceStoragePoolDevice) API method
Incident: Error occurred during data evacuation
Action: This error occurs when the data evacuation progress tracker cannot be persisted to the OS disk of the source device. If the problem persists or causes evacuation to halt, replace the OS drive on the source device to continue data evacuation. See the instructions for replacing the OS drive on the IBM Cloud Object Storage Slicestor® 2210, 2212, 1440, 2440, and 4100 devices.

Incident: Data evacuation is reporting too many I/O errors. Please make sure both the source and destination devices are healthy and accessible.
Action: This incident may be caused by I/O errors on data disks on either the source or the destination device.
  • Check whether the source and destination devices are accessible over the network.
  • Check whether any disks are in a Quarantined state on either device.
    • Try resuming the disks first. If they are quarantined again, they may need to be failed. See the Quarantined section of the Manager Admin Manual for how to resume or fail a quarantined drive.
    • If a quarantined disk is failed on the source device, the data slices on that disk may never be evacuated. These data slices will have to be rebuilt later.
    • If a quarantined disk is failed on the destination device, the data slices that have already been evacuated to it may be lost. These data slices will have to be rebuilt later.
If no action is taken, data evacuation continues to make progress until errors prevent any further progress. At that point, the next incident appears. If the problem persists, contact an IBM Customer Success Engineer.

Incident: Data evacuation is not making any progress due to I/O errors. Please make sure both the source and destination devices are healthy and accessible.
Action: Follow all the steps for the previous incident. If this incident appears without any quarantined disks on either the source or the destination device, the cause may be an error ratio that is too low to quarantine the disks. In this case, data evacuation continues until no further progress can be made, which may cause the evacuation to halt. At that point, the operator may terminate the data evacuation and let Rebuilder rebuild the remaining data slices.
Attention: Terminating the data evacuation results in a loss of any data slices that have not been evacuated. If an evacuation is terminated, please wait for the lost data slices to be rebuilt before attempting another data evacuation.
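
The checks for the two I/O error incidents can be summarized as a small decision helper. The sketch below only restates the guidance from the table as code. The function `triage_io_errors` and its inputs are hypothetical, and gathering the inputs (device reachability, number of quarantined disks) is assumed to happen elsewhere, for example by an operator checking the Manager.

```python
def triage_io_errors(
    source_reachable: bool,
    destination_reachable: bool,
    quarantined_on_source: int,
    quarantined_on_destination: int,
) -> list[str]:
    """Return suggested next steps for a data evacuation I/O error incident."""
    steps = []
    if not (source_reachable and destination_reachable):
        steps.append("Restore network access to the source and destination devices.")
    if quarantined_on_source or quarantined_on_destination:
        steps.append("Resume quarantined disks; fail a disk if it is quarantined again.")
    if quarantined_on_source:
        steps.append("If a source disk is failed, its slices may never be evacuated "
                     "and must be rebuilt later.")
    if quarantined_on_destination:
        steps.append("If a destination disk is failed, slices already evacuated to it "
                     "may be lost and must be rebuilt later.")
    if not steps:
        # No visible fault: the table attributes this to a low error ratio.
        steps.append("If evacuation halts, consider terminating it and letting "
                     "Rebuilder rebuild the remaining slices.")
    return steps

for step in triage_io_errors(True, True, 1, 0):
    print("-", step)
```

Terminating an evacuation remains a last resort because of the data slice loss described in the Attention note.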