Rebuild agents

Individual Rebuild Agents address scenarios where Slices are missing from their respective Slicestor nodes or are corrupted on a drive.

Each Slicestor appliance in a given Storage Pool ensures data integrity across all of the Slicestor appliances that reside in that Storage Pool. Even if a Slicestor appliance goes offline and misses newly written Slices, the other Slicestor appliances notice that the Slices were missed and rebuild them when the appliance comes back online. These Slicestor appliances read the content of each missed Slice from the Slicestor appliances that are not missing Slices and recreate the missing Slice on the recovered Slicestor appliance.
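
The detection step described above can be illustrated with a minimal Python sketch. It is not the product's actual protocol, which is not described here. It assumes, purely for illustration, that each appliance can report the set of object names for which it holds a Slice; the names `find_missing_slices`, `healthy_peers`, and the `listings` structure are hypothetical.

```python
from typing import Dict, List, Set


def find_missing_slices(listings: Dict[str, Set[str]], recovered: str) -> List[str]:
    """Return names that peers hold a Slice for but the recovered appliance does not.

    `listings` maps an appliance name to the set of object names it holds
    a Slice for (a hypothetical, simplified view of a Storage Pool).
    """
    seen_by_peers: Set[str] = set()
    for node, names in listings.items():
        if node != recovered:
            seen_by_peers |= names
    return sorted(seen_by_peers - listings[recovered])


def healthy_peers(listings: Dict[str, Set[str]], recovered: str, name: str) -> List[str]:
    """Return the peers that can serve as read sources when rebuilding `name`."""
    return sorted(
        node
        for node, names in listings.items()
        if node != recovered and name in names
    )


if __name__ == "__main__":
    # Appliance "s3" was offline while objects "b" and "c" were written.
    pool = {"s1": {"a", "b", "c"}, "s2": {"a", "b", "c"}, "s3": {"a"}}
    for missing in find_missing_slices(pool, "s3"):
        print(missing, "can be rebuilt from", healthy_peers(pool, "s3", missing))
```

In this simplified model, the recovered appliance's listing is compared against the union of its peers' listings, and every peer that holds the affected name is a candidate source for the rebuild.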

Rebuild Agents minimize their impact on normal I/O operations to maintain expected system performance levels, even during heavy rebuild activity. Rebuild activity is scheduled intelligently: when I/O utilization drops, rebuilding becomes more aggressive, which enhances reliability without affecting performance.

Rebuild Agents operate continuously on all Slicestor appliances. To ensure a fair balance between rebuild and normal client I/O operations, adaptive algorithms within the Rebuilder continuously sample their impact on performance and reduce the rebuilding rate when they detect an adverse impact on client performance.
Note: In general, the defined parameters together with the adaptive strategy provide desirable results and do not require tuning. If you need a specific rebuilding optimization, discuss it with your IBM Customer Success Engineer.


Example: Rebuilder Adaptive System Behavior

The Rebuilder’s adaptive behavior changes in response to client I/O operations (indicated
in green). When client operations increase, the Rebuilder decreases its activity. When
client operations decrease, the Rebuilder increases its activity as much as possible
without negatively impacting client operations. The Rebuilder samples performance
many times a second to respond to changes in client operations.
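
The sample-and-adjust loop described above can be illustrated with a minimal Python sketch. The Rebuilder's actual algorithm and parameters are not documented here; this sketch swaps in a generic additive-increase, multiplicative-decrease scheme, and the class name `AdaptiveRebuildThrottle`, its parameters, and the boolean `client_impacted` signal are all assumptions made for illustration.

```python
class AdaptiveRebuildThrottle:
    """Hypothetical throttle: back off quickly on client impact, ramp up slowly otherwise."""

    def __init__(self, min_rate: float = 1.0, max_rate: float = 100.0,
                 increase: float = 5.0, backoff: float = 0.5) -> None:
        self.min_rate = min_rate    # rebuilding never stops entirely
        self.max_rate = max_rate    # ceiling used when clients are idle
        self.increase = increase    # additive step when no impact is seen
        self.backoff = backoff      # multiplicative cut when impact is seen
        self.rate = min_rate

    def sample(self, client_impacted: bool) -> float:
        """Process one performance sample and return the new rebuild rate."""
        if client_impacted:
            # Client I/O is suffering: reduce the rebuild rate rapidly.
            self.rate = max(self.min_rate, self.rate * self.backoff)
        else:
            # No adverse impact detected: rebuild more aggressively.
            self.rate = min(self.max_rate, self.rate + self.increase)
        return self.rate


if __name__ == "__main__":
    throttle = AdaptiveRebuildThrottle()
    # Four quiet samples, then a burst of client I/O, then quiet again.
    for impacted in [False, False, False, False, True, True, False]:
        print(impacted, throttle.sample(impacted))
```

Calling `sample` many times a second with a fresh impact measurement gives the qualitative behavior described: the rate falls sharply when client I/O rises and recovers gradually when it subsides.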

In the figure below, when client I/O increases suddenly, the rebuilding rate decreases
rapidly so that client activity can reach its maximum level of I/O.
Figure 1. An illustration of how the rebuilding rate (blue) adapts to changes in the client rate (green).