Best practices for data redistribution
Data redistribution can be optimally performed when best practices for data redistribution are followed.
- Ensure that all documented data redistribution prerequisites have
been met.
By default, redistribution operations that are not roll-forward recoverable perform a precheck and proceed only if the verification completes successfully. To verify the prerequisites without launching the redistribution operation, specify the PRECHECK ONLY command parameter.
- Gather information and metrics about your database
environment.
If performance changes after the redistribution, you can use the information and metrics to identify the reason for the change.
- Back up the database before you perform the data
redistribution.
This is especially important if the redistribution operation is not roll-forward recoverable; if a catastrophic failure occurs during the redistribution and the database is lost or a table is corrupted, you can restore the database from this backup.
- Perform the redistribution during a planned outage,
if possible. The instance does not need to be stopped; quiescing the
database is sufficient.By default, redistribution operations that are not roll-forward recoverable force all users off the database and put the database into a quiesced mode. You must still ensure that any applications or users that are allowed to access the quiesced database are not making updates, for the following reasons:
- If a disaster occurs, recovering data changes that occurred during the redistribution is complex and, in some cases, not possible.
- Redistribution typically uses a lot of resources on the servers. Parallel querying of data might cause both the redistribution and the queries to slow down significantly. Redistributing the data online can also cause lock timeouts or deadlocks.
- Perform a redistribution operation that is not
roll-forward recoverable.
Data is moved in bulk instead of by internal insert and delete operations. This reduces the number of times that a table must be scanned and accessed, which results in better performance.
Log records are not required for each of the insert and delete operations. This means that you do not need to manage large amounts of active log space and log archiving space in your system when performing data redistribution.
- A uniform distribution of data might not always
result in the best database performance. If a uniform distribution
is not desired, then you can construct your own target partition map
for the redistribution operation.
In general if you redistribute data in a frequently accessed table such that infrequently accessed data is on few database partitions in the database partition group, and the frequently accessed data is distributed over a larger number of database partitions, you can improve data access performance and throughput for the most frequently run applications that access this data.