Grouping data by using collocation in server storage pools

Use collocation to improve IBM Spectrum® Protect performance and maintain optimal data organization.

Before you begin

Tip: The following information does not apply to container storage pools.

When you use collocation, restore operations for large amounts of data can be significantly faster because fewer volume mounts are required to locate the necessary files. Backup set generation and export operations are also faster. In addition, collocation reduces the chance of media contention with other clients. However, enabling collocation increases both the server time that is needed to collocate files as they are stored and the number of volumes that are required for data storage.

You can enable collocation by node, group, or file space. Collocation by group is the default. Each option provides different benefits and considerations for performance.

Table 1. Collocation trade-offs
Type	Volume usage	Volume mounts	Restore time
No collocation	Low volume usage	Few mounts for migration and reclamation	Longest restore time
Collocated by node	High volume usage	High number of mounts for migration and reclamation	Good restore time, but not optimized for multi-session restore
Collocated by group	Low volume usage	Few mounts for migration and reclamation	Good restore time
Collocated by file space	High volume usage	High number of mounts for migration and reclamation	Good restore time, but not optimized for multi-session restore

About this task

Consider the following information when you are determining which type of collocation you want to use:

- Collocation by group provides the best balance of restore performance and tape volume efficiency, and it is the best-practice choice for most situations. Collocation by group reduces unused tape capacity, which allows more collocated data on individual tapes. If collocation is needed to improve restore performance, use collocation by group. Manage the number of nodes in each group so that backup data for the entire group is spread over a manageable number of volumes.
- For primary storage pools on tape, use collocation by group:
  - To get the full benefit of collocation by group, you must define the collocation groups and their nodes.
  - Nodes that are not grouped are collocated by node.
- For nodes with two or more large file spaces that might get close to filling a tape volume, use collocation by file space.
- Use an active-data pool to collocate active data.
- Group nodes that have a low chance of being restored at the same time to avoid volume contention.
- Group nodes that are backed up to disk at the same time.
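
The group definition that is mentioned above can be sketched with the DEFINE COLLOCGROUP and DEFINE COLLOCMEMBER commands. The following lines are a minimal example only: the group name DEPT_A and the node names NODE1, NODE2, and NODE3 are hypothetical placeholders, and the comments use the /* */ form that administrative macro files accept.

```
/* Create a collocation group (DEPT_A is a hypothetical name) */
DEFINE COLLOCGROUP DEPT_A DESCRIPTION="Nodes that are backed up in the same window"

/* Add nodes to the group (node names are hypothetical) */
DEFINE COLLOCMEMBER DEPT_A NODE1,NODE2,NODE3

/* Review the group and its members */
QUERY COLLOCGROUP DEPT_A FORMAT=DETAILED
```

Nodes that are left out of every group are collocated by node, as noted in the list above, so it is worth checking the query output for nodes that you expected to be grouped.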

To enable collocation, use the COLLOCATE parameter on the DEFINE STGPOOL command when you are defining a primary sequential-access, copy, or active-data storage pool. You can use the UPDATE STGPOOL command to enable collocation for an existing storage pool.
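
As a sketch of both commands, the following lines define a new primary tape pool with collocation by group and then change the collocation setting on an existing pool. The pool names TAPEPOOL and OLDPOOL, the device class LTOCLASS, and the MAXSCRATCH value are assumptions for illustration; substitute the names and limits from your own configuration. The comments use the /* */ form that administrative macro files accept.

```
/* New primary sequential-access pool, collocated by group */
/* (TAPEPOOL, LTOCLASS, and MAXSCRATCH=100 are hypothetical values) */
DEFINE STGPOOL TAPEPOOL LTOCLASS MAXSCRATCH=100 COLLOCATE=GROUP

/* Existing pool: switch to collocation by file space */
UPDATE STGPOOL OLDPOOL COLLOCATE=FILESPACE
```

The COLLOCATE parameter accepts NO, GROUP, NODE, or FILESPACE, which correspond to the rows in Table 1.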