Viewing data source status

Use the Home page to monitor your environment for storage system capacity, used capacity, records indexed, and duplicate files. You can also view data usage for a specific area of your organization.

The data on the Home page is updated periodically. A time stamp indicates when the last update occurred.

Viewing storage system capacity

Use the Data source capacity area to view capacity usage compared to the allocated capacity for all data sources that are registered with IBM Spectrum® Discover. The data sources can be a mixture of file systems and object vaults. A graph provides a convenient view of the current capacity of data sources and whether any are close to running out of space. This view also indicates the number of files to move or archive, based on user-defined policies.

Hover over a data source in the graph to view details about the data source. Click a data source to open the Search page and perform a search of the selected data source.

Note: Data sources that do not have data residing in them are not displayed in the graph.
Figure 1. Data source capacity
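The comparison that the graph makes can be sketched in a few lines of Python. This is only an illustration: the sample data, the function name `capacity_summary`, and the 90% warning threshold are assumptions made for the example, not values or interfaces taken from IBM Spectrum Discover.

```python
# Hypothetical sample data: (data source name, used bytes, allocated bytes).
SOURCES = [
    ("fs1", 950, 1000),
    ("vault1", 200, 1000),
    ("empty_fs", 0, 500),
]


def capacity_summary(sources, warn_at=0.9):
    """Return (name, free bytes, percent used, near_full) per data source.

    warn_at is an assumed threshold for "close to running out of space".
    """
    summary = []
    for name, used, allocated in sources:
        if allocated <= 0:
            raise ValueError(f"allocated capacity must be positive for {name}")
        if used == 0:
            # Mirrors the note above: sources without data are not displayed.
            continue
        percent_used = 100 * used / allocated
        near_full = used >= warn_at * allocated
        summary.append((name, allocated - used, percent_used, near_full))
    return summary


for row in capacity_summary(SOURCES):
    print(row)
```

In this sketch, `fs1` is flagged because it uses 95% of its allocation, and `empty_fs` is left out because it holds no data.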

Understanding size and capacity differences

IBM Spectrum Discover collects size and capacity information. Generally:
  • Size refers to the size of a file or object in bytes.
  • Capacity refers to the amount of space the file or object consumes on the source storage in bytes.

For objects, size and capacity values always match. For files, size and capacity values can be different because of file system block overhead or sparsely populated files.

Note: Storage protection overhead (such as RAID values or erasure coding) and replication overhead are not captured in the capacity values.
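The difference between size and capacity for files can be illustrated with a short Python sketch. It is not how IBM Spectrum Discover collects these values. The 4096-byte block size is an assumed example value, and both function names are hypothetical.

```python
import os


def allocated_capacity(size_bytes: int, block_size: int = 4096) -> int:
    """Space a fully written file occupies when storage is allocated in
    whole blocks. block_size is an assumed value; real file systems vary."""
    if size_bytes < 0 or block_size <= 0:
        raise ValueError("size must be >= 0 and block size must be > 0")
    blocks = -(-size_bytes // block_size)  # ceiling division
    return blocks * block_size


def size_and_capacity(path: str) -> tuple[int, int]:
    """Return (size, capacity) in bytes for a local file on a POSIX system."""
    info = os.stat(path)
    # st_blocks counts 512-byte units on POSIX systems and is not
    # available on Windows.
    return info.st_size, info.st_blocks * 512


print(allocated_capacity(1))       # 4096: a 1-byte file still uses one block
print(allocated_capacity(10_000))  # 12288: three 4096-byte blocks
```

A sparse file shows the opposite effect: its size can be much larger than its capacity, because regions that were never written do not occupy blocks. On a POSIX system, `size_and_capacity` reports both values so that you can compare them.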

The Data source capacity area also provides a tabular view of the data. To see the view, click the table button on the chart toolbar.


The tabular representation complements the chart. The Group column shows the category of each value: recommended to move, used, or free. The recommended to move values are based on user-defined policies. The y-value column shows the names of the data sources, and the x-value column shows the size and capacity information in bytes.


Viewing used capacity

Use the Capacity Used by area to view graphs with an aggregated display of capacity usage for selected metadata attributes. You can view capacity for both primary and backup sources. The graphs provide details about capacity usage by aggregating across different attributes that are available from standard system metadata.

Use the Capacity Used by list to select an attribute and display the capacity consumers of that attribute in the graphs.

The Used graph displays the highest consumers of capacity for the selected attribute, in order of consumption.

The data source graph displays the percentage of overall usage per data source for the selected attribute. You can select a specific capacity consumer to display in the graph.
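The two aggregations behind these graphs can be sketched as follows. This is a minimal illustration, assuming that the selected attribute is the owner and that the data source graph shows, for one selected consumer, the share of that consumer's capacity on each data source. The sample records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (data source, owner, capacity in bytes).
RECORDS = [
    ("fs1", "alice", 300),
    ("fs1", "bob", 100),
    ("vault1", "alice", 100),
    ("vault1", "carol", 200),
]


def top_consumers(rows, limit=10):
    """Total capacity per owner, largest consumer first."""
    totals = defaultdict(int)
    for _source, owner, capacity in rows:
        totals[owner] += capacity
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


def usage_share_by_source(rows, owner):
    """Percentage of one owner's capacity that resides on each data source."""
    per_source = defaultdict(int)
    for source, row_owner, capacity in rows:
        if row_owner == owner:
            per_source[source] += capacity
    total = sum(per_source.values())
    if total == 0:
        return {}
    return {source: 100 * used / total for source, used in per_source.items()}


print(top_consumers(RECORDS))
print(usage_share_by_source(RECORDS, "alice"))
```

With the sample data, `alice` is the highest consumer with 400 bytes, of which 75% is on `fs1` and 25% is on `vault1`.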

Hover over a value in a graph to view details. Click a value in a graph to open the Search page and search the selected item.

Figure 2. Example of the capacity that is being used

Viewing records indexed

Use the Records Indexed area to view both the total number of records and the capacity of the records that are indexed by IBM Spectrum Discover. This area provides a summary of total storage usage.


Click the Total Records Indexed value to open the Search page and perform a search of the indexed records.

Viewing duplicate file information

Use the Duplicate File Information area to view information about possible duplicate files within the storage environment. Possible duplicate files are files with the same name and size but different paths or object names. The number of duplicates and the capacity that is consumed by these files are displayed. You can also use a report that provides detailed and sorted information for the potential duplicates.
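The definition of a possible duplicate (same name and size, different path) can be expressed as a simple grouping. The following Python sketch illustrates that rule only. It is not the product's implementation, and the `Record` type, the sample paths, and the function name are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    path: str  # full path or object name
    size: int  # size in bytes


def possible_duplicates(records):
    """Group records by (file name, size) and keep groups that contain
    more than one distinct path."""
    groups = defaultdict(set)
    for record in records:
        name = record.path.rsplit("/", 1)[-1]  # last path component
        groups[(name, record.size)].add(record.path)
    return {
        key: sorted(paths)
        for key, paths in groups.items()
        if len(paths) > 1
    }


SAMPLE = [
    Record("/a/report.pdf", 100),
    Record("/b/report.pdf", 100),
    Record("/c/report.pdf", 200),  # same name, different size
    Record("/a/notes.txt", 50),
]

print(possible_duplicates(SAMPLE))
```

Because the rule compares only name and size, files that match are candidates rather than confirmed duplicates. Their contents can still differ.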

Click the Duplicate Records value to open the Search page and perform a search of duplicate records.

Identifying potential duplicates can be resource-intensive on IBM Spectrum Discover. By default, the background task that refreshes potential duplicate information is disabled. However, you can update potential duplicate information either on demand or on a specific schedule. If the duplicate background task is disabled, the dashboard displays a message.

To view and manage how often data on the Home page is updated, go to Data connections > Discover database in the Data source management window.

Figure 3. Run table refresh button in the Discover database window

From here, you can enable or disable the automatic updating of summary information. You can also update the information on the Home page on demand by clicking Run table refresh.