Daily monitoring checklist

To ensure that you are completing the daily monitoring tasks for your solution, review the daily monitoring checklist.

Complete the daily monitoring tasks from the Operations Center Overview page. You can access the Overview page by opening the Operations Center and clicking Overviews.

The following figure shows the location for completing each task.

Figure: The Overview page, showing the location for each task in the checklist.

Tip: To run administrative commands for advanced monitoring tasks, use the Operations Center command builder. The command builder provides a type-ahead function to guide you as you enter commands. To open the command builder, go to the Operations Center Overview page. On the menu bar, hover over the globe icon and click Command Builder.

The following table lists the daily monitoring tasks and provides instructions for completing each task.

Table 1. Daily monitoring tasks
Task | Basic procedures | Advanced procedures and troubleshooting information
Task 1 (Clients area, marked 1 in the illustration): Determine whether clients are at risk of being unprotected due to failed or missed backup operations.
To verify whether clients are at risk, in the Clients area, look for an At risk notification. To view details, click the Clients area.
If you installed the client management service on a backup-archive client, you can view and analyze the client error and schedule logs by completing the following steps:
  1. In the Clients table, select the client and click Details.
  2. To diagnose an issue, click Diagnosis.
For clients that do not have the client management service installed, access the client system to review the client error logs.
Task 2 (Alerts area, marked 2 in the illustration): Determine whether client-related or server-related errors require attention.
To determine the severity of any reported alerts, in the Alerts area, hover over the columns.
To view additional information about alerts, complete the following steps:
  1. Click the Alerts area.
  2. In the Alerts table, select an alert.
  3. In the Activity Log pane, review the messages. The pane displays related messages that were issued before and after the selected alert occurred.
Task 3 (Servers area, marked 3 in the illustration): Determine whether servers that are managed by the Operations Center are available to provide data protection services to clients.
  1. To verify whether servers are at risk, in the Servers area, look for an Unavailable notification.
  2. To view additional information, click the Servers area.
  3. Select a server in the Servers table and click Details.
Tip: If you detect an issue that is related to server properties, update the server properties:
  1. In the Servers table, select a server and click Details.
  2. To update server properties, click Properties.
Task 4 (Inventory area, marked 4 in the illustration): Determine whether sufficient space is available for the server inventory, which consists of the server database, active log, and archive log.
  1. Click the Servers area.
  2. In the Status column of the table, view the status of the server and resolve any issues:
    • Normal (check mark icon): Sufficient space is available for the server database, active log, and archive log.
    • Critical (circle with an X mark): Insufficient space is available for the server database, active log, or archive log. You must add space immediately, or the data protection services that are provided by the server will be interrupted.
    • Warning (triangle with an exclamation mark): The server database, active log, or archive log is running out of space. If this condition persists, you must add space.
    • Unavailable (icon resembling a cracked ball): Status cannot be obtained. Ensure that the server is running, and that there are no network issues. This status is also shown if the monitoring administrator ID is locked or otherwise unavailable on the server. This ID is named IBM-OC-hub_server_name.
    • Unmonitored (question mark in a diamond): Unmonitored servers are defined to the hub server, but are not configured for management by the Operations Center. To configure an unmonitored server, select the server, and click Monitor Spoke.
You can also look for related alerts on the Alerts page. For additional instructions about troubleshooting, see Resolving server problems (V7.1.1).
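If you record server statuses outside the Operations Center, the status descriptions above can be summarized as a lookup table. The following Python sketch is illustrative only: the names `STATUS_ACTIONS` and `servers_needing_attention`, and the server names in the example, are hypothetical and are not part of the product.

```python
# Actions paraphrased from the status descriptions in this checklist.
STATUS_ACTIONS = {
    "Normal": "No action required.",
    "Critical": "Add space immediately.",
    "Warning": "Add space if the condition persists.",
    "Unavailable": "Check the server, the network, and the monitoring administrator ID.",
    "Unmonitored": "Select the server and click Monitor Spoke.",
}

def servers_needing_attention(statuses):
    """Return (server, action) pairs for every server whose status is not Normal."""
    return [
        (name, STATUS_ACTIONS.get(status, "Unknown status; investigate."))
        for name, status in statuses.items()
        if status != "Normal"
    ]

# Hypothetical server names and statuses, entered by hand for illustration.
observed = {"HUB1": "Normal", "SPOKE1": "Critical", "SPOKE2": "Unmonitored"}
for server, action in servers_needing_attention(observed):
    print(f"{server}: {action}")
```

A status value that does not appear in the table is reported for investigation rather than ignored.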
Task 5 (DB2 area, marked 5 in the illustration): Verify server database backup operations.
To determine when a server was most recently backed up, complete the following steps:
  1. Click the Servers area.
  2. In the Servers table, review the Last Database Backup column.
To obtain more detailed information about backup operations, complete the following steps:
  1. In the Servers table, select a row and click Details.
  2. In the DB Backup area, hover over the check marks to review information about backup operations.
If a database was not backed up recently (for example, in the last 24 hours), you can start a backup operation:
  1. On the Operations Center Overview page, click the Servers area.
  2. In the table, select a server and click Back Up.
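The 24-hour guideline can be expressed as a simple age check. The following Python sketch assumes that you have already copied the value from the Last Database Backup column into a `datetime` object. The function name `backup_is_stale` is hypothetical, and the 24-hour default reflects only the example that is given above, not a product setting.

```python
from datetime import datetime, timedelta

def backup_is_stale(last_backup, now, max_age=timedelta(hours=24)):
    """Return True if the last database backup is older than max_age.

    last_backup can be None when no backup has been recorded.
    """
    if last_backup is None:
        return True
    return now - last_backup > max_age

# Example timestamps are made up for illustration.
now = datetime(2024, 1, 2, 9, 0)
print(backup_is_stale(datetime(2024, 1, 1, 8, 0), now))   # 25 hours old: True
print(backup_is_stale(datetime(2024, 1, 1, 22, 0), now))  # 11 hours old: False
```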
To determine whether the server database is configured for automatic backup operations, complete the following steps:
  1. On the menu bar, hover over the globe icon and click Command Builder.
  2. Issue the QUERY DB command:
    query db f=d
  3. In the output, review the Full Device Class Name field. If a device class is specified, the server is configured for automatic database backups.
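If you save the command output to a text file, the check of the Full Device Class Name field can be scripted. The following Python sketch assumes that the detailed output consists of lines in the form "Field Name: value". The sample text is abbreviated and illustrative, the device class name MY_DEVCLASS is hypothetical, and the function names are not part of the product.

```python
def parse_query_db(output):
    """Parse 'Field Name: value' lines into a dictionary (assumed format)."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values that contain colons stay intact.
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields

def automatic_backup_configured(output):
    """Return True if the Full Device Class Name field has a value."""
    return bool(parse_query_db(output).get("Full Device Class Name"))

# Abbreviated, illustrative output; real output contains more fields.
sample = """
             Database Name: TSMDB1
    Full Device Class Name: MY_DEVCLASS
"""
print(automatic_backup_configured(sample))
```

An empty or missing field is treated as "not configured", which matches the rule in step 3.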
Task 6 (Servers menu, marked 6 in the illustration): Monitor other server maintenance tasks. Server maintenance tasks can include running administrative command schedules, maintenance scripts, and related commands.
To search for information about processes that failed because of server issues, complete the following steps:
  1. Click Servers > Maintenance.
  2. To obtain the two-week history of a process, view the History column.
  3. To obtain more information about a scheduled process, hover over the check box that is associated with the process.
For more information about monitoring processes and resolving issues, see the Operations Center online help.
Task 7 (Activity area, marked 7 in the illustration): Verify that the amount of data that was recently sent to and from servers is within the expected range.
  • To obtain an overview of activity in the last 24 hours, view the Activity area.
  • To compare activity in the last 24 hours with activity in the previous 24 hours, review the figures in the Current and Previous areas.
  • If more data was sent to the server than you expected, determine which clients are backing up more data and investigate the cause. It is possible that client-side data deduplication is not working correctly.
  • If less data was sent to the server than you expected, investigate whether client backup operations are proceeding on schedule.
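One way to make "the expected range" concrete is to compare the figures from the Current and Previous areas against a tolerance. The following Python sketch is a minimal illustration under stated assumptions: it uses the previous 24 hours as the baseline, and the 25% tolerance, the function name `classify_activity`, and the sample figures are all hypothetical.

```python
def classify_activity(current_gb, previous_gb, tolerance=0.25):
    """Compare activity in the last 24 hours with the previous 24 hours.

    Returns "higher", "lower", or "expected". The tolerance is a
    hypothetical fraction of the previous period's volume.
    """
    if previous_gb <= 0:
        # No baseline: any activity at all counts as an increase.
        return "higher" if current_gb > 0 else "expected"
    change = (current_gb - previous_gb) / previous_gb
    if change > tolerance:
        return "higher"
    if change < -tolerance:
        return "lower"
    return "expected"

print(classify_activity(150.0, 100.0))  # higher: investigate which clients back up more data
print(classify_activity(60.0, 100.0))   # lower: check whether backups are on schedule
print(classify_activity(110.0, 100.0))  # expected
```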
Task 8 (Pools area, marked 8 in the illustration): Verify that storage pools are available to back up client data.
  1. If problems are indicated in the Storage & Data Availability area, click Pools to view the details:
    • If the Critical status (circle with an X mark) is displayed, insufficient space is available in the storage pool, or its access status is unavailable.
    • If the Warning status (triangle with an exclamation mark) is displayed, the storage pool is running out of space, or its access status is read-only.
  2. To view the used, free, and total space for your selected storage pool, hover over the entries in the Capacity Used column.

To view the storage-pool capacity that was used over the past two weeks, select a row in the Storage Pools table and click Details.

Task 9 (Devices area, marked 9 in the illustration): Verify that storage devices are available for backup operations.
In the Storage & Data Availability area, click Devices to open the Storage Devices table. If a Critical (circle with an X mark) or Warning (triangle with an exclamation mark) status is displayed for any device, investigate the issue.
Tip: Devices might have a critical or warning status for the following reasons:
  • For DISK device classes, volumes might be offline or have a read-only access status.
  • For FILE device classes that are not shared, directories might be offline. Also, insufficient free space might be available for allocating scratch volumes.