Monitoring the status of data collection

Monitor the status of metadata collection for your devices and troubleshoot issues when they occur. Some statuses that might represent an issue include degraded, device unreachable, failed, insufficient role, not monitored, stopped, and task expired.

Depending on the type of device, IBM Storage Insights collects performance, configuration, capacity, and status metadata. This metadata is collected by two types of tasks, performance monitors and probes:
  • Performance monitors collect performance metadata and run every 5 minutes for most devices. For Dell EMC storage systems, performance monitors run every 15 minutes.
  • Probes collect configuration, capacity, and status metadata and run once every 24 hours. For DS8000 storage systems and their pools, used capacity values are collected once every hour.
To monitor the status of metadata collection, identify interruptions or errors, and minimize gaps, you can check the values for Data Collection, Probe Status, and Performance Monitor Status on the following pages in IBM Storage Insights:
  • Block Storage: Resources > Block Storage Systems
  • File Storage: Resources > File Storage Systems
  • Object Storage: Resources > Object Storage Systems
  • Switches: Resources > Switches
  • Hosts: Resources > Hosts
Tip: To help you quickly identify any issues that might be occurring, the value for Data Collection shows the more severe of the two statuses, Probe Status and Performance Monitor Status.
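The rollup that the tip describes can be sketched as follows. This is only an illustration, not how IBM Storage Insights implements it: the SEVERITY ranking and the "Running" status name are assumptions made for the example, because this topic does not publish an official severity order.

```python
# Hypothetical severity ranking (higher means more severe). The real ordering
# used by IBM Storage Insights is not documented in this topic.
SEVERITY = {
    "Running": 0,       # assumed name for a healthy status
    "Unknown": 1,
    "Degraded": 2,
    "Task expired": 3,
    "Failed": 4,
}


def data_collection_status(probe_status: str, perf_monitor_status: str) -> str:
    """Return whichever of the two task statuses ranks as more severe."""
    return max(probe_status, perf_monitor_status, key=lambda s: SEVERITY[s])


print(data_collection_status("Running", "Degraded"))  # Degraded
print(data_collection_status("Failed", "Degraded"))   # Failed
```

Because the rolled-up value hides which task has the problem, check Probe Status and Performance Monitor Status individually before you choose a corrective action.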

Data collection statuses

The values for Data Collection, Probe Status, and Performance Monitor Status provide a real-time status of your data collection for each device that you monitor. Use the following statuses to identify when a problem occurs and what you can do to help resolve it:

Table 1. Data collection statuses and what you can do
Status Explanation
Degraded Not all metadata for the device was collected. This status is displayed when metadata collection is interrupted and only partial metadata is available.
What to do
Wait 30 minutes and try to collect data again for the device. To restart metadata collection, right-click the device and select Data collection > Restart Data Collection.
Device error, contact hardware support Metadata cannot be collected because of a hardware error on the storage system. For example, on a DS8000, this status might occur when the processor enclosure (also known as the central electronic complex or CEC) is down or unavailable.
What to do
This status represents a serious problem with the storage system. If the problem persists in subsequent metadata collections, open a support case for the storage system at https://www.ibm.com/support. To help IBM® Support understand your issue more quickly, include the storage system model and any actions that preceded the error status.
Device is not providing valid performance data The performance metadata that was collected for the device doesn't match the expected values based on historical analysis. This analysis examines the performance counters (metadata) for a device. This status is displayed when the counters decrease (rather than increase) between consecutive metadata collections. In those cases, the counters are discarded and the related metrics are not calculated.
What to do
You can ignore this status under the following conditions:
  • The device was rebooted or reset since the last metadata collection.
  • The CIM or other connecting agent for the device was rebooted or reset since the last metadata collection.
If this status continues to be displayed, reboot or reset the device and its connecting agent (if it uses one). Then, wait for performance metadata collection to run two consecutive times.
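The counter check that is described for this status can be illustrated with a short sketch. It assumes cumulative counters that are sampled at a fixed interval; the function name counter_rate and the sample values are hypothetical and are not taken from the product.

```python
from typing import Optional


def counter_rate(previous: int, current: int, interval_seconds: float) -> Optional[float]:
    """Return the per-second rate between two samples of a cumulative counter.

    Returns None when the counter decreased, which means that the sample pair
    must be discarded and no metric can be calculated for that interval.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if current < previous:
        # The counter went backwards, for example after a device or agent reset.
        return None
    return (current - previous) / interval_seconds


# The third sample drops, as it might after a reboot between collections.
samples = [1000, 4000, 250, 3250]
rates = [counter_rate(a, b, 300) for a, b in zip(samples, samples[1:])]
print(rates)  # [10.0, None, 10.0]
```

The sketch also shows why two consecutive collections are needed after a reset: a rate can be calculated only from a pair of samples that were both taken after the counters restarted.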
Device is not providing valid probe data The probe metadata that was collected for the device is incomplete or corrupted and can't be displayed.
What to do
This status represents a serious problem with the metadata collection for the device. If the problem persists in subsequent metadata collections, open a support case for IBM Storage Insights at https://www.ibm.com/support. To help IBM Support understand your issue more quickly, include the device type and any actions that preceded the "Device is not providing valid probe data" status.
Device unreachable A device is offline or your data collectors can't access the device. To collect detailed metrics and status information, a device must be online and a data collector must be connected to it.
What to do
  1. Ensure that the device is online.
  2. In IBM Storage Insights, go to Configuration > Data Collectors and click the Assignments tab.
  3. Verify that a data collector is assigned to the device.
  4. If a data collector is not assigned to the device, set Manage the assignments between data collectors and devices to Off. Then, IBM Storage Insights automatically assigns a data collector to the device. For information about how to manually assign a data collector to a device, see https://www.ibm.com/docs/en/storage-insights?topic=collectors-assigning-data.
Tip: If none of your existing data collectors can access the device, you must deploy one on a server or virtual machine that is visible to the device. For information about how to deploy a data collector, see https://www.ibm.com/docs/en/storage-insights?topic=started-downloading-installing-data-collectors.
Failed Metadata was not collected for the device. This status might be displayed for a number of conditions, such as a service interruption, a network outage, or a device that is unavailable. If the failure was caused by an interruption or a global problem with the service, IBM is investigating the issue and you'll be notified when the data collection service is resumed.
What to do
  1. Ensure that the device is online.
  2. Verify that your network is up and running.
  3. Right-click the device and select Data collection > Restart Data Collection.
Firmware level not supported Metadata cannot be collected for a device because the level of its firmware is not supported.
What to do
Update the firmware on the device to a version that is supported by IBM Storage Insights. For information about the supported firmware levels, go to https://www.ibm.com/docs/en/storage-insights?topic=overview-supported-devices and click a device.
Insufficient role to collect data The role of the user that IBM Storage Insights uses to connect to a device doesn't have the authority to collect metadata. You must update the connection information to use a different user, or change the role of the user on the device. For more information about the required roles for metadata collection, see https://www.ibm.com/docs/en/storage-insights?topic=systems-user-roles-collecting-metadata-from-storage.
What to do
To change the user that IBM Storage Insights uses to connect to a device, right-click the device and select Connections > Modify Connection. Then, enter the new user name and password and click OK.
If you want to continue using the same user, you must change the role of that user on the device. To change the role of the user on the device, open the element manager for the device and follow the instructions in the vendor's documentation.
For information about the required user roles and how to manually start the collection of performance metadata from IBM Storage Virtualize, see https://www.ibm.com/docs/en/storage-insights?topic=svcsv-user-roles-collecting-performance-metadata-from-spectrum-virtualize.
Invalid credentials The user name or password that IBM Storage Insights uses to connect to a device is not correct. This status is displayed when the credentials of the user on the device were changed but were not updated in IBM Storage Insights, the user name was removed from the device, or the credentials were entered incorrectly in IBM Storage Insights.
What to do
To update the user and password that IBM Storage Insights uses to connect to a device, right-click the device and select Connections > Modify Connection. Then, enter the correct user name and password and click OK.
No Call Home contact Call Home with cloud services is unable to contact the storage system. To collect status, configuration, capacity, and performance metadata, Call Home with cloud services must be able to access the device.
What to do
To resolve the problem, restart metadata collection for the device. To restart metadata collection, right-click the device and select Data collection > Restart Data Collection. If the status does not change, wait 30 minutes and try again. If the problem persists, open a support case for IBM Storage Insights at https://www.ibm.com/support. To help IBM Support understand your issue more quickly, include the device type and any actions that preceded the "No Call Home contact" status.
No data collector available A data collector is not assigned to a device or your data collectors can't access it. To collect status, configuration, capacity, and performance metadata, a data collector must be connected to a device.
What to do
  1. Go to Configuration > Data Collectors and click the Assignments tab.
  2. Verify that a data collector is assigned to the device.
  3. Set Manage the assignments between data collectors and devices to Off. Then, IBM Storage Insights automatically assigns a data collector to the device. If you want to manually assign a data collector to a device, see https://www.ibm.com/docs/en/storage-insights?topic=collectors-assigning-data.
Not Monitored (hosts) This status is displayed when IBM Storage Insights monitors the storage system that the host is connected to, but the host itself was not added for monitoring. Unmonitored hosts are automatically created based on the host connections of monitored storage systems. Each host connection is represented as an unmonitored host.
What to do
This status does not represent a problem, but if you want to collect more detailed metadata about a host, you must add it for monitoring. To add a host, complete the following steps:
  1. Identify the vCenter Server that manages the host.
  2. Go to Resources > Hosts.
  3. Click Add vCenter Server.
  4. Enter the IP address or hostname and user credentials for connecting to the vCenter Server. After the vCenter Server is added for monitoring, metadata is collected about the host.
For more information about how to add a host, see Adding hosts.
Not Monitored (switches) When you add a chassis, its hosted switches are automatically discovered and added for monitoring. Any other switches that are connected to the switches on the monitored chassis are also discovered.
  • If the chassis that host the other connected switches use the same connection credentials as the chassis that you added, those chassis and their switches are also added for monitoring.
  • If the chassis that host the other connected switches don't use the same credentials, those chassis and their switches are added to IBM Storage Insights but are not monitored.
What to do
To change the user name or password that IBM Storage Insights uses to connect to a chassis, complete the following steps:
  1. Go to Resources > Switches.
  2. Click the Chassis tab.
  3. Right-click the chassis and select Configure Data Collection.
  4. Enter the new user name and password and click OK.
When IBM Storage Insights connects to the chassis with the updated credentials, its switches are automatically added for monitoring and their "Not Monitored" status is changed.
Stopped This status is displayed when data collection is manually stopped or when data collection was restarted but the restart failed.
What to do
Manually restart metadata collection. To restart metadata collection, right-click the device and select Data collection > Restart data collection.
For situations where the restart failed, wait 30 minutes and try to restart data collection again. If the problem persists after you restart data collection, open a support case for IBM Storage Insights at https://www.ibm.com/support. To help us understand your issue more quickly, include the device type and any actions that preceded the "Stopped" status.
Task expired This status might be displayed for a number of conditions or temporary problems within the service.
What to do
  • If the Probe Status column shows the value Task Expired and the last successful probe was more than 24 hours ago, manually restart metadata collection. To restart metadata collection, right-click the device and select Data collection > Restart data collection.
  • If the Performance Monitor Status column shows the value Task Expired and the last successful performance monitor was more than 15 minutes ago, manually restart metadata collection. To restart metadata collection, right-click the device and select Data collection > Restart data collection.
For situations where the restart failed or the Task Expired status is still shown, wait 30 minutes and try to restart data collection again. If the problem persists after you restart data collection, open a support case for IBM Storage Insights at https://www.ibm.com/support. To help us understand your issue more quickly, include the device type and any actions that preceded the "Task expired" status.
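The two restart conditions for the Task Expired status can be expressed as a small check. This is a sketch only: the 24-hour and 15-minute thresholds come from the guidance above, but the function name should_restart and the task keys are hypothetical, and how you obtain the status and the time of the last successful collection is left open.

```python
from datetime import datetime, timedelta, timezone

# Thresholds from the guidance for the Task Expired status.
THRESHOLDS = {
    "probe": timedelta(hours=24),
    "performance_monitor": timedelta(minutes=15),
}


def should_restart(task: str, status: str, last_success: datetime, now: datetime) -> bool:
    """Return True when a manual restart of metadata collection is suggested."""
    return status == "Task Expired" and (now - last_success) > THRESHOLDS[task]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
# The last successful probe was 25 hours ago, so a restart is suggested.
print(should_restart("probe", "Task Expired", now - timedelta(hours=25), now))  # True
# The last successful performance monitor was 10 minutes ago, so no restart yet.
print(should_restart("performance_monitor", "Task Expired", now - timedelta(minutes=10), now))  # False
```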
Unknown This status might be displayed if the probe or performance monitor had an error status that is no longer true. For example, if the status of the previous probe was "Invalid Credentials" or "Device Unreachable" and that problem is resolved, Unknown is displayed. The next run of a probe or performance monitor clears this status.
What to do
To clear the status, you can wait until the next scheduled probe or performance monitor runs or manually restart metadata collection. To restart metadata collection, right-click the device and select Data collection > Restart data collection.
For situations where the restart failed, wait 30 minutes and try to restart data collection again. If the problem persists after you restart data collection, open a support case for IBM Storage Insights at https://www.ibm.com/support. To help us understand your issue more quickly, include the device type and any actions that preceded the "Unknown" status.
Zimon is not running The ZIMon collector on the IBM Spectrum® Scale cluster node is not running and metadata can't be collected.
What to do
Ensure that the ZIMon collector is running on the IBM Spectrum Scale cluster node. Then, try to collect metadata again. To manually restart metadata collection, right-click the device and select Data collection > Restart data collection.
For situations where the restart failed, wait 30 minutes and try to restart data collection again. If the problem persists after you restart data collection, open a support case for IBM Storage Insights at https://www.ibm.com/support. To help us understand your issue more quickly, include the device type and any actions that preceded the "Zimon is not running" status.
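One way to confirm that the ZIMon collector is running is to query its service state on the cluster node. The following sketch assumes that the node uses systemd and that the collector runs as a unit named pmcollector; both are assumptions, so verify the unit name in your IBM Spectrum Scale environment. The run parameter is a hypothetical hook that lets you substitute the command runner, for example in tests.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def _run(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a command and capture its output without raising on a non-zero exit."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


def zimon_collector_active(unit: str = "pmcollector", run: Runner = _run) -> bool:
    """Return True if systemd reports the given unit as active.

    The default unit name is an assumption, not taken from this topic.
    """
    result = run(["systemctl", "is-active", unit])
    return result.returncode == 0 and result.stdout.strip() == "active"
```

Call zimon_collector_active() on the cluster node. If it returns False, start the collector service before you restart metadata collection in IBM Storage Insights.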

What to do if problems persist

If the provided actions don't help you to resolve issues with metadata collection, IBM Support can help. To get help, open a support case for IBM Storage Insights at https://www.ibm.com/support. To help us understand your issue more quickly, include the data collection status in your case.