Scenario: Cloud Administrator

A cloud administrator can use the information gathered by the Monitoring Agent for IBM® Cloud Pak System Software to resolve problems that occur on the system and affect deployed applications.

About this task

The administrator, Thomas, has deployed the System Monitoring service, and the agent is running and gathering information about the associated cloud.

Procedure

  1. Thomas logs in to the system and opens the IBM Cloud Pak System Software Monitoring Portal.
  2. He then uses the Monitoring Agent for IBM Cloud Pak System Software to access the Cloud Pak System Software Overview workspace.
  3. The Situation Event Console view shows a situation alert, indicating that one of the compute nodes is experiencing problems.
  4. Thomas links to the Failure Analysis workspace to further explore the reasons for the situation alert.
    The Failure Analysis view shows that the compute node is indeed unavailable.
  5. From the Failure Analysis workspace, Thomas links to the Compute Node Performance workspace.
    The Top 5 CPU Utilizers view shows that the CPU usage on the compute node in question is unusually high.
  6. Thomas decides that maintenance and repair work must be carried out on the compute node.
    He then uses the Virtual Machine Performance workspace to find the virtual machines associated with the compute node.
  7. Thomas sends an email to all application deployment owners that lists the virtual machines affected by the downtime required for the maintenance and repair work on the compute node.
  8. After the maintenance and repair work is completed, Thomas sends another email to all the application deployment owners, informing them that all affected virtual machines are now back online.
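
Example

The notification steps (7 and 8) can be sketched as a short script. This is only an illustration, not part of the product: the inventory mapping, node names, email addresses, and the helper functions affected_vms and build_notice are all hypothetical, and the sketch assumes that the virtual machine list has already been read manually from the Virtual Machine Performance workspace. It uses only the Python standard library and composes the message without sending it.

```python
from email.message import EmailMessage

# Hypothetical inventory: VM name -> (compute node, deployment owner email).
# These values are invented for illustration; in the scenario, Thomas reads
# this information from the Virtual Machine Performance workspace.
VM_INVENTORY = {
    "vm-web-01": ("compute-node-3", "owner-a@example.com"),
    "vm-db-01": ("compute-node-3", "owner-b@example.com"),
    "vm-app-02": ("compute-node-5", "owner-c@example.com"),
}


def affected_vms(inventory, node):
    """Return the sorted names of the VMs hosted on the given compute node."""
    return sorted(vm for vm, (host, _owner) in inventory.items() if host == node)


def build_notice(inventory, node, sender, back_online=False):
    """Compose a notice to all deployment owners listing the affected VMs."""
    vms = affected_vms(inventory, node)
    # The scenario addresses every deployment owner, not only affected ones.
    owners = sorted({owner for _host, owner in inventory.values()})
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(owners)
    if back_online:
        msg["Subject"] = f"Maintenance on {node} completed"
        intro = "The following virtual machines are back online:"
    else:
        msg["Subject"] = f"Planned maintenance on {node}"
        intro = "The following virtual machines will be affected by downtime:"
    msg.set_content(intro + "\n" + "\n".join(f"  - {vm}" for vm in vms))
    return msg


# Step 7: notice sent before the maintenance work starts.
notice = build_notice(VM_INVENTORY, "compute-node-3", "thomas@example.com")
# Step 8: follow-up notice after the work is completed.
followup = build_notice(
    VM_INVENTORY, "compute-node-3", "thomas@example.com", back_online=True
)
print(notice.get_content())
```

Building both messages from the same inventory keeps the follow-up consistent with the original notice, so owners see the same list of virtual machines in each email. Actually sending the messages (for example, through an SMTP server) is left out because it depends on the local mail setup.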