IBM Support

MUSTGATHER: Collecting data for hardware components and preparing for a service call to your IBM Cloud Pak System

Troubleshooting


Problem

This document lists the actions an IBM Cloud Pak System administrator must take to gather the information IBM needs to investigate a problem and to prepare for any potential service calls.
Note: This document was formerly titled: "MustGather: Chassis Management Module, IBM Storwize V7000, Top of Rack Switch, SAN Switch, Compute Node, and Management node problems for the IBM Cloud Pak System"

Cause

Follow this document to reduce investigation time.
 

Diagnosing The Problem

IBM requires screen captures from the IBM Cloud Pak System user interface, an export of the events, collection set files, address confirmation information, and the IBM service representative access key to start the problem investigation.
Included are steps that describe how your company's administrators work with the IBM SSR (CE) when a repair service is required. The administrators confirm with the IBM SSR that the system is healthy and operational.

Resolving The Problem

Step 1: Obtain the events, screen captures, and collection set for the component reporting the event:

All the following information is required.
Step 1.1: Navigate to the IBM Cloud Pak System User Interface for the hardware component.
Navigate to Hardware > Infrastructure Map.
On the upper left corner of the page, click "Switch to Tree View".
Click the name of the hardware component.
Step 1.2: Capture the screen image of all the information from the IBM Cloud Pak System user interface for the component.
Use the component page found in Step 1.1. Expand all sections on that page. With all sections expanded, copy screen images into a document or directly into the case notes. Include all data, with every section expanded and every table entry visible. Capture as many screen images as needed to cover the whole page. If you created a file, upload the file to the case.
Do not highlight or mark on the screen image. 
Step 1.3: Export the errors listed on the component page.
At the top of the page, you see "Error", followed by a number.  

If the number is 0, report that in your case update.
If the number is greater than 0, click the number. Select all the errors for this component. Hover the cursor over the line of icons, then click the "circle and arrow" icon to export the errors to a file.

Repeat these steps for the number after "Warning" to export the warnings to a file.

Do not export events from any other pages.  The component pages show errors and warnings specific to the component.
Upload the exports of the errors and warnings to this case. All the errors are in one file and all the warnings are in another file.
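The rule in Step 1.3 can be summarized as a small decision sketch. The function name and message wording below are hypothetical and only illustrate the logic; nothing here calls the Cloud Pak System user interface or any IBM API.

```python
from typing import List


def export_actions(error_count: int, warning_count: int) -> List[str]:
    """Return the case actions for one component page, following Step 1.3.

    The counts are the numbers shown after "Error" and "Warning" at the
    top of the component page. This helper is illustrative only.
    """
    if error_count < 0 or warning_count < 0:
        raise ValueError("counts cannot be negative")

    actions = []
    for label, count in (("Error", error_count), ("Warning", warning_count)):
        if count == 0:
            # A count of 0 is reported in the case update; nothing is exported.
            actions.append(f"Report in the case update that {label} is 0")
        else:
            # Errors and warnings are exported to separate files.
            actions.append(
                f"Export all {count} {label} entries to their own file and upload it"
            )
    return actions
```

For example, `export_actions(0, 3)` yields one "report" action for the errors and one "export" action for the warnings.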
Step 1.4: Generate and upload the logs for the component.
Scroll down and click the generate logs button.  Follow the prompts and upload the collection set to the case.  

Proceed to Step 2.

 

Step 2: Complete and return the required "Onsite Contact and Datacenter location" form:

Copy and paste this "IBMCPSDCform1" into your MySupport case.
All fields in a, b, and c are required to be accurate and complete.
<< copy and paste start IBMCPSDCform1 >>

a. If parts are required for this case choose one of the 2 options listed below:
Your choice ____ 
(If not answered, parts will be shipped to your datacenter contact.)
 
  1. IBM CE/SSR to bring parts.
  2. Ship part to our datacenter onsite contact for IBM CE/SSR to install. 
b. If an IBM CE/SSR is needed, enter the days of the week (Monday/Tuesday/...)  and times when an IBM CE can access the data center. 
Hours: ________ Days: ________ (If not answered, the default is 8am to 5pm, Monday to Friday in the time zone where the system is installed.)   
c. Address and onsite contact details
 
Company name that owns the system:
IBM CPS Serial number: 
----
Client Onsite Contact (Your company's representative located at the datacenter) 
Name:
Telephone:
Email: 
----  
Data Center Location:
Company name:
Street:
City:
State or province:
Country:
Postal code:
 
<< copy and paste end  IBMCPSDCform1 >>

Required: The phone number and datacenter contact must be in the country where the IBM Cloud Pak System device is physically located. 
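Before pasting the form into the case, it can help to check that nothing is blank. The following is a minimal sketch of such a check, assuming the form answers are held in a hypothetical `SiteForm` record. The field names are invented for illustration; the defaults are the ones quoted in the form text (parts shipped to the datacenter contact, 8am to 5pm, Monday to Friday).

```python
from dataclasses import dataclass, fields, replace
from typing import List, Optional

# Defaults quoted from the form text; applied only when a or b is unanswered.
DEFAULT_PARTS_CHOICE = 2          # option 2: ship part to the datacenter contact
DEFAULT_HOURS = "8am to 5pm"
DEFAULT_DAYS = "Monday to Friday"


@dataclass
class SiteForm:
    """Hypothetical record mirroring sections a, b, and c of IBMCPSDCform1."""
    company_owner: str = ""
    serial_number: str = ""
    contact_name: str = ""
    contact_telephone: str = ""
    contact_email: str = ""
    dc_company: str = ""
    dc_street: str = ""
    dc_city: str = ""
    dc_state_or_province: str = ""
    dc_country: str = ""
    dc_postal_code: str = ""
    parts_choice: Optional[int] = None   # section a: 1 or 2
    hours: str = ""                      # section b
    days: str = ""                       # section b


def missing_fields(form: SiteForm) -> List[str]:
    """List the section c fields that are still blank."""
    defaulted = {"parts_choice", "hours", "days"}
    return [
        f.name
        for f in fields(form)
        if f.name not in defaulted and not str(getattr(form, f.name)).strip()
    ]


def with_defaults(form: SiteForm) -> SiteForm:
    """Return a copy with the documented defaults filled in for a and b."""
    return replace(
        form,
        parts_choice=form.parts_choice or DEFAULT_PARTS_CHOICE,
        hours=form.hours.strip() or DEFAULT_HOURS,
        days=form.days.strip() or DEFAULT_DAYS,
    )
```

A form is ready to paste when `missing_fields(form)` returns an empty list. The check cannot confirm that the phone number and contact are in the same country as the system; that still needs a manual review.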
Proceed to Step 3.

Step 3: Enable "IBM service representative account access" and send in the key:

If an IBM SSR is needed, IBM support provides the IBM SSR with a password from the key you send. When the key is not provided, the service call is delayed.

Paste the key into the case or attach a document with the key to the case.

Here are the steps:
  1. Navigate to "Problem Determination >  System Troubleshooting" or "System > System Troubleshooting".
  2. Click "Enable IBM service representative account access" check box.
  3. Click "Generate" to create a "key".
  4. Copy the key from the "Secret Key" box as text (not a picture or image).
  5. Paste the key into a document. Attach the document to the MySupport case.
This key generates a password given to the IBM SSR by IBM support. The password is not provided to our clients.  This password is used by the IBM SSR to access the "service console" on your physical system.
Proceed to Step 4.

Step 4: Prepare for a Potential Service Call:

When onsite service is required the IBM SSR schedules directly with the contacts provided in Step 2 of this document.

Plan for service calls.

Your company's business solution (workloads) owners and system administrators prepare the cloud environment for repair activities. 

Prepare for a service call in advance.
Plan to communicate with the IBM SSR and your team during the repair.
Complete any needed onsite access documents when the IBM SSR calls your designated local contact. 
Remember to resolve any questions on the repairs with the IBM SSR in advance of the service window.

Your company's Cloud Pak System administrators prepare the system for service. The following list gives examples and is not exhaustive:
  • Important: Follow your company's instructions for the applications running in the cloud environment.
  • Repairs might require a power-off of a component. Consult the IBM Documentation for how to prepare your system.
    • There are links to each specific product's documentation.
    • IBM Cloud Pak System W3500 MT 8558 
    • IBM Cloud Pak System W3550 MT 8564 
    • IBM Cloud Pak System W3700 MT 8536
    • IBM Cloud Pak System W4600 MT 4600
      • Here are two examples:
        • Compute node:
          • See the topic on preparing for a repair to a compute node. Look under "Administering Compute nodes" in the table of contents or search for power-off compute nodes.
          • Let the IBM SSR know when the "Power status" field indicates the power is off.
        • Platform System Manager (PSM):
          • See "Administering Management nodes" in the table of contents or search for power-off management nodes.
          • Navigate to "Platform System Manager" component user interface page: Hardware > Management Nodes.
          • Click the entry for the "Platform System Manager" that needs to be repaired.
          • Look at the Type field. If the field shows:
            • "Platform Systems Manager - Primary ..."
              • Wait for deployments or other actions that require access to the Cloud Pak System User Interface to complete.
              • When this management node is powered off, you lose access to the User Interface for 30 to 60 minutes. The business solutions on the cloud are not otherwise affected while the other PSM takes the leader role.
            • "Platform Systems Manager ..."
              • This management node is the "non-leader" PSM.  You can power off this PSM.
            • The power-off and power-on options are on the upper right of the page. Power off the PSM node.
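The PSM guidance above depends on the value of the Type field. A minimal sketch of that decision follows. The function and enum names are hypothetical, and the prefixes are taken from the two Type values quoted above, whose trailing text is elided in this document.

```python
from enum import Enum


class PsmRole(Enum):
    PRIMARY = "primary"
    NON_LEADER = "non-leader"


# Prefixes of the Type field as quoted in this document.
PRIMARY_PREFIX = "Platform Systems Manager - Primary"
PSM_PREFIX = "Platform Systems Manager"


def classify_psm(type_field: str) -> PsmRole:
    """Map the Type field of a management node to its PSM role."""
    text = type_field.strip()
    # Check the longer prefix first, because it also starts with PSM_PREFIX.
    if text.startswith(PRIMARY_PREFIX):
        return PsmRole.PRIMARY
    if text.startswith(PSM_PREFIX):
        return PsmRole.NON_LEADER
    raise ValueError(f"not a Platform System Manager type: {type_field!r}")


def power_off_guidance(type_field: str) -> str:
    """Return the preparation note that applies before powering off the PSM."""
    if classify_psm(type_field) is PsmRole.PRIMARY:
        return (
            "Wait for deployments and other actions that need the user "
            "interface to complete. Expect to lose the user interface for "
            "30 to 60 minutes after power-off."
        )
    return "This is the non-leader PSM. It can be powered off."
```

The primary check comes first because both Type values share the same leading text; testing the shorter prefix first would classify every PSM as the non-leader.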
When the IBM SSR arrives
  • The IBM SSR contacts your team before they start the repair.
  • Link the IBM SSR with your team to communicate during the service call.
  • Your team disables "Call Home".
    1. System > System Settings, expand "Service and Support Manager"
    2. Click the 'pencil icon' for "Service and Support Level"
    3. Remember your choice (save a screen capture for your records)   
    4. Click the third selection "Do not collect troubleshooting information and do not open a service request. The administrator will open a service request and collect and post the log files manually."
    5. Click the OK button.
  • The IBM SSR asks your Cloud Pak System system administrators, and data center team, to confirm the repair action can start.
  • The IBM SSR repairs the component. 
  • Proceed to Step 5.

Step 5: Before the service call is complete, check the system's health with the IBM SSR, and disable IBM service representative account access:

Confirm the problem is resolved. 
Your company's team and the IBM SSR agree the repair is complete before the service call ends. Here is a list to get you started:
  • Clear any events for the component, for example, events on the storage node.
  • Use the Health check report described in "Health Checks and Introduction to Troubleshooting on a Cloud Pak System System"  to verify the problem is resolved.
  • Navigate to the component page (Hardware > "Component") and look for any number after Error or Warning. Click the number to see if there are any new problems. Scroll through the page and check that everything is as expected.
  • Work with the IBM SSR on any concerns.
  • Clear any events and problems in the Cloud Pak System user interface related to the repair. Otherwise, the system could create call home records when "Call Home" is enabled.
    • System > Events.  Hover over the icons on the row for each entry to find the close icon which looks like an "X".  Close events appropriately.
    • System > Problems.  Hover over the icons on the row for each entry to find the close icon. Close any problems as appropriate.
    • Confirm the system is ready for your company's users.
    • Enable "Call Home".
      1. System > System Settings, expand "Service and Support Manager"
      2. Click the 'pencil icon'.   
      3. Remember the previous choice and reset the selection.   
      4. Click the OK button.
  • Disable IBM service representative account access.
    • Navigate to "Problem Determination >  System Troubleshooting" or "System > System Troubleshooting".
    • Uncheck "Enable IBM service representative account access".


Document Information

Modified date:
14 February 2024

UID

swg21666454