Note: this Python script creates HMC sessions but does not log off, so they build up over time.
- The solution to that is covered in a new AIXpert Blog here: Avoiding HMC REST API Session Logoff Issues
The main AIXpert Blog that explained everything but the code can be found here:
Detailed Walk through of PCM_SSP.py version 23
- Note: In this example code, simplicity is more important than good Python style, hence no functions are defined (most of the code is called once).
- We could make most of this into a Python module at some point.
Setup and Libraries
Line 1: Force the script to use Python 3 and not 2.7
Lines 2 to 12: Import the library code as per regular Python code. Some are obvious.
Requests is used to make REST API calls to the HMC and we include the extra feature to switch off complaints about security. This could be fixed, but this example is not about security certificates so we avoid it here.
xml.etree.ElementTree is a Python package to manipulate XML format files
argparse is used to handle the command line arguments in a simple way
json is a Python package to manipulate JSON format files
Parse the Command line arguments
Lines 14 to 26: This code sets up argparse with the command line arguments, their names and help information. If the user used -? these are output. There are also some default actions if the user does not use the command line options.
Line 28: Asks the argparse library to process the arguments.
Lines 29 to 40: Save the results into Python variables for use later.
Lines 42 to 57: Check the options. Some are mandatory. Some have a default if not set.
Lines 58 to 63: Commented out as the parser handles these now.
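As a rough illustration of the argparse steps above, here is a minimal sketch. The option names (--hmc, --user, --password, --ssp, --save, --debug) and the hard-coded argument list are my assumptions for the example and are not copied from PCM_SSP.py.

```python
import argparse

# Option names here are illustrative assumptions, not the script's real ones.
def build_parser():
    parser = argparse.ArgumentParser(description="Collect SSP I/O stats from an HMC")
    parser.add_argument("--hmc", required=True, help="HMC hostname")
    parser.add_argument("--user", required=True, help="HMC user name")
    parser.add_argument("--password", required=True, help="HMC password")
    parser.add_argument("--ssp", default="ALL", help="SSP name (default: every SSP)")
    parser.add_argument("--save", action="store_true", help="save HMC responses to files")
    parser.add_argument("--debug", action="store_true", help="print extra detail")
    return parser

# Normally parse_args() reads sys.argv; a fixed list is used so the sketch runs anywhere.
args = build_parser().parse_args(["--hmc", "hmc1", "--user", "hscroot", "--password", "secret"])

# Save the results into plain variables for use later.
hmc = args.hmc
sspNeeded = args.ssp
save = args.save
print(hmc, sspNeeded, save)
```

Marking an option as required=True lets the parser report a missing mandatory option itself, which is why hand-written checks can be commented out.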
Login to the HMC
Lines 65 to 67: This switches off the warnings that the security certificates have not been checked - to simplify this code example.
Lines 76 to 78: These lines prepare the data for the REST API call to the HMC. An HTTP web request is to be sent to the HMC in 3 parts. It is like a webpage.
1. There is the URL that contains the machine hostname, port number and server file.
2. There is a header (a sort of preamble to the webpage). The logonheaders specify the content of the main body and the format of the data "application/vnd.ibm.powervm.web.xml" and that the request is for a login "type=LogonRequest".
3. The main part of the request, called here the logonPayload, which includes many mysterious parts and the username and password given by the user on the command line.
Lines 80 to 85: The actual request is sent here to the HMC to log on via the requests.put() function
As with a webpage, a return code of 200 means OK. If we have something else we report the information we have and exit from the Python program. We can't proceed if the logon fails.
Lines 87 to 90: The data returned is OK so we need to get a particular authorisation key (sometimes called a session code or token) out of the response. It is used in every subsequent interaction with the HMC.
The data (just a string of characters) is converted to XML format so ElementTree can be used to find the session key.
The authorisation key is saved in the sessionToken variable
Lines 91 to 92: If we have debugging switched on the sessionToken is printed to the screen
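The three parts of the logon request and the token extraction could be sketched roughly as below. This is not the script's code: the port 12443, the /rest/api/web/Logon path, the exact media type string, the payload layout, the X-API-Session tag and the sample response are all assumptions for illustration. The actual call would be requests.put(url, data=payload, headers=headers, verify=False), which is left out so that the sketch runs without an HMC.

```python
import xml.etree.ElementTree as ET

def build_logon_request(hmc, user, password):
    # Part 1: the URL (port and path are assumed values).
    url = "https://" + hmc + ":12443/rest/api/web/Logon"
    # Part 2: the header describing the body (media type spelling is assumed).
    headers = {"Content-Type": "application/vnd.ibm.powervm.web+xml; type=LogonRequest"}
    # Part 3: the payload carrying the user name and password (layout is assumed).
    payload = ('<LogonRequest schemaVersion="V1_0">'
               "<UserID>" + user + "</UserID>"
               "<Password>" + password + "</Password>"
               "</LogonRequest>")
    return url, headers, payload

def extract_session_token(response_text):
    # Convert the string to XML, then look for the session element.
    root = ET.fromstring(response_text)
    for element in root.iter():
        # Drop any "{namespace}" prefix ElementTree puts in front of the tag.
        if element.tag.split("}")[-1] == "X-API-Session":
            return element.text
    return None

url, headers, payload = build_logon_request("hmc1", "hscroot", "secret")

# Made-up response text standing in for what the HMC returns on a 200.
sample_response = ('<LogonResponse xmlns="http://example.invalid/hmc">'
                   "<X-API-Session>abc123</X-API-Session></LogonResponse>")
sessionToken = extract_session_token(sample_response)
print(url, sessionToken)
```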
Ask the HMC about the Shared Storage pools that it knows about
- Lines 108 to 110: Again we prepare the request header, URL and body.
- Note the header includes (like all subsequent HMC requests) the sessionToken to prove we logged in.
- Note the URL includes what we want from the HMC "rest/api/pcm/preferences/SSP". PCM is the Performance & Capacity Metrics interface. The preferences are a mixture of control flags and SSP details.
- Line 111: is the call to the HMC. Here we are requesting information so it's a GET operation using requests.get() and there is no body to send.
- Line 112: check the response is 200 = OK and if not output to aid problem determination:
- what the code sent to the HMC
- the actual return code
- the response (they often include text to explain what went wrong or which bit the HMC rejected).
- Lines 118 to 126: If the user requested that data returned from the HMC is saved, then the code will:
- Create a file name including the date and time in the directory from the command line
- Open the file for writing
- Write the response to the file and close it
- Lines 128 to 131: Again we convert the response, which is a large block of characters (with useful line feeds so you could read it), into XML format so Python can find the sections and subsections that hold the data we need later.
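The save-then-parse steps above might look something like this minimal sketch. The helper name save_response, the file name prefix and the tiny sample XML are hypothetical. A temporary directory stands in for the directory given on the command line.

```python
import os
import tempfile
import time
import xml.etree.ElementTree as ET

def save_response(directory, prefix, text):
    # Build a file name that includes the date and time.
    stamp = time.strftime("%Y-%m-%d_%H-%M-%S")
    filename = os.path.join(directory, prefix + "_" + stamp + ".xml")
    # Open the file for writing, write the response and close it.
    with open(filename, "w") as f:
        f.write(text)
    return filename

# Stand-in for the text returned by the HMC.
response_text = "<Preferences>\n  <Name>demo</Name>\n</Preferences>"

saved = save_response(tempfile.mkdtemp(), "SSP_preferences", response_text)

# Convert the block of characters into an XML tree we can search.
root = ET.fromstring(response_text)
print(saved, root.tag)
```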
Preparing to set the Preferences - only needs to be done once!
- Lines 133 to 188: Only get executed if the user requested on the command line (--perfs) that the Preferences get updated. The main point is to switch on two flags, which request that SSP I/O stats collection is switched on from the VIOS(s) and that the stats are saved on the HMC for the Python command to collect later.
- There are multiple transformations needed to prepare the data to set the Preferences.
- Line 143: save the get Preferences response to postBody
- Lines 146 to 150: replace these strings with nothing i.e. remove the strings. This is because they are confusing, pointless and in a mixed order. It is best to clean them out.
- Lines 153 to 156: These take the two flag settings of AggregationEnabled and MonitorEnabled and force them from "false" to "true".
- I am not sure whether you need to turn AggregationEnabled to true if you only want to use the ProcessedMetrics.
- Lines 158 to 165: The whole of the get Preferences can't be posted back to the HMC. We have to "cut off" the first and last few lines.
- The postBody.find() returns the offset of the first character of the string
- Then the line postBody[ i: ] removes all the characters up to that i position
- Then the line postBody[ :i] removes all the characters starting at that i position
- Lines 167 to 173: If requested by the user the newly formatted Preferences are saved to a file - this can aid problem determination if it later fails.
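The string transformations described above can be sketched as follows. The wrapper tag Prefs, the kb="ROR" attribute and the sample text are made up for the example. The flag tag spellings are assumed as well. Only the technique (replace, find, slice) follows the description.

```python
def prepare_post_body(get_response_text, first_tag, last_tag, unwanted):
    postBody = get_response_text
    # Replace the unwanted strings with nothing, i.e. remove them.
    for text in unwanted:
        postBody = postBody.replace(text, "")
    # Force the two flags from "false" to "true" (tag spellings are assumed).
    for flag in ("AggregationEnabled", "MonitorEnabled"):
        postBody = postBody.replace("<" + flag + ">false</" + flag + ">",
                                    "<" + flag + ">true</" + flag + ">")
    # Cut off everything before the first tag we want to keep.
    i = postBody.find(first_tag)
    postBody = postBody[i:]
    # Cut off everything after the closing tag we want to keep.
    i = postBody.find(last_tag) + len(last_tag)
    postBody = postBody[:i]
    return postBody

sample = ("<feed>\n<entry>\n"
          '<Prefs kb="ROR">'
          '<AggregationEnabled kb="ROR">false</AggregationEnabled>'
          '<MonitorEnabled kb="ROR">false</MonitorEnabled>'
          "</Prefs>\n</entry>\n</feed>")

postBody = prepare_post_body(sample, "<Prefs", "</Prefs>", [' kb="ROR"'])
print(postBody)
```

The order matters in this sketch: the attributes are removed first so that the plain "false" tags match the flag replacement afterwards.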
Post the formatted Preferences to the HMC to set the flags
- Lines 175 to 176: The request is prepared. The header notes that the Preferences are in XML and, of course, the sessionToken is needed.
- Line 177: This exchange with the HMC is sending data to the HMC that we want actioned so it's a POST.
- Lines 178 to 185: The normal check that the request worked OK and, if not, print the details to the screen.
- Lines 186 to 187: are printed if we are not going to set the Preferences.
Walk the Preferences XML to find the names and IDs of the SSPs this HMC knows about
- The Preferences XML is highly structured into sections and subsections
- <ManagementConsolePCMSSPPreferences> <- think of this as the whole HMC level
- <ManagementConsoleSSPPreferences> <- At this level we have a sequence of sections, one for each SSP
- At this level we have information on the SSP like its name, cluster and flags
- We also have SSP as in whole pool I/O stats
- Then further subsections
- Nodes = all the I/O for each VIOS
- Tiers - if not using tiers there is one called SYSTEM
- Failgrp = Failure Group (mirrors), if you have them
- hdisks - Note: you end up with every hdisk reported from each of the VIOS = lots
- Say the pool has 32 LUNs, mirrored = 64 LUNs, reported by 24 VIOS. That means 1536 hdisks, which is impossible to graph.
- But it might be useful in problem diagnostics.
- In our Python code we are only covering
- Whole Pool I/O Stats to measure how busy the Pool is during the day and
- VIOS stats to see which are the busy VIOS (and so machines) that might give us clues about load levelling by moving VMs around.
- So that is the Red and Brown levels
- The nested for statements look for the start of a section and then loop through multiple instances (if applicable)
- Line 208 starts the for loop going through each of the Shared Storage Pools known to this HMC.
- Line 209: detects the XML section within the SSP with a section tag named 'ClusterName' and the following line saves the contents (ssp.text) in the variable ClusterName.
- The subsequent lines save other interesting attributes from the SSP section.
- The important attribute for later is the SSPuId - this must be used to request the filenames of the SSP I/O data.
- Line 221: Checks that we actually found the details of an SSP - this is a sanity check that we understand the XML.
- Line 222: Prints what we have found on the screen to give feedback to the user about the SSPs as we run, and perhaps about new SSPs they are not aware of.
- Line 224: If we specified a particular SSP name it is checked here or if sspNeeded == ALL then we enter the code to request the data files.
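The nested for loops over the Preferences XML could be sketched along these lines. The tag names Pool, SSPName and SSPUuid and the sample document are made up for the example. The real HMC XML also carries namespaces that this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Made-up XML with one section per SSP; the real tags and layout differ.
sample = """<Top>
  <Pool><ClusterName>clusterA</ClusterName><SSPName>poolA</SSPName><SSPUuid>1111</SSPUuid></Pool>
  <Pool><ClusterName>clusterB</ClusterName><SSPName>poolB</SSPName><SSPUuid>2222</SSPUuid></Pool>
</Top>"""

def find_ssps(xml_text, sspNeeded="ALL"):
    root = ET.fromstring(xml_text)
    found = []
    # Outer loop: one pass for each Shared Storage Pool section.
    for pool in root.iter("Pool"):
        details = {}
        # Inner loop: pick out the interesting attributes of this SSP.
        for ssp in pool:
            if ssp.tag in ("ClusterName", "SSPName", "SSPUuid"):
                details[ssp.tag] = ssp.text
        # Sanity check that we actually found the SSP details.
        if "SSPUuid" not in details:
            continue
        # Keep every SSP, or only the one named by the user.
        if sspNeeded == "ALL" or details.get("SSPName") == sspNeeded:
            found.append(details)
    return found

for details in find_ssps(sample):
    print(details["ClusterName"], details["SSPName"], details["SSPUuid"])
```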
The actual data will be in one or a series of files so we need to request the filenames to be requested later
- Lines 232 to 233: Prepare the request data for a GET request (so no body). Once again here is the sessionToken plus a hint that we expect an atom.
- This "atom" is a data format where we get a list in XML of data files, which are in turn JSON files. This is a convention used in many websites (so I am told).
- The URL states what we want = SSP data + the SSP unique Id + the type of data: Raw, Processed or Aggregated.
- Line 235: Lets the user know what the code is attempting.
- Line 236: is the requests.get() to interact with the HMC.
- Line 237: As normal checks the return code and 200 = OK
- Line 238 details the failure.
- Lines 240 to 241: Error code 204 suggests that the user has not used the --perf option to switch on the SSP I/O stats monitoring so that is highlighted.
- Line 242: If debug output was set then the response is output on the screen - it can include some details to indicate the issue.
- Lines 244 to 245: if requesting filenames of SSP data failed for this SSP then we move on to the next - it can happen that some SSPs have data and some not or don't yet because we have not waited long enough after switching on stats.
- Lines 246 to 250: Save the resulting data to a file, if asked for on the command line with --save.
- Line 251: Here we start processing the file of filenames
- The format of the XML is
- <title> <- This is the file name, which the code uses if we save the JSON file to a file
- <link> <- This is the full long URL that is used to request the file
- Lines 261 to 263: If debug is on then it outputs the details the code extracted
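Processing the file of filenames might look like this minimal sketch. The Atom namespace is the standard one. The sample feed, the file names and the assumption that the URL sits in the href attribute of <link> are mine.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace, written the way ElementTree expects it.
ATOM = "{http://www.w3.org/2005/Atom}"

# Made-up feed: one <entry> per JSON data file.
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>SSP_demo_file1.json</title>
    <link type="application/json" href="https://hmc.example/demo/file1.json"/>
  </entry>
  <entry>
    <title>SSP_demo_file2.json</title>
    <link type="application/json" href="https://hmc.example/demo/file2.json"/>
  </entry>
</feed>"""

def list_data_files(feed_text):
    root = ET.fromstring(feed_text)
    files = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.find(ATOM + "title")  # the file name
        link = entry.find(ATOM + "link")    # holds the URL used to request the file
        if title is not None and link is not None:
            files.append((title.text, link.get("href")))
    return files

for filename, url in list_data_files(sample_feed):
    print(filename, url)
```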
Next request to the HMC the actual data files using the URL / filenames and extract the basic SSP details from them
- Finally, we get the data.
- Lines 270 to 271: This is a simple requests.get() and we extracted the URL from the file of filenames. The header uses the sessionToken and the URL asks for the data the HMC told us was ready.
- Lines 272 to 276: Are the usual checking of the return code and exiting if it's wrong - as the URL was generated by the HMC "What could possibly go wrong!"
- Lines 277 to 281: Save the data to a file if the user used the --save command line option. This can be very helpful, as without a file to look at it's impossible to know the file structure and how to extract the SSP data.
- Note: JSON, for some crazy reason IMHO, has no linefeed characters. It is all on the first line - I have seen this crash editors as the files can be huge, like 100 MB. Admittedly, the SSP data files are small.
- To reformat to a readable state use: python -m json.tool <JSONfile >readablefile.
- Note however, that this adds linefeeds and spaces to indent the data for reading and can make the file 3 to 5 times larger.
- Line 284: Convert the data returned into JSON format for Python to examine, using the json library function json.loads().
- For line 286: The code can examine ProcessedMetrics files and AggregatedMetrics files - they have the same format but the Aggregated files also include min and max data in each sample - which is completely ignored in this code.
- JSON data might, on a quick look, seem like XML. Sure, it is structured, but very differently, and it includes arrays of items inside square brackets "[" and "]".
- Lines 287 to 292: This gets the heading like information about the data file content
- How do you read: sspName = data["sspUtil"]["utilInfo"]["name"] ?
- It reads: look in the JSON data for a section named sspUtil and in that section
- look for a sub section named utilInfo and within that
- look for a subsection called name and extract the contents (i.e. the data)
- returning it and put the data in the variable called sspName
- Lines 294 to 297: Print the results to the screen to inform the user what data the code found.
- The code saves the data in one of two formats, depending on whether the --googlechart command line option is used.
- Comma separated values (CSV). No --googlechart command line option
- Lines 300 to 303: create the file either CSV or Google ready and then open this file. This makes it clearer what format is in the file than using a single filename.
- Lines 305 onward: These cover the case where the user has also requested the VIOS level I/O stats with the --vios option
- Lines 306 to 310: Decide the filename to append the VIOS stats into, depending on the command line option, and open the file for appending to the end (mode = "a").
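To make the JSON handling above concrete, here is a minimal sketch. The section names sspUtil, utilInfo, name and utilSamples come from the walkthrough. The values and the one-line sample are made up. The json.dumps() call with indent is the in-Python equivalent of the json.tool reformatting.

```python
import json

# Made-up one-line JSON text, standing in for a data file from the HMC.
raw = '{"sspUtil": {"utilInfo": {"name": "poolA"}, "utilSamples": [{"id": 1}, {"id": 2}]}}'

# Convert the text into Python dictionaries and lists.
data = json.loads(raw)

# Section sspUtil, then sub-section utilInfo, then the item called name.
sspName = data["sspUtil"]["utilInfo"]["name"]

# The square brackets in the JSON become a Python list we can count or loop over.
samples = data["sspUtil"]["utilSamples"]

# Re-indent for human reading; this adds linefeeds and spaces so the text grows.
readable = json.dumps(data, indent=4)

print(sspName, len(samples))
print(readable)
```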
Extract the SSP level data in each SSP I/O stats samples and writes them in two possible formats
- Lines 312 to 314: declares and sets a few variables used below.
- Line 315: This statement finds the first of an array of stats called utilSamples in the sspUtil section. The for will then loop round each of the array items, with the variable sample referring to each new set of stats in turn.
- Line 317: Increments samplecount so we can inform the user at the end of the number of samples found. Normally, this is 240 (one every 30 seconds for 2 hours) but for the first two hours after switching on the SSP stats collection the number available slowly builds up.
- Lines 318 to 326: Extract from the JSON file and the current sample the data we want to graph. There are other stats that are less interesting but might be used for problem diagnostics, like I/O error counts and timeouts.
- Lines 328 to 329: If the user used the --show command line option, output briefly the stats found.
- Lines 330 to 335: write to a buffer the two different formats (CSV and Googlechart ready) and then write the buffer to the already open file in append mode.
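The sample loop and the two output formats could be sketched roughly as below. The field names timeStamp, readBytes and writeBytes, the sample values and the exact shape of the Google chart row are assumptions for illustration. The script's real fields and formats may differ. The script appends each buffer to the already open file, whereas this sketch collects the lines in a list.

```python
import json

# Made-up data with two samples; field names inside each sample are assumed.
data = json.loads("""
{"sspUtil": {"utilSamples": [
    {"timeStamp": "2024-01-01T00:00:00", "readBytes": 2048, "writeBytes": 1024},
    {"timeStamp": "2024-01-01T00:00:30", "readBytes": 4096, "writeBytes": 512}
]}}
""")

def format_samples(data, googlechart=False):
    samplecount = 0
    lines = []
    for sample in data["sspUtil"]["utilSamples"]:
        samplecount += 1
        stamp = sample["timeStamp"]
        read_kb = sample["readBytes"] / 1024.0
        write_kb = sample["writeBytes"] / 1024.0
        if googlechart:
            # Assumed shape of a Google chart data row.
            buffer = ",['%s', %.1f, %.1f]\n" % (stamp, read_kb, write_kb)
        else:
            # Plain comma separated values.
            buffer = "%s,%.1f,%.1f\n" % (stamp, read_kb, write_kb)
        lines.append(buffer)
    return samplecount, lines

samplecount, lines = format_samples(data)
print(samplecount, lines[0], end="")
```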
If requested by the user (--vios) the code goes one level deeper and collects the individual VIOS KB/s
- Lines 337 to 408: This whole section of code and majority of the next code section deals with the VIOS I/O stats.
- Lines 338 to 339: Inform the user we are handling the VIOS(s)
- Lines 340 to 343: Initialise a variable and three arrays.
- Lines 344 to 365: Is a for loop looking through this sample and its VIOS(s). Here they are called nodes and are found in a sub-section of each sample, i.e. a single sample at a particular time will have one node subsection for each of the VIOS(s) in the SSP.
- Line 346: Increment a count of the VIOS(s) which we tell the user later.
- Line 348: Extract the hostname of the VIOS "name" and add to the nname array using the nname.append() function.
- Lines 350 to 352: Extract other data like the poolState (UP or DOWN), as reported by the cluster -status -verbose command on the VIOS, and the read and write bytes.
- Note: as there can be loads of VIOS(s) we focus on only one stat to highlight which are the busy VIOS(s) in the SSP, i.e. the KB/s.
- Lines 353 to 364: Are commented out and were used for code development - some of these are interesting but will NEVER change, like the Machine Type, Model and Serial number (mtms) of the machine the VIOS is on. It is rather pointless giving us this data for every VIOS in every sample.
- Lines 366 to 382: The number and names of the VIOS for a particular SSP can change over time and we need to save their order if we want to graph the stats.
- These lines of code for loop around the node names (nname array) and generate the graph heading lines for each column of data and for read and write stats.
- The text is written to the buffer and appended to the file on line 382 in one go.
- This is written every time the Python code is run; the SSP_googliser script removes duplicate header lines later.
- Lines 384 to 402 (below): Format the date and time (depending on the CSV or Googlechart style) and then add the read and write stats to the buffer before appending it to the output file.
- Note the write stats are made a negative number - this will give us the iceberg graph with reads above the line and writes below the line so we can compare the read:write ratio visually.
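The VIOS header line and the negated write stats can be sketched as follows. The array name nname comes from the walkthrough. The column naming, the nread and nwrite arrays and the CSV layout are my assumptions.

```python
def vios_header(nname):
    # One read column and one write column per VIOS, in the saved order.
    columns = ["time"]
    for name in nname:
        columns.append(name + "-read")
        columns.append(name + "-write")
    return ",".join(columns) + "\n"

def vios_row(stamp, nread, nwrite):
    fields = [stamp]
    for read_kb, write_kb in zip(nread, nwrite):
        fields.append("%.1f" % read_kb)
        # Writes are made negative so they plot below the line.
        fields.append("%.1f" % -write_kb)
    return ",".join(fields) + "\n"

nname = ["vios1", "vios2"]
buffer = vios_header(nname)
buffer += vios_row("2024-01-01T00:00:00", [10.0, 5.0], [4.0, 2.5])
print(buffer, end="")
```

Keeping the names in one array and the stats in parallel arrays means the columns stay in the same order for the header and every data row.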
Save the VIOS level data, then a few notes and closing the data files as we are finished with them
- Lines 403 to 407: Are a comment that if you want to add tier, failgrp (mirror) or even disk stats this is the place to add them, but consider that the data volume will increase and may not be suitable for graphing.
- Lines 409 to 411 close the output file or files if we opened the VIOS data file.
- Lines 412 and 413: Output to the user a summary of the data found and useful feed back.
The final section covers the other data format = RawMetrics which are very different.
- Lines 416 to 449 cover the RawMetrics extraction and saving to a file in comma separated value format.
- The data structure is rather different to the ProcessedMetrics but the numbers are roughly the same.
- It is included here so that if you decide to use RawMetrics you have a simple worked example to follow and you can see examples of the actual data.
- It includes the different ways it reports errors - none of this is documented and I can only show workarounds for the errors I stumbled across (there may be others).
- Line 451: this is output if you selected just one SSP for which you wanted SSP I/O stats and it found another SSP
- Line 452 is the final line of the Python code and lets you know it exited normally.