IBM Support

Some Basic TO2 (Collection Support) Diagnostics

Technical Blog Post



Sometimes when you experience a problem that is related to collections (TO2), or file system use of collections, the dump that you obtain or the error messages that you receive contain limited information.  In these situations, you open a PMR with the dump and/or error messages, and after we look at it in the lab, we usually come back to you with a request to enter some diagnostic commands.

While the commands we ask for might vary depending on the situation, you might save us some time if you enter them on your own and include the responses in the initial problem statement in the PMR.  So here is a general description of the kind of diagnostic commands you should try to enter the next time you get either a dump or an error message about collection support.  This includes file system panic dumps about collections (where a collection PID is included in the dump).

(1) For starters, if you are aware of any PID associated with the problem, see if it matches any of the system-named collections that belong to the collection data store.  The data store name is one of the fields within the PID itself.  For file system collections, the data store name is usually IFSXBSS (or IFSX followed immediately by the name of your subsystem, if you use multiple subsystems).
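
To illustrate where the data store name sits, here is a minimal Python sketch that you could run on a workstation (not on z/TPF).  It assumes, based only on the sample PID shown later in this post, that the third and fourth words of the PID hold the name as 8 EBCDIC characters padded with blanks.  The function name and the choice of the cp037 code page are my own assumptions, not part of any z/TPF interface.

```python
def datastore_name_from_pid(pid_hex: str) -> str:
    """Return the data store name embedded in a PID given as hex text."""
    # Remove the blanks and line breaks that appear in a console display.
    raw = bytes.fromhex("".join(pid_hex.split()))
    # Assumption: bytes 8-15 (words 3 and 4) hold the name in EBCDIC.
    return raw[8:16].decode("cp037").rstrip()


# The sample PID from item (7) of this post.
sample_pid = """
0302FC16 CD2FFB7D C9C6E2E7 C2E2E240
00000000 181324E4 00000000 181324E5
"""
print(datastore_name_from_pid(sample_pid))  # IFSXBSS
```

If the name that comes back is not the data store you expected, that is a hint to change the DS- value on the ZBROW QUAL SET command below.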


So try entering these commands:

ZBROW QUAL SET DS-IFSXBSS
>>>>> This command will make all other ZBROW NAME and ZBROW COLLECTION commands operate against the IFSXBSS data store.
Change the data store name if you are dealing with a different data store.

ZBROW NAME DISP ALL
>>>>> This command lists all of the named collections for that data store.  Check whether any of them match the PID associated with the error.  If none of the named collections has a matching PID, enter the following command to assign a name to the PID associated with the error (I use BADPID but you can choose almost any name you would like):

(2) ZBROW NAME DEFINE BADPID PID-pid                                                                
>>>>> Where pid is the PID associated with the error.

(3) Then for either the matching name you found in (1), or for any name you had to define for the PID on your own, please enter these commands and save the output:

ZBROW COL DISPLAY BADPID
ZBROW COL VALIDATE BADPID
ZBROW COL DISPLAY BADPID ATTR

>>>> This will give back a lengthy display that will require you to enter several ZPAGE commands.

Once you are done entering these commands for a PID whose name you assigned, you can remove the name assignment as part of "cleanup" by entering these commands:

ZBROW QUAL SET DS-datastorename

ZBROW NAME DELETE BADPID

NOTE: In my example I used BADPID, but if it is a system collection from ZBROW NAME DISP ALL in (1) that has the problem, you would use that name instead.  For example, if you had a problem with DS_SYSTEM_DICT, you would enter:

ZBROW COL DISPLAY DS_SYSTEM_DICT
ZBROW COL VALIDATE DS_SYSTEM_DICT
ZBROW COL DISPLAY DS_SYSTEM_DICT ATTR


If you have more than one PID associated with an error, please enter these 3 commands for each PID.
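
If you have several PIDs to work through, it can help to write out the command list ahead of time.  The following Python sketch just assembles the commands from steps (1) through (3) as text for you to enter by hand; it does not talk to z/TPF.  The function name is hypothetical, and because this post does not show exactly how the PID value is typed on the PID- parameter, the sketch passes through whatever string you give it.

```python
from typing import List


def collection_diagnostics(datastore: str, name: str, pid: str = "") -> List[str]:
    """Build the ZBROW command sequence from steps (1) to (3) for one collection."""
    commands = [f"ZBROW QUAL SET DS-{datastore}", "ZBROW NAME DISP ALL"]
    if pid:
        # Only needed when no existing named collection matches the PID.
        commands.append(f"ZBROW NAME DEFINE {name} PID-{pid}")
    commands += [
        f"ZBROW COL DISPLAY {name}",
        f"ZBROW COL VALIDATE {name}",
        f"ZBROW COL DISPLAY {name} ATTR",
    ]
    if pid:
        # Cleanup: remove the temporary name after the output is saved.
        commands.append(f"ZBROW NAME DELETE {name}")
    return commands


# A system-named collection needs no DEFINE or DELETE.
for command in collection_diagnostics("IFSXBSS", "DS_SYSTEM_DICT"):
    print(command)
```

For a PID with no matching name, call it as collection_diagnostics("IFSXBSS", "BADPID", pid="pid"), substituting the real PID for the placeholder.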

(4) For any file address associated with the original error, as well as for any file address associated with an error reported from the ZBROW COL VALIDATE command in (3), please enter these commands:

ZBROW DISP FA fileaddress
ZDFIL fileaddress 0.FFF
ZDFAI fileaddress


(5) If you are dealing with a problem for one of the collections in the IFSXBSS (file system) data store, here are some helpful hints:

If you are dealing with an un-named BLOB (BYTEARRAY) collection (meaning it does not match any system name from ZBROW NAME DISP ALL and you had to assign a name to it), chances are that the collection represents a file in the file system.

If you are dealing with an un-named DICTIONARY collection, chances are that the collection represents a directory or sub-directory in the file system.

(6) If you are dealing with a file system collection and you know the #INODE record number, you might be able to determine exactly which directory or file in the file system is represented (we also say "backed") by that collection.

Try entering this command and routing it to a file in the file system. (In my example, I sent it to /tmp/listAllFiles.txt but you can choose any name you like):

zfile ls -alRi / > /tmp/listAllFiles.txt

Then FTP the file to another platform, such as Linux or your workstation, where you can view it.

The #INODE numbers of every subdirectory and file in the file system located by the zfile ls command will be in the output file.  The only trick is that these numbers will be in decimal, whereas if you got the #INODE number from ZDFAI, it will have been in hexadecimal.  So you need to convert from hex to decimal to locate the correct directory or file.
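
Once the file is on your workstation, the hex-to-decimal lookup can be sketched in Python as follows.  This is only a sketch: it assumes the output file follows the usual ls -R layout (a "/some/dir:" header line before each directory, then one entry per line with the inode number first), and the sample listing inside the code is made up for illustration.  The function name is my own.

```python
def find_inode(listing_text: str, inode_hex: str) -> list:
    """Return (directory, entry line) pairs whose inode matches a hex #INODE number."""
    wanted = str(int(inode_hex, 16))  # ls prints the inode number in decimal
    matches = []
    current_dir = ""
    for line in listing_text.splitlines():
        fields = line.split()
        if line.startswith("/") and line.endswith(":"):
            # Assumed "ls -R" layout: a header line names each directory.
            current_dir = line[:-1]
        elif fields and fields[0] == wanted:
            matches.append((current_dir, line.strip()))
    return matches


# Made-up listing for illustration only.
sample_listing = """/tmp:
total 8
  26 drwxrwxrwx  2 root root   0 May 21 16:11 .
   2 drwxr-xr-x  9 root root   0 May 21 16:11 ..
 431 -rw-r--r--  1 root root 512 May 21 16:11 listAllFiles.txt
"""
for directory, entry in find_inode(sample_listing, "1AF"):  # hex 1AF = decimal 431
    print(directory, "->", entry)
```

To use it on the real output, read the file you transferred into a string (for example with open("listAllFiles.txt").read()) and pass it in place of the sample listing.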

If you think you found the file or (sub) directory that is represented by the collection, you can double check it by displaying the #INODE record:

ZDREC LINODE.nnn 0.FFF   
>>> Where nnn is the record number in HEX (not decimal!!).  If you found the right #INODE, the PID will be either at displacement hex 20 (decimal 32) into the record, if it is the primary collection that backs the file or (sub) directory, or at displacement hex 110 (decimal 272), if it is the backup collection that backs a (sub) directory. (Only directories and sub-directories can have backup collections.)
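
If you have copied the record contents off the system as raw bytes, the displacement check can be sketched like this in Python.  How you get from the ZDREC display to raw bytes is left to you, since the display layout is not covered in this post; the function and constant names are hypothetical.

```python
PRIMARY_OFFSET = 0x20   # decimal 32: PID of the primary collection
BACKUP_OFFSET = 0x110   # decimal 272: PID of the backup collection (directories only)


def pid_role_in_inode(record: bytes, pid: bytes):
    """Report whether pid sits at the primary or backup displacement of an #INODE record."""
    if not pid:
        raise ValueError("pid must not be empty")
    size = len(pid)
    if record[PRIMARY_OFFSET:PRIMARY_OFFSET + size] == pid:
        return "primary"
    if record[BACKUP_OFFSET:BACKUP_OFFSET + size] == pid:
        return "backup"
    return None  # this #INODE record does not reference the PID at either spot


# Illustration with a dummy record: place the sample PID at displacement hex 20.
pid = bytes.fromhex(
    "0302FC16CD2FFB7DC9C6E2E7C2E2E240"
    "00000000181324E400000000181324E5"
)
record = bytearray(0x1000)
record[PRIMARY_OFFSET:PRIMARY_OFFSET + len(pid)] = pid
print(pid_role_in_inode(bytes(record), pid))  # primary
```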

(7) One last trick that might come in handy.  When you are analyzing an error for a given PID, the second word of that PID contains the timestamp of when the PID was created.  Knowing when the PID was created will not tell you when it was last modified or when it became corrupt, but it can help you know how long the collection has been around and how far back in your console logs to look for the first TO2 dump or error message that could be associated with that PID.

For example, if the PID is:


0302FC16 CD2FFB7D C9C6E2E7 C2E2E240
00000000 181324E4 00000000 181324E5


The timestamp is CD2FFB7D.  To find out when that was, enter this command on your system:

zudfm tod CD2FFB7D
UDFM0501I          TOD CLOCK VALUE DISPLAY
  CONVERSION OF TOD-CLOCK VALUE CD2FFB7D00000000 :
  ------------------------------------------------

  DATE:  WED, 21 MAY 14

  TIME:  16.11.02
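
If you are looking at a PID away from the system, the same conversion can be approximated offline.  The Python sketch below treats the word as the high-order 32 bits of a standard TOD clock value (epoch 1 January 1900, with bit 51 ticking once per microsecond) and ignores leap seconds; the function name is my own.  For the sample word it gives a time 5 hours later than the ZUDFM display above, which I assume reflects a local time adjustment in that display, so treat the offline result as approximate and confirm with zudfm when you can.

```python
from datetime import datetime, timedelta

TOD_EPOCH = datetime(1900, 1, 1)  # a TOD clock value of zero


def pid_word_to_datetime(word_hex: str) -> datetime:
    """Convert the high-order word of a TOD clock value to a date and time."""
    high_word = int(word_hex, 16)
    # Bit 51 of the 64-bit TOD clock ticks once per microsecond, so the
    # high-order 32 bits count units of 2**20 microseconds (about 1.05 seconds).
    return TOD_EPOCH + timedelta(microseconds=high_word << 20)


print(pid_word_to_datetime("CD2FFB7D"))  # 2014-05-21 21:11:02.883328
```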


As I said at the beginning of this post, this may not be an all-inclusive list of problem diagnostics you can run whenever you have collections problems.  However, if you try your best to use these commands to collect information and send us the results when you first report the problem to us, you might end up saving us all a bit of time, and hopefully we can help you find a resolution to your problem more quickly.

I hope you find this post helpful.  If you have any questions about it, feel free to just post a comment and I will answer you.


UID

ibm16213658