How to monitor for issues with dump space on AIX V5.3
The /usr/lib/ras/dumpcheck command can be used to confirm that AIX system dump space is large enough to hold a stand-alone dump if one were taken at the time dumpcheck is run. If dumpcheck finds issues with dump space, it adds an entry to the AIX error log. Since the dump space requirement tends to grow as the system gets busier, configure dumpcheck to run regularly at a time when the system is likely to be fairly heavily loaded. Use crontab -l to confirm that root's crontab is configured to run dumpcheck regularly at an appropriate time and, if not, use crontab -e to update root's crontab.
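The crontab check described above can be sketched as a small shell function. This is only an illustration: the function name has_dumpcheck_entry and the schedule in the example line (15:00 every day) are assumptions and should be adapted to a time when the system is busy.

```shell
# Return 0 if the crontab text on stdin contains an active (non-comment)
# entry that runs /usr/lib/ras/dumpcheck, and 1 otherwise.
has_dumpcheck_entry() {
    awk '!/^[[:space:]]*#/ && /\/usr\/lib\/ras\/dumpcheck/ { found = 1 }
         END { if (found) exit 0; exit 1 }'
}

# Example crontab line. The schedule is an assumption for illustration:
# run dumpcheck at 15:00 every day and discard its output.
DUMPCHECK_CRON_LINE='0 15 * * * /usr/lib/ras/dumpcheck >/dev/null 2>&1'
```

As root, something like `crontab -l | has_dumpcheck_entry || echo "dumpcheck is not scheduled"` would then report a missing entry, which can be added with crontab -e.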
If dumpcheck logs an error, keep in mind that dumpcheck does not necessarily run at a time when the dump space requirement is at its peak, so allocate more space than dumpcheck reports as required. An uplift of at least 20% is recommended, and more if there is plenty of space available in rootvg.
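The uplift arithmetic is simple enough to sketch. The function below is hypothetical (the name dump_space_with_uplift is not from the sample script) and assumes the required size has already been converted to megabytes; it rounds the result up to a whole megabyte.

```shell
# Print the dump space to allocate, in MB.
#   $1 = required size in MB (assumed unit for this example)
#   $2 = uplift percentage (optional, defaults to the recommended 20)
dump_space_with_uplift() {
    required_mb=$1
    uplift_pct=${2:-20}
    # Integer arithmetic; adding 99 before dividing rounds up.
    echo $(( (required_mb * (100 + uplift_pct) + 99) / 100 ))
}
```

For example, `dump_space_with_uplift 1000` prints 1200, and `dump_space_with_uplift 512 50` prints 768 for a more generous margin.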
The AIX error notification facility can be used to monitor for errors logged by dumpcheck.
A sample Korn shell script is available which configures an AIX error notification exit so that a note is sent to specified email addresses when dumpcheck adds an entry to the AIX error log. The email addresses are specified in a file with a suffix of .emailaddrs, residing in the same directory and with the same base name as the shell script. (That is, if the shell script is named dumperr, then the address file must be named dumperr.emailaddrs.) It is possible to modify the sample shell script to take other actions instead of (or in addition to) sending a note.
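For readers unfamiliar with the error notification facility, a notification exit is an object in the errnotify ODM class. The sketch below prints a minimal stanza of that kind; it is not necessarily what the sample script generates. The function name errnotify_stanza is hypothetical, and the error label DMPCHK_TOOSMALL used in the usage note is an assumption that should be checked against the entries dumpcheck actually writes (for example with errpt).

```shell
# Print an errnotify stanza to stdout.
#   $1 = notification object name (hypothetical, e.g. dumperr)
#   $2 = error label to match (assumed here, verify with errpt)
#   $3 = full path of the notify method (the script to run)
errnotify_stanza() {
    # \$1 is written literally: the error daemon replaces it with the
    # error log sequence number when it invokes the method.
    cat <<EOF
errnotify:
        en_name = "$1"
        en_persistenceflg = 1
        en_label = "$2"
        en_method = "$3 \$1"
EOF
}
```

A stanza produced with `errnotify_stanza dumperr DMPCHK_TOOSMALL /usr/local/errnotify/dumperr > /tmp/dumperr.add` could then be loaded with `odmadd /tmp/dumperr.add`. Setting en_persistenceflg to 1 keeps the object across system restarts.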
The shell script accesses a configuration file (dumperr.emailaddrs) in the directory in which it resides, so it is best to put the script in a directory dedicated to error notification (e.g., /usr/local/errnotify) rather than a directory such as /usr/local/bin.
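The naming convention for the address file can be illustrated as follows. These helpers are a sketch, not an excerpt from the sample script: the names addr_file_for and read_addrs are hypothetical, and skipping blank lines and lines starting with # is an assumption about the file format.

```shell
# Print the address file path for a given script path:
# same directory, same base name, with a .emailaddrs suffix.
addr_file_for() {
    echo "$(dirname "$1")/$(basename "$1").emailaddrs"
}

# Print the addresses found in file $1, one per line.
# Assumption: blank lines and lines whose first field starts with '#'
# are ignored, and each remaining line holds one address.
read_addrs() {
    [ -r "$1" ] || return 1
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}
```

Inside a script, `addr_file_for "$0"` would locate the address file next to the script itself, which is why the script and its configuration file are best kept together in a dedicated directory.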
When invoked with no parameters, the shell script produces help text documenting the flags it supports:
 | Note
The "Mirroring the root volume group" article says, "Mirrored dump devices are supported in AIX® 4.3.3 or later." Mirroring a dump device is therefore supported, but there is a reason to avoid it. If access to an hdisk is temporarily lost, the hdisk ends up in the "missing" state:
Once an hdisk is in the "missing" state, an attempt to sync the volume group containing the hdisk will fail even if access to the hdisk is restored:
The only way to get the hdisk back to the active state is to issue the varyonvg command (even though the volume group is already varied on):
But, as shown above, the varyonvg fails if there is an active dump device on a missing physical volume.
So when AIX rootvg is mirrored, it is better to allocate two unmirrored dump devices (one defined as primary and one as secondary) rather than a single mirrored dump device. That way, the dump device on the missing physical volume can be deconfigured while still allowing a dump to be taken to the other dump device should a system crash occur.
But there is an issue with that approach as well. The /usr/lib/ras/dumpcheck command confirms only that the larger of the two dump devices is large enough. If the smaller dump device is too small, access is lost to the hdisk on which the larger dump device resides, and AIX then crashes, the system dump will fail and the diagnostic information required to determine the cause of the crash will not be collected.
So when two dump devices are allocated, care must be taken to make sure the primary and secondary dump devices are the same size. |
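One way to check the equal-size requirement is to compare logical partition counts in the output of lsvg -l rootvg. The sketch below is hypothetical (the function name sysdump_lvs_same_size is not from the sample script) and assumes that both dump devices are logical volumes of type sysdump in the same volume group, so equal LP counts imply equal sizes. It looks at every sysdump logical volume in the listing, not only the ones currently configured with sysdumpdev.

```shell
# Read `lsvg -l <vg>` output on stdin. Succeed only if there are at least
# two logical volumes of type sysdump and all have the same LP count.
# Columns assumed: LV NAME, TYPE, LPs, PPs, PVs, LV STATE, MOUNT POINT.
sysdump_lvs_same_size() {
    awk '$2 == "sysdump" {
             n++
             if (n == 1) lps = $3
             else if ($3 != lps) differ = 1
         }
         END { if (n >= 2 && !differ) exit 0; exit 1 }'
}
```

It could be run as `lsvg -l rootvg | sysdump_lvs_same_size || echo "dump devices differ in size"`.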
 | Testing modifications
If the shell script is modified, it is prudent to test the script before putting it into production. The shell script can be tested by using the AIX sysdumpdev command to allocate an AIX dump space which is too small and then running /usr/lib/ras/dumpcheck. When dumpcheck adds an entry to the AIX error log, the script should send a note to the email addresses it finds in the dumperr.emailaddrs file. If no note is received, tracing can be enabled in the shell script by modifying two lines within it:
|
The contents of this web page solely reflect the personal views of the authors and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management.