Troubleshooting
Problem
Many customers are concerned about how long it takes to copy a Main Storage Dump (MSD). This document provides ways to help manage and shorten the MSD copy time.
Symptom
Copying the MSD takes too much time.
Environment
IBM i
Resolving The Problem
It is difficult to predict the amount of time that it will take to copy off a Main Storage Dump (MSD); however, this document contains some information on possible ways to help manage and reduce the time required.
MSD Enhancements/PTFs
There have been changes made to MSD processing to improve usability and reduce the copy time.
The following PTFs are recommended and include the latest smart dump enhancements/fixes:
o V6R1M1 MF59433
o V7R1M0 MF61968
o V7R2M0 MF59427
o V7R3M0 No PTF needed; included in base OS
Note that it is important to keep current on PTFs in general to ensure you benefit from any recent IPL/recovery time improvements.
Monitoring progress during a MSD
With IBM i 7.2 or with the PTFs applied, the IPL status screen will be the first screen that will be displayed after a MSD is underway:

You can monitor the progress of the following MSD IPL steps on this screen or by using System Reference Codes (SRCs) displayed on the HMC for this system/LPAR.
The most significant initial steps/SRCs that may take a few minutes to complete are as follows:
C6xx4205 - Synchronization of mirrored data
For mirrored disk configurations, you can monitor the progress of this step by observing the xx in the SRC C6xx4205 which will change to indicate the percent completed of the mirroring resynchronization. If the MSD is interrupted during this step, the mirroring resynchronization will be restarted from the beginning on the subsequent IPL. To avoid the time required for this step, consider alternatives to disk mirroring or see the “Delay Mirror synchronization” topic under the Advanced Options section below.
C600424x - Reclaim main storage
During these steps, changed pages in memory are written out to disk. This helps avoid damaged objects and reduces the recovery time during the subsequent abnormal IPL. Therefore, it is very important to allow the few minutes for this to be performed and not interrupt the MSD while these steps/SRCs are in progress.
C6004250 - Storage management subset directory recovery
The operating system’s storage management directories need to be validated and recovered before the MSD can be copied to disk. This step should typically take only a few minutes, but may take longer under some conditions. An enhancement is included in IBM i 7.2 to reduce the time for rare cases of exceptionally large SM directories. (Note: If SRC C6004260 is ever displayed, it means a Full directory recovery is being performed which requires significantly more time to complete.)
With IBM i 7.2 or with the above PTFs applied, the MSD summary screen will be the next screen to be displayed after the steps above have been completed, along with a “Main Storage Dump IPL complete” message:

Note: Prior to the changes in 7.2 or without the PTFs, when this screen is first displayed, the IPL complete message may not appear until later or until a key is pressed.
Assuming the system is configured to automatically copy the dump (which is the default setting), the copying of the MSD to disk will have started and one of the following SRCs will be displayed on the HMC indicating what type of dump is being copied:
C6xx1404 - Copying a compressed Full dump
C6xx2404 - Copying an uncompressed Full dump
C6xx3404 - Copying an uncompressed Subset dump
C6xx4404 - Copying a compressed Subset dump
Note: On 6.1 without the 6.1 MSD PTFs, SRC C6xx4404 is always displayed, and there is no way to distinguish what type of dump is being copied.
You can monitor the progress of this step by observing the xx in the SRC C6xx#404 (where # is the dump-type digit 1 through 4 listed above), which changes to indicate the approximate percentage of the dump that has been copied to disk.
Important Note: If the MSD processing is cancelled before or during the copy, the MSD will be lost and little to no data will be available to help determine the problem or identify a possible fix to prevent a reoccurrence.
Knowing the type of dump being copied can help determine whether the copy will complete relatively quickly or may take an extended amount of time (especially for large main storage configurations). A Subset dump (also called a “smart dump”) is typically less than 10% the size of a Full dump and is therefore much faster to copy than a Full dump. With the latest MSD PTFs, the system is optimized to attempt a subset dump in most cases and to avoid the longer time required for a full dump. To further customize the default system behavior, see the topics under the Advanced Options section below.
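As an illustration only, the SRCs described in this section can be summarized in a small lookup. The following Python sketch is not an IBM tool; the function name describe_msd_src and the returned dictionary layout are hypothetical. It returns the xx progress field exactly as displayed, because this document does not state how those two characters are encoded.

```python
import re

# Last four characters of the copy SRCs listed above -> type of dump being copied.
# Note: on 6.1 without the MSD PTFs, C6xx4404 is shown for every dump type.
COPY_TYPES = {
    "1404": "compressed Full dump",
    "2404": "uncompressed Full dump",
    "3404": "uncompressed Subset dump",
    "4404": "compressed Subset dump",
}


def describe_msd_src(src: str) -> dict:
    """Map an MSD IPL SRC to its step name and raw xx progress field."""
    code = src.strip().upper()
    if not re.fullmatch(r"C6[0-9A-F]{6}", code):
        raise ValueError(f"not a C6xxxxxx SRC: {src!r}")
    progress, tail = code[2:4], code[4:]
    if tail in COPY_TYPES:
        return {"step": "Copying a " + COPY_TYPES[tail], "progress": progress}
    if tail == "4205":
        return {"step": "Synchronization of mirrored data", "progress": progress}
    if code.startswith("C600424"):
        return {"step": "Reclaim main storage", "progress": None}
    if code == "C6004250":
        return {"step": "Storage management subset directory recovery", "progress": None}
    if code == "C6004260":
        return {"step": "Storage management full directory recovery", "progress": None}
    return {"step": "unknown", "progress": None}


if __name__ == "__main__":
    print(describe_msd_src("C6371404"))
```

The progress value is kept as a string so that the reader compares it directly with what the HMC shows rather than relying on an assumed numeric conversion.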
While the copy is in progress, you can also use the MSD screens to monitor the approximate percentage of the dump that has been copied to disk. To see this information, press Enter on the MSD summary screen ("Main Storage Dump Occurred" screen), select "Work with current main storage dump (MSD)" on the MSD Manager screen, and then press F11=Copy status:


Message you will receive when F10 is pressed and an IPL is already complete:

(If the IPL has not completed yet, pressing F10 takes you back to the IPL Status screen.)
Message you receive when F11 is pressed and a copy has not been started:

Copy Status screen when F11 is pressed and a copy is underway:

If you back up to the MSD summary screen and press F3 or F12, this is what you will view:

A second F3 starts a new IPL, which is quick. All you will see is the MSD summary screen and the message 'IPL in progress. Please wait.' on line 21.
Then the console goes blank.
MSD Options Screen at R720:

MSD Options Screen at R730:

Additional information is available in IBM Documentation: https://www.ibm.com/docs/en/power10/9028-21B?topic=dumps-copying-main-storage-dump
Dump copy options:
If the auto copy option is enabled when a main storage dump occurs, the current MSD will automatically be copied to auxiliary storage, followed by an automatic reIPL if the control panel is not set to Manual mode.
The force compression option is used when copying the dump to the system ASP. The default is 0, which lets the system decide based on the space and performance of the ASP. When set to 1 (Yes), the copy always uses compression. When set to 2 (No), the system avoids using compression during the copy, so the MSD data is not compressed.
Note: Main store sizes of 200 GB or more will always be compressed.
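The compression rules above can be expressed as a short decision function. This is a minimal Python sketch, not system code: the name copy_uses_compression is hypothetical, the system_choice parameter stands in for the internal space and performance decision the system makes when the option is 0, and the sketch assumes that the 200 GB rule takes precedence over a setting of 2 (No), which is how the word "always" in the note is read here.

```python
ALWAYS_COMPRESS_AT_GB = 200  # from the note: 200 GB or more is always compressed


def copy_uses_compression(force_compression: int,
                          main_store_gb: float,
                          system_choice: bool) -> bool:
    """Return True if the MSD copy would be compressed under the stated rules.

    force_compression: 0 = system decides, 1 = Yes, 2 = No.
    system_choice: placeholder for the system's own decision when the option is 0.
    """
    if force_compression not in (0, 1, 2):
        raise ValueError("force_compression must be 0, 1, or 2")
    # Assumption: the size rule overrides the option setting.
    if main_store_gb >= ALWAYS_COMPRESS_AT_GB:
        return True
    if force_compression == 1:
        return True
    if force_compression == 2:
        return False
    return system_choice


if __name__ == "__main__":
    print(copy_uses_compression(2, 64, True))
    print(copy_uses_compression(2, 256, False))
```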
A subset is a dump copy in which the system discarded some main storage data judged to be irrelevant to solving the problem associated with the dump. If the current dump is found to be similar to an unsolved dump, all data is potentially relevant, so no main storage data will be discarded during the next copy. This is a full dump.
The subset only option is used to override the next automatic full copy to the system ASP. The default is 0 (No override): the system will auto copy a full dump. When set to 1 (Yes), the normal full copy will not occur and only a subset will be copied.
Advanced Options
MSDOPTIONS macro (the macro is no longer available after R730; the Main Storage Dump options screen must be used to modify MSD behavior)
An Advanced Analysis macro can be run to override how the system processes a Main Store Dump (MSD) after a system failure. This allows you to customize the MSD IPL recovery to better match your business downtime policies and risk tolerance. Specifically, it allows you to override the default MSD behavior by selecting one or both of the following new options:
"subonly" - Always perform a Subset dump (aka "Smart dump") instead of a Full dump.
A Subset dump is typically less than 10% the size of a Full dump and is therefore much faster to copy than a Full dump. This option would be for a business that decides it can never tolerate the downtime for a Full dump (for this particular LPAR) and accepts the increased risk that the reduced data captured may be insufficient to identify the root cause of the problem the first time it occurs.
Note: While unlikely, a full dump may still be performed if an unexpected error occurs during the subset dump.
"nodump" - Do not copy a MSD on the first occurrence; only copy the MSD if the same failure occurs again.
Minimal diagnostic data will be captured and the system will be re-IPLed as quickly as possible (while still ensuring that any changed pages in main store are written to disk to help avoid damaged objects and improve recoveries). This option would be for a business that decides it cannot tolerate the downtime for even a subset/smart dump the first time a problem occurs and is willing to risk a second occurrence before capturing data to help determine root cause.
Note: With the "nodump" MSD option, if the same failure that caused the MSD occurs a second time and the existing dump has not been deleted, a Subset dump will be copied; if it occurs a third time, a Full dump will be copied unless the "subonly" option is also set.
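The escalation described in the note above can be written out as a small function. This Python sketch is illustrative only: the name nodump_copy_action is hypothetical, it assumes the existing dump has not been deleted between occurrences, and it treats any occurrence after the third the same as the third, which this document does not state. It also ignores the unlikely case where an unexpected error during a subset dump forces a full dump.

```python
def nodump_copy_action(occurrence: int, subonly: bool = False) -> str:
    """Dump copied on the Nth occurrence of the same failure with "nodump" set.

    Returns "none", "subset", or "full".
    """
    if occurrence < 1:
        raise ValueError("occurrence must be 1 or greater")
    if occurrence == 1:
        return "none"      # first occurrence: no MSD is copied
    if occurrence == 2 or subonly:
        return "subset"    # second occurrence, or "subonly" caps it at subset
    return "full"          # third occurrence without "subonly"


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, nodump_copy_action(n), nodump_copy_action(n, subonly=True))
```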
To use the MSDOPTIONS macro, perform the following:
1. From the operating system command line, type STRSST and press Enter.
2. Sign in with a service tool profile and password that has authority to Display/Alter/Dump in SST.
3. Select Option 1 - Start a service tool, and press Enter.
4. Select Option 4 - Display/Alter/Dump, and press Enter.
5. Select Option 1 - Display/Alter storage, and press Enter.
6. Select Option 2 - Licensed Internal Code (LIC) data, and press Enter.
7. Select Option 14 - Advanced analysis, and press Enter.
8. On the Select Advanced Analysis Command screen, there is a list of available advanced analysis macros (under the Command column). Because MSDOPTIONS is not listed, type 1 (Select) next to the top blank line under the Command column. In the blank line, type MSDOPTIONS, and press Enter.
9. In the Options field, type one of the options below and press Enter (entering a blank will display the current option settings):
   o -h: Help text.
   o nodump: Dump override, do not copy a MSD on the first occurrence, except for user-initiated dumps.
   o dumpon: Default normal dump processing.
   o subonly: Subset override, subset only, full dump only when needed.
   o subnorm: Default normal subset dump processing, no override set.
   The following options are on 7.1 and 6.1.1 only (7.2 defaults to allcomp/subuser):
   o allcomp: Compression override, always compress dump data.
   o normcomp: Default compression calculation processing, optimize copy time.
   o subuser: User initiated dump override, subset dump.
   o normuser: Default user initiated dump processing, full dump.
Delay Mirror Synchronization
If you have mirrored disk units, the mirror synchronization step on an abnormal IPL can take a long time; however, you can optionally choose to avoid waiting at this step and instead have the synchronization performed in the background while the IPL continues and the system resumes operation.
The philosophy for waiting was to ensure absolutely 100% synchronization before resuming any disk activity; however, you may view that as an acceptable risk in your particular situation and instead opt to always bring the system back up as quickly as possible.
There are two options to avoid this long running step on an abnormal IPL:
1. Change the configuration so it does not use system mirroring, or
2. Consider the option which allows the IPL to proceed and do the mirror resync in the background. For R710 and beyond, you can use the System i Navigator task "Mirror Synchronization on IPL" (task ID=mirrorsync).
To use the MIRSYNCCTRL macro, perform the following:
1. From the operating system command line, type STRSST and press Enter.
2. Sign in with a service tool profile and password that has authority to Display/Alter/Dump in SST.
3. Select Option 1 - Start a service tool, and press Enter.
4. Select Option 4 - Display/Alter/Dump, and press Enter.
5. Select Option 1 - Display/Alter storage, and press Enter.
6. Select Option 2 - Licensed Internal Code (LIC) data, and press Enter.
7. Select Option 14 - Advanced analysis, and press Enter.
8. On the Select Advanced Analysis Command screen, there is a list of available advanced analysis macros (under the Command column). Because MIRSYNCCTRL is not listed, type 1 (Select) next to the top blank line under the Command column. In the blank line, type MIRSYNCCTRL, and press Enter.
9. In the Options field, enter one of the following options and press Enter:
   o -H: Display help text
   o -D: Display current attribute values
   o -SET -IPL NOWAIT: Run synchronization in the background
   Notes:
   1. This option affects all ASPs and IASPs on the partition (in other words, you cannot select individual ASPs or IASPs for this function).
   2. The MIRSYNCSTS Advanced Analysis macro from SST can be used to monitor the % complete.
Document Information
Modified date:
07 January 2025
UID
nas8N1020270