Troubleshooting
Problem
This document provides troubleshooting information regarding the QHST message queue.
Resolving The Problem
The history (QHST) log consists of a message queue and a physical file known as a log-version. Messages sent to the log message queue are written by the system to the current log-version physical file. Before release 7.4, it is the role of the SCPF job to move messages from the QHST message queue to the log-version physical file. Starting at release 7.4, a new system job called QHST moves the messages, and the SCPF job no longer performs maintenance of the history files. Numerous errors can cause logging to QHST to stop. If the job that moves the messages has errors or is not getting enough time to run, messages are not removed from the QHST message queue. The queue can then extend and eventually fill to the point where no more messages can be posted to it.
The command DSPOBJD OBJ(QSYS/QHST) OBJTYPE(*MSGQ) shows the current size of the QHST message queue. QHST, like all message queues, can reach a maximum size of 16 MB. When the QHST message queue has been extended the maximum number of times, message CPF2460 is issued, stating that the queue has been extended the maximum of 488 times. This limit is based on the 16 MB maximum size of the message queue.
Under normal conditions, the number of messages put to the QHST message queue varies throughout the day. When the queue reaches certain thresholds, the QMHLOGER code in the SCPF job (the QHST job in 7.4 and above) is run to move the messages from the QHST message queue to the QHST physical file. If the system determines that it is getting too far behind, it changes the priority of the SCPF job to 0. When this happens, message CPI2413 is logged (run priority for SCPF has been changed to 0). When the SCPF job catches up, message CPI2414 is logged (original run priority for SCPF has been reinstated), and the priority is returned to normal.
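The CPI2413 and CPI2414 messages bracket the periods when the system was falling behind, so pairing them shows how long each period lasted. The following is a minimal Python sketch of that pairing. It assumes the messages have already been exported as (timestamp, message ID) pairs sorted by time; the function name `priority_boost_windows` and the input shape are hypothetical and not part of any IBM tooling.

```python
from datetime import datetime


def priority_boost_windows(events):
    """Pair each CPI2413 with the next CPI2414.

    `events` is an iterable of (timestamp, message_id) tuples sorted by time
    (an assumed export format). Returns a list of (start, end) tuples.
    A CPI2413 with no later CPI2414 gives a window whose end is None,
    meaning the priority had not been reinstated in the data supplied.
    """
    windows = []
    start = None
    for timestamp, message_id in events:
        if message_id == "CPI2413" and start is None:
            # Priority changed to 0: the system recognized it was behind.
            start = timestamp
        elif message_id == "CPI2414" and start is not None:
            # Original priority reinstated: the job caught up.
            windows.append((start, timestamp))
            start = None
    if start is not None:
        windows.append((start, None))
    return windows
```

Long or frequent windows, or a window that never closes, suggest that messages are arriving faster than they can be moved to the log-version file.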
System value QHSTLOGSIZ plays a part in how often a new history log version needs to be created. If the value is larger, the system does not have to create a physical file as often. The default value in earlier releases was to create a new physical file every 5000 records. Starting at release 7.3, the default value is *DAILY. The value of *DAILY for QHSTLOGSIZ is the recommended value for any release of the OS.
Some of the common reasons that messages stop logging to QHST are as follows:
| 1. | Excessive messages. A job or a large number of jobs generate messages faster than the system can move them from the QHST message queue to the log-file. |
| 2. | The QHST message queue is corrupted. No messages are able to be logged to QHST. |
| 3. | There is a problem with the SCPF job that is preventing it from removing messages from the QHST message queue. |
| 4. | Unable to create additional objects in library QSYS. |
What are some potential messages that indicate a problem with QHST?
| 1. | CPF2469 - Error occurred when sending message &1. |
| 2. | CPF2460 - Message queue QHST could not be extended. |
| 3. | CPF4167 - Job cannot create any more spooled files. If this is in SCPF job, then it may not be able to remove messages from the QHST message queue due to some corruption. |
| 4. | CPF2503 - Message queue for system log QHST damaged. |
| 5. | CPF2477 - Message queue QHST currently in use. Seen by issuing the DSPLOG command or sending a message to QHST. |
| 6. | CPD2537 - All messages have not been logged to the history log. Seen by issuing the DSPLOG command. |
| 7. | CPI2413 - Run priority for SCPF has been changed to 0. Indicates the system recognized it was falling behind in moving messages from the QHST message queue to the log file. |
| 8. | CPI2414 - Original run priority for SCPF has been reinstated. The system has determined that it has caught up with QHST message logging. |
| 9. | CPF2456 - Log version &1 in &2 closed and should be saved. |
| 10. | CPF2553 - Message queue &1 extended. |
| 11. | CPD2446 - Message queue was extended. |
| 12. | CPD2120 - Cannot add new objects to library xxxx |
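When scanning a long job log, it can help to filter it down to the message IDs listed in the table above. The following Python sketch does that with a simple lookup. The function name `qhst_related` and the idea of passing in a list of message IDs copied from a job log are assumptions for illustration; the descriptions are taken from the table.

```python
# Message IDs and short descriptions copied from the table above.
QHST_MESSAGES = {
    "CPF2469": "Error occurred when sending message",
    "CPF2460": "Message queue QHST could not be extended",
    "CPF4167": "Job cannot create any more spooled files",
    "CPF2503": "Message queue for system log QHST damaged",
    "CPF2477": "Message queue QHST currently in use",
    "CPD2537": "All messages have not been logged to the history log",
    "CPI2413": "Run priority for SCPF has been changed to 0",
    "CPI2414": "Original run priority for SCPF has been reinstated",
    "CPF2456": "Log version closed and should be saved",
    "CPF2553": "Message queue extended",
    "CPD2446": "Message queue was extended",
    "CPD2120": "Cannot add new objects to library",
}


def qhst_related(message_ids):
    """Return (id, description) pairs for IDs found in the table.

    Keeps the order of first appearance and drops repeats, so the result
    reads as a short summary of which QHST symptoms were seen.
    """
    seen = set()
    matches = []
    for message_id in message_ids:
        if message_id in QHST_MESSAGES and message_id not in seen:
            seen.add(message_id)
            matches.append((message_id, QHST_MESSAGES[message_id]))
    return matches
```

A message that appears in the result is only a hint; the steps that follow are still needed to confirm the cause.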
You should try the following steps when a QHST problem occurs:
| 1. |
Try to send a message to QHST:
SNDMSG TEST QHST. What happens? If this fails, send in the job log showing the command and its error messages.
|
|
| 2. | Dump the QHST message queue (the dump shows IBM whether the queue is corrupted): DMPOBJ QHST *MSGQ | |
| 3. | What happens with DSPLOG or DSPLOG with a time specified? If it fails, send in the job log showing the command and its error messages. If it is successful, send in the last page from the current date. | |
| 4. | SCPF job log. Use WRKJOB JOB(000000/QSYS/SCPF) OUTPUT(*PRINT) OPTION(*ALL). In releases 7.4 and above issue the command WRKACTJOB JOB(*SYS) and look for the QHST job. | |
| 5. | The QMHLOGER module needs to be running in the SCPF or QHST job. Get multiple dumps of the SCPF job call stack. Use WRKJOB JOB(000000/QSYS/SCPF) OUTPUT(*PRINT) OPTION(*PGMSTK). In releases 7.4 and above issue the command WRKACTJOB JOB(*SYS) and look for the QHST job. | |
| 6. | If the SCPF job is related to the QHST issue, collect any spooled files created under the SCPF job as this information may be related. | |
| 7. |
Is the message queue in use?
Use the command WRKOBJLCK OBJ(QSYS/QHST) OBJTYPE(*MSGQ) and check whether any jobs are locking the message queue. Jobs that hold a lock on the message queue may be preventing other jobs from adding data to it.
|
|
| 8. |
In releases 7.4 and above, issue the command WRKACTJOB JOB(*SYS) and look for the QHST job. Is the QHST job active? If not active, you will have to restart it manually.
To restart the QHST job manually issue:
CALL PGM(QWCCTLSJ) PARM(*RESTART QHST)
|
|
| 9. |
After all necessary data is collected, deleting QHST (DLTMSGQ QHST) will sometimes get logging started again. Do this only if logging has stopped, issuing the DSPLOG command does not cause QMHLOGER to start running in the SCPF job, and you do not mind losing the messages that are in the QHST message queue.
Issue DLTMSGQ QHST. You should get a message that the QHST message queue was recreated.
Issue SNDMSG TEST QHST and then issue the DSPLOG command a couple of times. Does it work? Do you see the TEST message that you sent to the message queue?
This may not work until you are able to delete some objects in QSYS if the library is full. See Step 1 in the following table.
|
|
| 10. |
Change the system value QHSTLOGSIZ to *DAILY:
CHGSYSVAL SYSVAL(QHSTLOGSIZ) VALUE(*DAILY)
|
Before deleting and recreating the QHST MSGQ you may need to take the following steps to determine if the library QSYS is full.
| 1. |
Use the command DSPLIB QSYS and look at the top right of the screen, which displays the number of objects in QSYS.
For releases 7.1 and 7.2, the maximum number of objects in a library is 360,000. For releases 7.4 and above, it is 1,000,000 objects. If you are hitting the limit, you need to clean up or delete objects in the library to recover from the errors affecting the QHST message queue.
|
| 2. |
Use the following SQL to determine the number of QHSTxxxx files in QSYS:
To programmatically delete the QHSTxxx files from QSYS see item 3 on this table.
|
| 3. |
Once you have concluded your investigation, you can delete the QHSTxxx files from QSYS programmatically:
Use the following SQL:
NOTE: Adjust the number of days to keep QHSTxxx files on the system. The sample SQL will keep 15 days worth of QHSTxxx files on the system.
NOTE 1: This SQL only works on releases 7.3 and above and you should run it in the ACS RUN SQL tool.
NOTE 2: This SQL can take a very long time to run depending on the number of QHSTxxx files on the system.
For earlier releases, you can submit a job to delete all of the QHSTxxx files on QSYS.
This method does NOT keep any history files in QSYS, and you will lose any data in the QHSTxxx files on your system.
|
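The object-count check in step 1 of this table is simple arithmetic, sketched here in Python. The function names and the release string format are assumptions for illustration. The limits are the ones stated in step 1; because that step gives no limit for release 7.3, the sketch refuses to guess one.

```python
def max_objects_for_release(release: str) -> int:
    """Return the library object limit stated in step 1 for a release.

    `release` is a string such as "7.2" or "7.5" (assumed format).
    Raises ValueError for releases whose limit step 1 does not state.
    """
    major, minor = (int(part) for part in release.split("."))
    if (major, minor) >= (7, 4):
        return 1_000_000
    if (major, minor) in ((7, 1), (7, 2)):
        return 360_000
    raise ValueError(f"no object limit stated for release {release}")


def qsys_headroom(release: str, object_count: int) -> int:
    """Return how many more objects QSYS can hold.

    `object_count` is the number shown by DSPLIB QSYS. A result at or
    near zero means the library is full or close to it.
    """
    return max_objects_for_release(release) - object_count
```

For example, a 7.2 system showing 359,500 objects in QSYS has room for only 500 more, which would point to cleanup being needed before QHST can be recreated.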
Post recovery steps: Once you have recovered and logging on the History Log has resumed, review the steps below:
| 1. |
If you suspect that a job(s) is flooding the message queue with a set of messages causing the message queue to fill up, you can use the following SQL to get a count by Message ID.
You will get a listing of the top 25 messages in the history log. You can use the command WRKMSGD for each of the messages listed on the report to identify the message(s) flooding the history log.
NOTE: Update the start and end time/date on the SQL. Limit the time/date period to about 1 hour during the time the flooding of messages is suspected to have occurred.
NOTE 1: This SQL only works on releases 7.2 and above and you should run it in the ACS RUN SQL tool.
NOTE 2: This SQL may not provide you with the necessary information as you may have deleted the QHSTxxxx files on Step 3 of the previous table to recover.
NOTE 3: This SQL may take a very long time to run and may not even complete because there may be too many QHSTxxx files in QSYS.
|
| 2. | To monitor for QHST filling, use the Start Watch (STRWCH) command to start a watch. A user exit program can then be created to act on the watch event. For example, the watch could look for CPI2413 being sent to QSYSOPR: STRWCH SSNID(WCHCPI2413) WCHPGM(MYLIB/MONQHST) WCHMSG((CPI2413)) |
| 3 |
We recommend getting and staying current on the latest PTFs for QMHLOGER and QMHLDISP modules.
Search for the modules on the site:
|
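The count by message ID described in step 1 of this table amounts to a frequency count limited to the top 25 entries. The following Python sketch shows the same logic applied to message IDs that have already been exported from the history log. It illustrates the counting only and is not the SQL referred to in that step; the function name `top_message_ids` and the input list are hypothetical.

```python
from collections import Counter


def top_message_ids(message_ids, limit=25):
    """Return the most frequent message IDs as (id, count) pairs.

    `message_ids` is any iterable of message ID strings, for example the
    IDs logged during the hour when flooding is suspected. Results are
    ordered from most to least frequent and capped at `limit` entries.
    """
    return Counter(message_ids).most_common(limit)
```

Each message ID near the top of the result can then be looked up with WRKMSGD, as described in step 1.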
Historical Number
516471946
Document Information
Modified date:
11 November 2024
UID
nas8N1013129