Using the message log
The Informix message log is an operating-system file. The database server writes status and error information to the message-log file.
To specify the message log path name, set the MSGPATH configuration parameter. Changes to MSGPATH take effect after you shut down and restart the database server.
You should monitor the message log frequently to ensure that the database server is running normally and that events are being logged as expected. Use the onstat -m command to obtain the name of the message log and the last 20 lines from the file. Use a text editor to read the entire message log.
Monitor the message log size, because the database server appends new entries to this file. Edit the log as needed, or back it up to tape and delete it.
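As a rough illustration of this kind of monitoring, the following Python sketch reads the last lines of a log file, similar to what onstat -m displays, and checks the file size against a threshold. The function names and the 50 MB default threshold are assumptions made for this example and are not part of Informix; pass in the path that MSGPATH specifies on your system.

```python
import os
from collections import deque


def tail_lines(path, count=20):
    """Return the last `count` lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the most recent `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def log_needs_attention(path, max_bytes=50 * 1024 * 1024):
    """Return True when the file at `path` is larger than `max_bytes`."""
    # The 50 MB default is an arbitrary example value, not an Informix limit.
    return os.path.getsize(path) > max_bytes
```

A scheduled job could call log_needs_attention on the message log and, when it returns True, trigger whatever backup-and-truncate procedure your site uses.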
If the database server experiences a failure, the message log serves as an audit trail for retracing the events that later developed into a problem. The message log often states the exact nature of the problem and suggests a corrective action.
Four general categories of unnumbered messages exist. Some messages fall into more than one category.
- Routine information
- Assertion-failed messages
- Administrative action needed
- Fatal error detected
Listing 5 shows an example of messages that fall into the routine information category.
Listing 5. Example of routine information messages
15:52:27 Maximum server connections 0
15:52:27 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 20, Llog used 12
15:52:27 Level 0 Archive started on rootdbs
15:52:28 Archive on rootdbs Completed.
15:53:07 Checkpoint Completed: duration was 0 seconds.
15:53:07 Fri Apr 8 - loguniq 6, logpos 0xdfa018, timestamp: 0xa24db Interval: 21
Listing 6 shows an example of messages that fall into the assertion-failed messages category.
Listing 6. Example of assertion-failed messages
18:39:07 Assert Failed: No Exception Handler
18:39:07 Who: Session(176, informix@testdb, 7263, c000000011012ef0)
         Thread(6457, xchg_3.0, c000000010feab00, 3)
         File: mtex.c Line: 491
18:39:07 Results: Exception Caught. Type: MT_EX_OS, Context: mem
18:39:07 Action: Please notify IBM Informix Technical Support.
18:39:07 See Also: /home/dump/af.1d21d84a, shmem.1d21d84a.0
Listing 7 shows an example of messages that fall into the administrative action needed messages category.
Listing 7. Example of administrative action needed messages
11:05:49 Maximum server connections 4
11:08:05 Logical Log Files are Full -- Backup is Needed
Listing 8 shows an example of messages that fall into the fatal error messages category.
Listing 8. Example of fatal error messages
20:29:19 Assert Failed: Unexpected virtual processor termination, pid = 25504, exit = 0x9
20:29:19 IBM Informix Dynamic Server Version 11.70.FC1
20:29:19 Who: Session(2, informix@, 0, 0)
         Thread(9, soctcppoll, 0, 1)
         File: mt.c Line: 14549
20:29:19 stack trace for pid 25503 written to /home/informix/1170/tmp/af.3f1739e
20:29:19 See Also: /home/informix/1170/tmp/af.3f1739e
20:29:24 mt.c, line 14549, thread 9, proc id 25503, Unexpected virtual processor termination, pid = 25504, exit = 0x9.
20:29:24 The Master Daemon Died
20:29:24 PANIC: Attempting to bring system down
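The four categories can be approximated with simple keyword matching. The Python sketch below is only a heuristic: its patterns are assumptions taken from the sample messages in Listings 5 through 8, not an official list, and any message that matches none of them is labeled routine by default. It returns a set because a message can fall into more than one category.

```python
import re

# Hypothetical keyword patterns, derived only from the sample listings.
CATEGORY_PATTERNS = {
    "assertion failed": re.compile(r"Assert Failed"),
    "administrative action needed": re.compile(r"Backup is Needed"),
    "fatal error detected": re.compile(r"PANIC|Master Daemon Died"),
}


def classify(message):
    """Return the set of categories whose pattern matches the message."""
    matched = {
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if pattern.search(message)
    }
    # Anything that matches no pattern is treated as routine information.
    return matched or {"routine information"}
```

A monitoring script might run classify over each new line of the message log and raise an alert only for the non-routine categories.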
The database server provides a mechanism for automatically triggering administrative actions based on an event that occurs in the database server environment. Events can be informative, such as backup complete, or they can indicate an error condition that requires your attention, such as unable to allocate memory. To use the event-alarm feature, set the ALARMPROGRAM configuration parameter to the full pathname of an executable file that performs the necessary administrative actions.
The database server can run the alarm program either every time any event alarm occurs or only when certain noteworthy event alarms occur. Noteworthy event alarms include failure of a database; a table, index, chunk, or dbspace taken offline; internal subsystem failure; start-up failure; and detection of a long transaction. You can receive notification of an event alarm through email or pager mail.
The following configuration parameters are specific to event alarms.
- ALRM_ALL_EVENTS: Specifies whether ALARMPROGRAM runs for all events that are logged in the MSGPATH or for only specified noteworthy events
- ALARMPROGRAM: Specifies the location of a file that is executed when an event alarm occurs
Follow these steps to customize the alarmprogram.[sh|bat] script. You can use alarmprogram.[sh|bat] instead of log_full.[sh|bat] to automate log backups.
- Change the value of ADMINMAIL to the email address of the database server administrator.
- Change the value of PAGERMAIL to the pager service email address.
- Set the MAILUTILITY parameter to /usr/bin/mail on UNIX or to $INFORMIXDIR/bin/ntmail.exe on Windows.
- To automatically back up logical logs as they fill, change BACKUP to yes. To stop automatic log backups, change BACKUP to any value other than yes.
- In the ONCONFIG file, set ALARMPROGRAM to the full pathname of alarmprogram.[sh|bat].
- Restart the database server.
Alarms with a severity of 1 or 2 neither write messages to the message log nor send email. Alarms with a severity of 3 or greater send email to the database administrator. Alarms with a severity of 4 or 5 also notify a pager by email.
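The severity thresholds can be expressed as a small lookup. This Python sketch only mirrors the rules stated above; the function name, the channel labels, and the assumption that valid severities run from 1 through 5 are illustrative and are not taken from the alarmprogram script itself.

```python
def notifications_for(severity):
    """Return the notification channels used for an alarm of this severity."""
    # Assumption for this sketch: severities are whole numbers from 1 to 5.
    if severity not in range(1, 6):
        raise ValueError(f"unexpected severity: {severity!r}")
    channels = []
    if severity >= 3:
        channels.append("email")  # severity 3 or greater: mail the administrator
    if severity >= 4:
        channels.append("pager")  # severity 4 or 5: also notify the pager address
    return channels
```

For example, notifications_for(2) returns an empty list, while notifications_for(5) returns both channels.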
To ensure continuous server availability, do not run certain foreground operations in an alarm script. When the server invokes an alarm script, the server sometimes waits for the script to complete before proceeding. For example:
- When an alarm is invoked because of a fatal error, the server waits for the script to finish writing information to the error log. In certain situations, alarm events 5 and 6 are run in the foreground.
- Some enterprise replication event alarms run in the foreground, such as event alarms 31, 34, 37, and 39.
Because the server might need to wait for the alarm program script to finish, do not run the following operations in the foreground in an alarm script:
- An onmode command that forces user connections off the server, such as onmode -u or onmode -yuk. These kinds of onmode commands can cause a deadlock between the server and the alarm script because the server might wait for the alarm script to complete while the alarm script that executed the onmode command waits for the user sessions to shut down, and one of those sessions is running the alarm script itself.
- Operations that might take a long time to complete or that have a highly variable run time. Operations that take a long time to complete can cause the server to appear as if it is not responding while the operation is running.
If you need to run the above operations in an alarm script, run them in the background using one of the following operating system utilities:
- On UNIX, use the nohup utility, such as nohup onmode -yuk &. The nohup utility continues to run the command even if its parent terminates. The ampersand, &, runs the command in the background so that it does not block execution of the alarm program script itself.
- On Windows, use the start utility with the /B flag, such as start /B onmode -yuk.
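If the alarm program is written in Python rather than as a shell or batch script, a comparable effect can be sketched with the standard subprocess module. The run_detached helper below is hypothetical; it starts a command without waiting for it, which is similar in spirit to nohup with a trailing ampersand or to start /B, but it is not an exact equivalent of either utility.

```python
import os
import subprocess


def run_detached(command):
    """Start `command` without waiting for it and return the Popen handle."""
    options = {
        # Detach the child from the caller's standard streams.
        "stdin": subprocess.DEVNULL,
        "stdout": subprocess.DEVNULL,
        "stderr": subprocess.DEVNULL,
    }
    if os.name == "nt":
        # Windows: place the child in its own process group.
        options["creationflags"] = subprocess.CREATE_NEW_PROCESS_GROUP
    else:
        # POSIX: a new session separates the child from the caller's terminal.
        options["start_new_session"] = True
    return subprocess.Popen(command, **options)
```

A call such as run_detached(["onmode", "-yuk"]) returns immediately, so the alarm program can finish while the onmode command is still running.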
Some of the events that the database server reports to the message log also cause it to invoke the alarm program. The class messages indicate the events that the database server reports. In the alarm program, set the EXIT_STATUS variable to 0 for successful completion and to another number for a failure; the database server reports a nonzero exit code in the message log. For example, if a thread attempts to acquire a lock but the maximum number of locks has already been reached, the database server writes a message to the message log, as shown in Listing 9.
Listing 9. Example error message in the message log
10:37:22 Checkpoint Completed: duration was 0 seconds.
10:51:08 Lock table overflow - user id 30032, rstcb 10132264
10:51:10 Lock table overflow - user id 30032, rstcb 10132264
10:51:12 Checkpoint Completed: duration was 1 seconds.
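ALARMPROGRAM can point to any executable, so the exit-status convention can also be sketched outside the shell script. The Python example below assumes that the server passes the event severity, class ID, class message, and specific message as the first four command-line arguments. That argument order, the handle_alarm name, and the notify callback are assumptions for illustration, so check them against the alarmprogram script shipped with your server version.

```python
def handle_alarm(argv, notify=print):
    """Process one event alarm and return an exit status (0 means success)."""
    try:
        # Assumed order: severity, class ID, class message, specific message.
        severity = int(argv[0])
        class_id = int(argv[1])
        class_message = argv[2]
    except (IndexError, ValueError):
        return 1  # missing or malformed arguments count as a failure
    specific_message = argv[3] if len(argv) > 3 else ""
    summary = f"severity {severity}, class {class_id}: {class_message}"
    if specific_message:
        summary += f" ({specific_message})"
    try:
        notify(summary)  # hypothetical hook, for example a mail sender
    except Exception:
        return 2  # the notification step itself failed
    return 0
```

A script built on this sketch would pass the return value of handle_alarm(sys.argv[1:]) to sys.exit so that the process exit code carries the status back to the server.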
All event alarms that are generated are inserted in the ph_alert table in the sysadmin database. You can query the ph_alert table on the local or remote server to view the recent event alarms for that server. You can write SQL scripts based on the ph_alert table to handle event alarms instead of using the scripts controlled by the ALARMPROGRAM configuration parameter. By default, alerts remain in the ph_alert table for 15 days before being purged.
Listing 10 shows a query that returns the event alarms in the ph_alert table:
Listing 10. Querying the ph_alert table in sysadmin
SELECT * FROM ph_alert WHERE alert_object_type = 'ALARM';
Listing 11 shows the resulting output.
Listing 11. Output of querying the ph_alert table in sysadmin
id                 34
alert_task_id      18
alert_task_seq     10
alert_type         INFO
alert_color        YELLOW
alert_time         2010-03-08 12:05:48
alert_state        NEW
alert_state_chang+ 2010-03-08 12:05:48
alert_object_type  ALARM
alert_object_name  23
alert_message      Logical Log 12 Complete, timestamp: 0x8e6a1.
alert_action_dbs   sysadmin
alert_action
alert_object_info  23001
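When query output like this is captured by a script, each record can be turned into a dictionary for further processing. The Python sketch below assumes the output is laid out with one column per line, the column name first and the value after it; the parse_vertical_record name is hypothetical, and truncated column names such as alert_state_chang+ are kept exactly as printed.

```python
def parse_vertical_record(text):
    """Map each column name to its value for one record, one column per line."""
    record = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        # A column with no value, such as alert_action here, maps to "".
        record[name] = parts[1].strip() if len(parts) > 1 else ""
    return record
```

With the record from Listing 11, the alert_message key would hold the full "Logical Log 12 Complete" text and alert_action would hold an empty string.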