Using traps in your scripts
Trapping signals
When writing scripts, it is good practice to have a controlled exit from your script; this allows for failed conditions within the script processing. Consider a script that copies or replaces certain files in a file system. You could check if each copy completes successfully before moving on to the next task in the script. If issues occur, then the script exits. This allows the system administrator to inspect where the script failed so that immediate action can be taken to back-out the process or take an alternative action in completing the task.
Listing 1 below contains basic conditional code that could achieve this goal. Using a file copy process as an example, a test is carried out to make sure the file run_pj actually exists. If it does, then a copy is carried out to take a backup of the destination file. If the copy is unsuccessful, then the script exits with a message, detailing the error. If the file is not present, then the script exits, as no more processing should be carried out. If the copy was successful, then the new updated file is copied and overwrites the original file. If this is not successful, then the script exits.
Listing 1. Example_replace
#!/bin/bash
#
proj_dir=/opt/pcake/bin
# check file is present
if [ ! -f "$proj_dir/run_pj" ]
then
    echo " $proj_dir/run_pj not present...exiting"
    exit 1
fi
# make a backup copy
cp -p $proj_dir/run_pj $proj_dir/run_pj.24042011
if [ $? != 0 ]
then
    echo "$proj_dir/run_pj no backup made...exiting"
    exit 1
fi
# copy over updated file
if [ ! -f "/opt/dump/rollout/run_pj" ]
then
    echo "/opt/dump/rollout/run_pj not present...exiting"
    exit 1
fi
cp -p /opt/dump/rollout/run_pj $proj_dir/run_pj
if [ $? != 0 ]
then
    echo " $proj_dir/run_pj was not copied..exiting"
    exit 1
fi
Using the approach in Listing 1, the script exits if there is any error in the copy process, thus not allowing the script to carry on processing if there is an error. Clearly, any error would be fixed before the script is run again.
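The repeated test-and-exit pattern of Listing 1 can be folded into a small helper. The following is a minimal sketch rather than the article's script: the die and safe_copy functions are hypothetical names introduced for illustration, and the copies run inside a scratch directory created with mktemp instead of the real /opt paths.

```shell
#!/bin/bash
# Sketch only: die() and safe_copy() are illustrative helpers, not part of Listing 1.

die() {
    # print the message on stderr and stop with a failure status
    echo "$*...exiting" >&2
    exit 1
}

safe_copy() {
    # $1 = source file, $2 = destination
    [ -f "$1" ] || die "$1 not present"
    cp -p "$1" "$2" || die "$1 was not copied to $2"
}

# scratch directory standing in for the real project paths (assumption)
work=$(mktemp -d)
echo "version 1" > "$work/run_pj"
echo "version 2" > "$work/run_pj.new"

safe_copy "$work/run_pj" "$work/run_pj.bak"     # back up the current file
safe_copy "$work/run_pj.new" "$work/run_pj"     # roll out the update

# a missing source makes safe_copy exit; the subshell confines that exit
( safe_copy "$work/no_such_file" "$work/run_pj" ) 2>/dev/null
missing_status=$?

backup=$(cat "$work/run_pj.bak")
current=$(cat "$work/run_pj")
rm -rf "$work"
echo "backup holds '$backup', live file holds '$current', missing source gave status $missing_status"
```

Each step still stops the script at the first failure, but the check is written once instead of after every cp.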
Another technique to check for errors and exit is to use the set option:
set -e
With set -e, if a command fails (that is, it returns a non-zero
exit status), the script exits, unless the command is part of the test in
an if, while, or until statement, or part of an && or || list (other than
the last command in that list). The example shown in Listing 2 below copies a
non-existent file with set -e in effect. When the copy
command fails, the script exits. Notice that when you run the script, the
if statement that tests the last exit status is never reached, because the
script exits as soon as the cp command returns a non-zero status.
Listing 2. Example_fail
#!/bin/bash
set -e
proj_dir=/opt/rollout/v12
# copy a non-existent file
cp $proj_dir/go_sup /usr/local/bin/go_sup
if [ $? != 0 ]
then
    echo "could not copy $proj_dir/go_sup to /usr/local/bin/"
    exit 1
fi

$ cp_test
cp: /opt/rollout/v12/go_sup: A file or directory in the path name does not exist.
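The exceptions to set -e are easy to overlook, so here is a small sketch that runs the same failing command in three contexts. It assumes bash is on the PATH; each case runs in its own child bash so that an errexit abort cannot end the demonstration itself.

```shell
#!/bin/bash
# Each case runs in a separate bash process started with -c.

# a plain failing command: the child stops before the echo
plain=$(bash -c 'set -e; false; echo reached')
plain_status=$?

# the same failure used as an if test: execution continues
in_if=$(bash -c 'set -e; if false; then echo yes; fi; echo reached')

# the same failure on the left of ||: execution continues
in_or=$(bash -c 'set -e; false || echo handled; echo reached')

echo "plain: '$plain' (exit status $plain_status)"
echo "if:    '$in_if'"
echo "||:    '$in_or'"
```

Only the first case is cut short; in the other two the failure is treated as a tested condition and the script carries on.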
Generating syslog messages
The logger command allows the shell and scripts to write
messages to the system messages file via the syslogd service.
Use it within a script to log errors or the completion of your
processes, so that the events are visible to everyone who inspects the
messages file. This keeps you and other system administrators informed of
events generated by your scripts.
The most basic format of the command is:
logger -p priority message
Where -p sets the syslog priority (the facility and level) of the message.
For example, the following logger command logs the name of the calling script
("rollout" in this example) together with the message "something has happened":
logger -p notice "$(basename $0) - something has happened"
The following output appears in /var/adm/messages:
Apr 5 13:20:30 uk01wrs6008 user:notice dxtans: rollout - something has happened
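In a longer script it can be convenient to wrap logger in a function so that every message carries the script name. This is a minimal sketch under stated assumptions: log_msg is a hypothetical helper name, the user facility is assumed to be acceptable for your site, and the -t (tag) and -p (priority) options are assumed to behave as they do in the common logger implementations. If logger is missing or cannot reach syslog, the helper simply carries on.

```shell
#!/bin/bash
# Sketch only: log_msg is an illustrative helper, not part of the article's scripts.

log_msg() {
    # $1 = syslog level (for example notice or err); the rest is the message
    level=$1
    shift
    tag=$(basename "$0")
    # always show the message on stderr for whoever is running the script
    echo "$tag: $*" >&2
    # also hand it to syslog when logger is available; a failure here is not fatal
    if command -v logger >/dev/null 2>&1; then
        logger -t "$tag" -p "user.$level" "$*" 2>/dev/null || true
    fi
}

log_msg notice "rollout started"
log_msg err "copy failed"
```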
Getting a signal
The two examples in Listing 1 and Listing 2 show ways of checking the result of a command after it has run. However, what happens if a script gets terminated during its execution? Scripts can be killed or terminated using the signal mechanism (note that not all signals terminate the process). A signal that is sent to a running process interrupts that process to force some sort of event, typically some action. Signals can come from, but are not restricted to:
- The kernel or user space via some system event.
- The actual process itself via the keyboard (Ctrl-C).
- An illegal instruction from within the process.
- Another process via another user sending a kill to your process.
- A notification of a change in the state of a required device.
To view the current list of signals, use the kill -l (the letter l) command. The list is presented in the form (signal number, signal name):
$ kill -l
 1) SIGHUP   2) SIGINT   3) SIGQUIT   4) SIGILL
 5) SIGTRAP  6) SIGABRT  7) SIGEMT    8) SIGFPE
 9) SIGKILL 10) SIGBUS  11) SIGSEGV  12) SIGSYS
…....
…....
To view the signals and their default actions (on an AIX machine), view the file:
$ cat /usr/include/sys/signal.h|more
…..
…..
#define SIGHUP 1 /* hangup, generated when terminal disconnects */
#define SIGINT 2 /* interrupt, generated from terminal special char */
#define SIGQUIT 3 /* (*) quit, generated from terminal special char */
#define SIGILL 4 /* (*) illegal instruction (not reset when caught)*/
#define SIGTRAP 5 /* (*) trace trap (not reset when caught) */
#define SIGABRT 6 /* (*) abort process */
…..
…..
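Besides printing the whole table, kill -l can translate a single number into a signal name, and it accepts the exit status of a process that was killed by a signal (128 plus the signal number). The sketch below assumes bash and uses signals 2 and 15, whose numbers are the same on AIX and Linux.

```shell
#!/bin/bash
# translate signal numbers into names
name_2=$(kill -l 2)
name_15=$(kill -l 15)
echo "signal 2 is SIG$name_2, signal 15 is SIG$name_15"

# a child that dies from a signal returns 128 + the signal number;
# the shell may print a "Terminated" notice for it on stderr
bash -c 'kill -TERM $$'
status=$?
status_name=$(kill -l "$status")
echo "child exit status $status means SIG$status_name"
```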
I have received a signal. Now what?
When a script receives a signal, it can take one of three actions:
- Ignore it and do nothing. This is probably what most scripts do without the script authors realising it.
- Catch the signal using trap and take appropriate action.
- Take the default action.
All the above is true except for the following signals:
SIGKILL (signal 9)
SIGSTOP (signal 17)
SIGCONT (signal 19)
SIGKILL and SIGSTOP cannot be caught or ignored and always take the
default action; SIGKILL always kills the process. SIGCONT always resumes a
stopped process, whether or not it is trapped. The signal numbers shown
here are those used on AIX; they can differ on other systems. The comments
in the listing from the /usr/include/sys/signal.h file describe each
signal. For instance, SIGINT (signal 2) is an
interrupt generated from the terminal; typically, this is the keyboard.
Each defined system signal has its own default action. There are also two
user-defined signals: SIGUSR1 (signal 30) and SIGUSR2 (signal 31).
The script author can use these for bespoke purposes. Be sure to
view the signal.h file for all the default actions.
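As an illustration of a bespoke use, the sketch below treats SIGUSR1 as a "report progress" request. The meaning given to the signal and the variable names are assumptions made for this example only. Normally the signal would be sent from another terminal with kill; here the script signals itself so that the sketch runs unattended.

```shell
#!/bin/bash
# SIGUSR1 has no fixed meaning, so this script treats it as "report progress"
processed=0
reports=0
trap 'reports=$((reports + 1)); echo "progress: $processed items done"' SIGUSR1

for item in a b c d
do
    processed=$((processed + 1))
    # stand-in for: kill -USR1 <pid of this script> typed in another terminal
    if [ "$processed" -eq 2 ]
    then
        kill -USR1 $$
    fi
done

trap - SIGUSR1
echo "finished: $processed items, $reports progress report(s)"
```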
Common signals are:
- SIGHUP - hangup or exit a foreground running process from a terminal
- SIGINT - Ctrl-C from the keyboard
- SIGQUIT - Ctrl-\ from the keyboard
- SIGTERM - software termination signal
When receiving a signal, actions that can take place are:
- cleaning up files
- prompting the user to confirm whether the script should actually be terminated
- ignoring the signal
- carrying on with processing
Catching a signal
To catch a signal that is sent to your process, use the built-in trap
command. When a signal is caught, the command currently being executed
attempts to complete before the trap command takes over. If the signal is
SIGKILL, termination is immediate. If you do not trap a
signal, its default action takes place. For example, if you only
trap SIGINT but do nothing about SIGQUIT,
then if your process gets a SIGQUIT, the default action takes
place (most likely an untidy termination of your script, which you
probably do not want).
The format of the trap command is:
trap 'command_list' signals
Where command_list is a list of commands, which can include a
function, to run upon receiving any signal in the signals list,
and signals is the list of signals to catch or trap.
To ignore a signal, use two single quotes in place of the command_list:
trap '' signals
To reset a trap use:
trap - signals
Where signals is the signal list.
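The three forms of trap can be seen side by side in the following sketch, which assumes bash. It uses SIGUSR1 and signals itself so that it can run without anyone at the keyboard; the variable names are illustrative.

```shell
#!/bin/bash
hits=0

# catch: the command list runs each time the signal arrives
trap 'hits=$((hits + 1))' SIGUSR1
kill -USR1 $$
after_catch=$hits

# ignore: the signal is discarded and nothing runs
trap '' SIGUSR1
kill -USR1 $$
after_ignore=$hits

# reset: the default action is restored. The signal is not sent again here,
# because the default action for SIGUSR1 is to terminate the process.
trap - SIGUSR1
after_reset=$(trap -p SIGUSR1)    # empty when no trap is set

echo "hits after catch: $after_catch, after ignore: $after_ignore"
echo "trap listing after reset: '$after_reset'"
```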
Let's now look at a bare-bones script that catches SIGINT and
SIGQUIT. The script contained in Listing 3 below is a counter
iteration script. When the user hits Ctrl-C or Ctrl-\ on the keyboard, the
trap command traps the signal and echoes a message that the script has
terminated. The termination is accomplished by the exit command at
the end of the command list. Without it, the script does not
terminate and continues processing. In this example, we want it to
terminate. There may be occasions when this is not the case and
processing should continue.
Listing 3. Trap1
#!/bin/bash
# trap1
# the message is double-quoted so that the backslash is echoed literally
trap 'echo "you hit Ctrl-C/Ctrl-\, now exiting.."; exit' SIGINT SIGQUIT
count=0
while :
do
    sleep 1
    count=$(expr $count + 1)
    echo $count
done

$ trap1
1
2
3
^Cyou hit Ctrl-C/Ctrl-\, now exiting..
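Listing 3 exits from its trap. For the opposite case, where processing should continue, leave exit out of the command list. The following sketch assumes bash; it also traps SIGUSR1 and sends that signal to itself as a stand-in for a user pressing Ctrl-C, purely so that the example can run unattended.

```shell
#!/bin/bash
# trap without exit: note the interruption and keep going
interrupts=0
trap 'interrupts=$((interrupts + 1)); echo "interrupt noted, carrying on"' SIGINT SIGUSR1

count=0
while [ "$count" -lt 3 ]
do
    count=$((count + 1))
    echo "$count"
    # SIGUSR1 stands in for Ctrl-C (assumption made for unattended running)
    if [ "$count" -eq 2 ]
    then
        kill -USR1 $$
    fi
done

trap - SIGINT SIGUSR1
echo "loop completed after $interrupts interruption(s)"
```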
You can also use a function in place of the command as demonstrated in Listing 4 below:
Listing 4. Trap1a
#!/bin/bash
# trap1a
trap 'my_exit; exit' SIGINT SIGQUIT
count=0
my_exit()
{
    echo "you hit Ctrl-C/Ctrl-\, now exiting.."
    # cleanup commands here if any
}
while :
do
    sleep 1
    count=$(expr $count + 1)
    echo $count
done
Signals can also be caught when a script is running in the background. Listing 5 below contains a simple counter, as in the previous examples. In this example, I have again chosen to exit the script upon catching the signal. If this were a file processing script, any temporary files created would be deleted first.
The script is submitted into the background using:
$ /home/dxtans/trapbg &
[1] 708790
$ 1
2
3
Now, from another terminal, send a SIGHUP signal to terminate it.
$ ps -ef |grep trapbg
dxtans 708790 2457860 11:49:39 pts/0 0:00 /bin/bash /home/dxtans/trapbg
$ kill -1 708790
Now back on the terminal where the script was submitted, the following is displayed:
$ /home/dxtans/trapbg &
[1] 708790
$ 1
2
3
Going down on a SIGHUP - signal 1, now exiting..
[1]+ Done /home/dxtans/trapbg
Listing 5. trapbg
#!/bin/bash
# trapbg
trap 'echo Going down on a SIGHUP - signal 1, now exiting..; exit' SIGHUP
count=0
while :
do
    sleep 10
    count=$(expr $count + 1)
    echo $count
done
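The whole background sequence (start the job, signal it, watch it go down) can also be driven from a single script. The sketch below assumes bash and is not the article's method: the worker function, the scratch directory, and the "ready" marker file are hypothetical devices, and SIGTERM is used in place of SIGHUP. The marker file makes sure the trap is installed before the signal is sent.

```shell
#!/bin/bash
# Sketch only: worker() plays the part of the background script.
tmpdir=$(mktemp -d)

worker() {
    trap 'echo "worker: caught SIGTERM, exiting" > "$tmpdir/result"; exit 0' SIGTERM
    : > "$tmpdir/ready"          # tell the parent that the trap is in place
    while :
    do
        sleep 1
    done
}

worker &
worker_pid=$!

# wait until the worker has installed its trap, then signal it
until [ -e "$tmpdir/ready" ]
do
    sleep 1
done
kill -TERM "$worker_pid"

# the trap runs once the worker's current sleep has finished
wait "$worker_pid"
worker_status=$?
result=$(cat "$tmpdir/result")
rm -rf "$tmpdir"
echo "$result (exit status $worker_status)"
```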
One of the most common tasks when dealing with signals is cleaning up
temporary files. Typically, these are user-created files in /tmp with the
PID of the script appended to their names. Assume
the temp files are in this form:
hold1.$$ hold2.$$
A common command to remove these files is:
rm /tmp/hold*.$$
The following piece of code traps SIGHUP, SIGINT, SIGQUIT, and SIGTERM, and then removes the files:
trap 'rm /tmp/hold*.$$; exit' SIGHUP SIGINT SIGQUIT SIGTERM
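To see the cleanup trap at work without touching /tmp, the following sketch creates the hold files in a scratch directory and then sends the job a SIGTERM. It assumes bash (for BASHPID); tmp_root, cleanup, and run_job are hypothetical names, and the job runs in a subshell so that the exit in the trap ends only the job.

```shell
#!/bin/bash
# Sketch only: tmp_root is a scratch directory standing in for /tmp.
tmp_root=$(mktemp -d)

cleanup() {
    rm -f "$tmp_root"/hold*.$$
}

run_job() {
    trap 'cleanup; exit 1' SIGHUP SIGINT SIGQUIT SIGTERM
    : > "$tmp_root/hold1.$$"
    : > "$tmp_root/hold2.$$"
    # stand-in for someone running kill against the script
    kill -TERM "$BASHPID"
    echo "not reached: the trap has already exited"
}

# the subshell confines the exit in the trap to the job itself
( run_job )
job_status=$?
leftover=$(ls "$tmp_root" | wc -l)
rmdir "$tmp_root"
echo "job exited with status $job_status, temp files left behind: $leftover"
```

Exiting with a non-zero status from the trap lets the caller tell an interrupted run from a completed one.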
Earlier in this article, I demonstrated that using set -e
causes a script to terminate upon the occurrence of a non-zero exit status
from a command. The trap command has a similar option. It is not really a
signal as such, but it is triggered under the same conditions as set -e:
it traps a non-zero exit status from a command, using the name ERR.
ERR is placed in the signal list of the trap
command. In the following example, a non-existent file is copied, which
triggers the error trap:
#!/bin/bash
# trap1b
trap 'echo I have error in my script..' ERR
cp /home/dxtans/afile /tmp
When executed, the output is:
$ trap1b
cp: /home/dxtans/afile: A file or directory in the path name does not exist.
I have error in my script..
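Two points about the ERR trap are worth trying out: it does not stop the script by itself, and it is skipped for failures that are part of an if test or an || list. The following sketch assumes bash; the function fails_with_3 and the counter variables are hypothetical names.

```shell
#!/bin/bash
failures=0
last_status=0
# capture $? first, before any other command in the trap overwrites it
trap 'last_status=$?; failures=$((failures + 1))' ERR

fails_with_3() {
    return 3
}

false                     # triggers the trap (status 1)
fails_with_3              # triggers the trap (status 3)
if false; then :; fi      # a tested condition: no trap
false || true             # left side of ||: no trap

trap - ERR
echo "ERR trap ran $failures time(s); last status seen was $last_status"
echo "still running: the ERR trap on its own does not end the script"
```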
Two variables come in handy when dealing with traps, as they give
you more information on the script termination: LINENO and
BASH_COMMAND. BASH_COMMAND is exclusive to
bash. These report, or attempt to report, the line number that the script
is currently executing and the command that is currently running.
Listing 6 below demonstrates this. The script executes
a list of echo and sleep commands. When the script is sent a
SIGHUP, SIGINT, or SIGQUIT, it terminates. A message
displays the line number and command at the time the signal was trapped;
the script then exits (from the exit command in the trap command list).
Notice that the trap calls the function my_exit to display the
information, passing LINENO and BASH_COMMAND as the parameters $1 and $2.
The function also logs a message about the event to
/var/adm/messages. Other cleanup commands would
be put in this function, if required.
Listing 6. trap4
#!/bin/bash
# trap4
trap 'my_exit $LINENO $BASH_COMMAND; exit' SIGHUP SIGINT SIGQUIT
my_exit()
{
    echo "$(basename $0) caught error on line : $1 command was: $2"
    logger -p notice "script: $(basename $0) was terminated: line: $1, command was $2"
    # cleanup commands here if any
}
echo 1
sleep 1
echo 2
sleep 1
echo 3
Running this script a couple of times, and interrupting it at different points, produces the following output.
$ trap4
1
2
^Ctrap4 caught error on line : 15 command was: sleep
$ trap4
1
^Ctrap4 caught error on line : 13 command was: sleep
In /var/adm/messages, we have an entry for the script termination:
Apr 6 12:12:46 rs6000 user:notice dxtans: script: trap4 was terminated: line: 13, command was sleep
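In Listing 6 the output shows only "sleep" rather than "sleep 1" because $BASH_COMMAND is passed unquoted, so the shell splits it into words and only the first word arrives as $2. Quoting the variable inside the trap keeps the whole command as one argument. The sketch below assumes bash and uses the ERR trap instead of a signal so that it can run unattended; report and the /nonexistent paths are hypothetical.

```shell
#!/bin/bash
# Sketch only: report() is an illustrative stand-in for my_exit.
report() {
    # $1 = line number, $2 = full command text (kept whole by the quotes in the trap)
    caught_line=$1
    caught_cmd=$2
    echo "caught error on line: $1, command was: $2"
}

trap 'report "$LINENO" "$BASH_COMMAND"' ERR

# copying a file that does not exist fails and fires the trap
cp /nonexistent/source /nonexistent/target

trap - ERR
```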
There are occasions when you will want to ignore certain signals. Perhaps you wish to prevent someone hitting Ctrl-C or Ctrl-\ on the keyboard by mistake when your script is doing some processing on large files, and you wish it to complete, without user interruption. The following segment of code achieves this:
trap '' SIGINT SIGQUIT
You can also ignore certain signals during a portion of your script, then reinstate them later on when you do wish to catch the signals so you can take some form of action. The script contained in Listing 7 below ignores the signals SIGINT and SIGQUIT until after the first sleep command has finished. Then, while the second sleep command runs, the new trap takes action if either signal is sent and terminates the script. As in the previous examples, you can assume the sleep commands represent some form of processing.
Listing 7. trapoff_on
#!/bin/bash
# trapoff_on
trap '' SIGINT SIGQUIT
echo "you cannot terminate using ctrl-c or ctrl-\, "
# heavy processing goes on here, cannot interrupt !
sleep 10
trap 'echo terminated; exit' SIGINT SIGQUIT
# user can now interrupt
echo "ok you can now terminate me using those keystrokes"
sleep 10
Sending a signal to a child
Scripts that start child processes also need attention. Assuming
you wish to terminate any child processes, you need to kill these as well.
This is accomplished using the trap command, as demonstrated in Listing 8
below. In this example, two sleep commands are used as the child
processes. These are put into the background; as each process is started,
its PID is appended to the variable $pid, which ends up
holding the PIDs of both child (sleep) processes.
To kill the main script, a SIGHUP, SIGINT, SIGQUIT, or
SIGTERM is sent. Upon catching this signal, a kill command is
issued to the PIDs of the child processes contained in the variable
$pid. Once completed, the script exits. The wait at the end
of the script waits for the child processes to terminate or complete.
Child scripts may need signal traps of their own
to do further cleaning up before they exit. Clearly, this depends
on your type of processing.
The following example kills the children when the parent is sent one of the signals.
Listing 8. trapchild
#!/bin/bash
# trapchild
sleep 120 &
pid="$!"
sleep 120 &
pid="$pid $!"
echo "my process pid is: $$"
echo "my child pid list is: $pid"
trap 'echo I am going down, so killing off my processes..; kill $pid; exit' SIGHUP SIGINT SIGQUIT SIGTERM
wait
Upon execution of the script, the following displays:
$ /home/dxtans/trap/trapchild
my process pid is: 6553626
my child pid list is: 5767380 6488072
Check from the terminal that the processes are running, along with the child processes (the two sleep commands).
$ ps -ef |grep trapchild
root 6553626 5439516 0 20:51:32 pts/1 0:00 /bin/bash /home/dxtans/trap/trapchild
$ ps -ef |grep sleep
root 5767380 6553626 0 20:51:32 pts/1 0:00 sleep 120
root 6488072 6553626 0 20:51:32 pts/1 0:00 sleep 120
Let's now send a SIGTERM to the parent process. The script
terminates and takes its child processes with it.
$ kill -15 6553626
The script then terminates with the following output:
$ /home/dxtans/trap/trapchild
my process pid is: 6553626
my child pid list is: 5767380 6488072
I am going down, so killing off my processes..
Check that nothing is returned after the termination:
# ps -ef |grep sleep
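The same check can be scripted instead of being done by hand with ps. This is a minimal sketch, assuming bash, and it differs from Listing 8 in a few labeled ways: the parent function, the scratch directory, and the marker files are hypothetical, the trap also waits for the children after killing them, and kill -0 is used to test whether each child PID still exists.

```shell
#!/bin/bash
# Sketch only: parent() plays the part of the trapchild script.
tmpdir=$(mktemp -d)

parent() {
    sleep 60 &
    pids="$!"
    sleep 60 &
    pids="$pids $!"
    echo "$pids" > "$tmpdir/children"
    # kill the children, collect them, then leave
    trap 'kill $pids; wait $pids; exit 0' SIGTERM
    : > "$tmpdir/ready"          # the trap is now in place
    wait
}

parent &
parent_pid=$!

until [ -e "$tmpdir/ready" ]
do
    sleep 1
done
kill -TERM "$parent_pid"
wait "$parent_pid"
parent_status=$?

# kill -0 sends no signal; it only reports whether the process exists
survivors=0
for child in $(cat "$tmpdir/children")
do
    if kill -0 "$child" 2>/dev/null
    then
        survivors=$((survivors + 1))
    fi
done
rm -rf "$tmpdir"
echo "parent exit status: $parent_status, children still running: $survivors"
```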
Conclusion
Using traps within your scripts requires a little extra effort. The payoff is that when a trappable signal arrives at your script, you are in a good position to take action.