When writing scripts, it is good practice to build in a controlled exit so that failed conditions are handled during processing. Consider a script that copies or replaces certain files in a file system. You can check that each copy completes successfully before moving on to the next task in the script. If a problem occurs, the script exits. This allows the system administrator to see where the script failed, so that immediate action can be taken to back out the change or to complete the task another way.
Listing 1 below contains basic conditional code that could achieve this goal. Using a file copy process as an example, a test is carried out to make sure the file run_pj actually exists. If it does, then a copy is carried out to take a backup of the destination file. If the copy is unsuccessful, then the script exits with a message, detailing the error. If the file is not present, then the script exits, as no more processing should be carried out. If the copy was successful, then the new updated file is copied and overwrites the original file. If this is not successful, then the script exits.
Listing 1. Example_replace
#!/bin/bash
#
proj_dir=/opt/pcake/bin
# check file is present
if [ ! -f "$proj_dir/run_pj" ]
then
  echo " $proj_dir/run_pj not present...exiting"
  exit 1
fi
# make a backup copy
cp -p $proj_dir/run_pj $proj_dir/run_pj.24042011
if [ $? != 0 ]
then
  echo "$proj_dir/run_pj no backup made...exiting"
  exit 1
fi
# copy over updated file
if [ ! -f "/opt/dump/rollout/run_pj" ]
then
  echo "/opt/dump/rollout/run_pj not present...exiting"
  exit 1
fi
cp -p /opt/dump/rollout/run_pj $proj_dir/run_pj
if [ $? != 0 ]
then
  echo " $proj_dir/run_pj was not copied..exiting"
  exit 1
fi
Using the approach in Listing 1, the script exits if there is any error in the copy process, thus not allowing the script to carry on processing if there is an error. Clearly, any error would be fixed before the script is run again.
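The same check-and-exit pattern can be written more compactly by testing each command directly instead of inspecting $? afterwards. The following is a minimal sketch rather than a replacement for Listing 1: the die helper and the file names are hypothetical, and it works in a scratch directory created with mktemp (assumed to be available) so that nothing under /opt is touched.

```shell
# die: hypothetical helper that prints a message to stderr and exits
die() {
  echo "$*" >&2
  exit 1
}

# Scratch directory standing in for the real project directory
work_dir=$(mktemp -d)
echo "old version" > "$work_dir/run_pj"
echo "new version" > "$work_dir/run_pj.new"

# Each step exits through die as soon as it fails
[ -f "$work_dir/run_pj" ] || die "$work_dir/run_pj not present...exiting"
cp -p "$work_dir/run_pj" "$work_dir/run_pj.bak" || die "no backup made...exiting"
cp -p "$work_dir/run_pj.new" "$work_dir/run_pj" || die "run_pj was not copied...exiting"
echo "run_pj replaced, backup kept as run_pj.bak"
```

Keeping the message and the exit in one helper means every failure path behaves the same way, which makes it easier to add logging or cleanup later.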
Another technique to check for errors and exit is to use the set option:
set -e
With set -e, if a command fails (that is, it returns a non-zero exit status), the script exits, unless the command is part of the test in an if, while or until statement, or is part of an && or || list (other than its final command).
The example shown in Listing 2 below copies a non-existent file with the set -e option in effect. If the copy command fails, the script exits. Notice that when you run the script, the if statement that tests the last exit status is never reached, because the script exits as soon as the cp command returns a non-zero status.
Listing 2. Example_fail
#!/bin/bash
set -e
proj_dir=/opt/rollout/v12
# copy a non-existent file
cp $proj_dir/go_sup /usr/local/bin/go_sup
if [ $? != 0 ]
then
  echo "could not copy $proj_dir/go_sup to /usr/local/bin/"
  exit 1
fi

$ cp_test
cp: /opt/rollout/v12/go_sup: A file or directory in the path name does not exist.
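To see both sides of the set -e rule, the sketch below runs a small script in a child bash so that the demonstration itself keeps going after the child exits. The path /nonexistent/go_sup is hypothetical and is assumed not to exist. The first cp fails inside an if test, which set -e tolerates; the second fails on its own and ends the child script.

```shell
status=0
out=$(bash <<'EOF'
set -e
# A failure inside an if test does not trigger set -e
if ! cp /nonexistent/go_sup /tmp/go_sup.demo 2>/dev/null
then
  echo "copy failed inside an if test, script continues"
fi
# The same failure as a plain command ends the script here
cp /nonexistent/go_sup /tmp/go_sup.demo 2>/dev/null
echo "never reached"
EOF
) || status=$?
echo "$out"
echo "child exit status: $status"
```

The child's exit status is the non-zero status returned by the failing cp, so a calling script can still tell that something went wrong.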
Using the logger command allows the shell and scripts to write messages to the system messages file via the syslogd service. This can be used within a script to log errors, or the completion of your processes, so that they are viewable by all who interrogate the messages file, keeping you and other system administrators informed of events generated by your scripts.
The most basic format of the command is:
logger -p priority message
Where -p sets the priority of the message, given as a syslog level or as facility.level.
For example, the following logger command logs the calling script name ("rollout" in this example) with the message something has happened.
logger -p notice "$(basename $0) - something has happened"
The following output appears in /var/adm/messages:
Apr 5 13:20:30 uk01wrs6008 user:notice dxtans: rollout - something has happened
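If a script logs from several places, the logger call can be wrapped in a small function so that every message carries the script name in the same format. This is a sketch only: format_log_msg and log_event are hypothetical helper names, and the script name is hard-coded here so that the output is predictable. The demonstration prints the message text instead of writing to syslog.

```shell
script_name=rollout    # in a real script: script_name=$(basename "$0")

# Build the message text: "<script> - <message>"
format_log_msg() {
  printf '%s - %s' "$script_name" "$*"
}

# Send a message to syslog at the given level of the user facility
log_event() {
  level=$1
  shift
  logger -p "user.$level" "$(format_log_msg "$@")"
}

# Show the text that would be logged, without touching syslog
msg=$(format_log_msg "something has happened")
echo "$msg"
# To log it for real: log_event notice "something has happened"
```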
The two examples contained in Listing 1 and Listing 2 show one way that checking after command execution can be carried out. However, what happens if a script gets terminated during its execution? Scripts can be killed or terminated using the signal mechanism (note that not all signals terminate a process). A signal that is sent to a running process interrupts that process to force some sort of event, typically some action. Signals can come from, but are not restricted to:
- The kernel or user space via some system event.
- The user at the keyboard (Ctrl-C), acting on the foreground process.
- An illegal instruction from within the process itself.
- Another process, for example another user sending a kill to your process.
- A notification of a change in the state of a required device.
To view the current list of signals, use the kill -l (the letter l) command. The list is presented in the form (signal number, signal name):
$ kill -l
 1) SIGHUP   2) SIGINT   3) SIGQUIT  4) SIGILL   5) SIGTRAP  6) SIGABRT
 7) SIGEMT   8) SIGFPE   9) SIGKILL 10) SIGBUS  11) SIGSEGV 12) SIGSYS
….... ….... 
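The kill -l command also accepts a signal number and prints the matching name, which is handy inside a script. The short sketch below looks up five numbers that are the same on AIX and Linux; most other signal numbers differ between systems, so treat any number outside this set as system-specific.

```shell
# Translate a few well-known signal numbers into names
for n in 1 2 3 9 15
do
  echo "signal $n is SIG$(kill -l "$n")"
done
```

The name is printed without the SIG prefix, which is why the sketch adds it back in the echo.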
To view the signals and their default actions (on an AIX machine), view the file:
$ cat /usr/include/sys/signal.h|more
…..
…..
#define SIGHUP 1 /* hangup, generated when terminal disconnects */
#define SIGINT 2 /* interrupt, generated from terminal special char */
#define SIGQUIT 3 /* (*) quit, generated from terminal special char */
#define SIGILL 4 /* (*) illegal instruction (not reset when caught)*/
#define SIGTRAP 5 /* (*) trace trap (not reset when caught) */
#define SIGABRT 6 /* (*) abort process */
…..
…..
I have received a signal. Now what?
When a signal has been received by the script, the script can do one of three actions:
- Ignore it and do nothing.
- Catch the signal using trap and take appropriate action.
- Take the default action. This is what most scripts do, often without the script authors realising it, because no trap has been set.
All the above is true except for the following signals (the numbers shown are the AIX values; they differ on other systems):
SIGKILL (signal 9)
SIGSTOP (signal 17)
SIGCONT (signal 19)
SIGKILL and SIGSTOP cannot be caught or ignored and always use the default action; SIGKILL always kills the process. SIGCONT can be caught, but it always resumes a stopped process whether or not a trap is set. Looking at the listing from the /usr/include/sys/signal.h file, we see a description of each signal. For instance, SIGINT (signal 2) is an interrupt generated from the terminal; typically, this is the keyboard. Each defined system signal has its own default action. There are also two user-defined signals: SIGUSR1 (signal 30) and SIGUSR2 (signal 31). These can be used by the script author to implement bespoke signals. Be sure to view the signal.h file for all the default actions.
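As an illustration of a bespoke signal, a long-running script could print a progress report whenever it receives SIGUSR1. The sketch below is one possible arrangement, not a prescribed method: the report function and the counters are hypothetical, and the script signals itself only so that the example is self-contained. Normally the signal would be sent from another terminal with kill -USR1 followed by the PID.

```shell
count=0
reports=0

# report: hypothetical handler that prints progress without stopping the work
report() {
  reports=$((reports + 1))
  echo "progress: $count items processed so far"
}
trap report USR1    # the SIG prefix is optional in bash

for item in a b c d
do
  count=$((count + 1))
  if [ "$count" -eq 2 ]
  then
    kill -USR1 $$   # stand-in for an operator asking for a report
  fi
done
echo "finished $count items, $reports report(s) printed"
```

Because the handler does not call exit, the loop carries on after the report is printed.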
Common signals are:
- SIGHUP - hangup or exit a foreground running process from a terminal
- SIGINT - Ctrl-C from the keyboard
- SIGQUIT - Ctrl-\ from the keyboard
- SIGTERM - software termination signal
When receiving a signal, actions that can take place are:
- cleaning up files
- prompting the users if the script should be actually terminated
- ignoring the actual signal
- carrying on with processing
To catch a signal that is sent to your process, use the built-in trap command. When a signal is caught, the command currently being executed runs to completion before the commands given to trap take over. SIGKILL is the exception: it cannot be trapped, and termination is immediate. If you do not set a trap for a signal, its default action takes place. For example, if you only trap SIGINT but do nothing about SIGQUIT, then if your process gets a SIGQUIT, the default action takes place (most likely an untidy termination of your script, which you probably do not want).
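The effect of an untrapped signal can be observed from the calling shell. In the sketch below, a child shell with no traps sends itself SIGTERM; the default action terminates it, and the parent sees an exit status of 128 plus the signal number. The child signals itself purely to keep the example self-contained.

```shell
status=0
# The child has no trap, so SIGTERM takes its default action
sh -c 'kill -TERM $$; echo "not reached"' || status=$?
echo "child ended with status $status"
echo "that status maps to SIG$(kill -l "$status")"
```

Checking for a status above 128 is a simple way for a wrapper script to tell a signal-driven termination from an ordinary failure.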
The format of the trap command is:
trap 'command_list' signals
Where command_list is a list of commands, which can include a function to run upon receiving a signal contained in the signals list, and signals is a list of signals to catch or trap.
To ignore a signal, use two single quotes in place of the command_list:
trap '' signals
To reset a trap use:
trap - signals
Where signals is the signal list.
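The three forms can be tried out in a few lines. This sketch uses SIGUSR1 and has the script signal itself so that it runs unattended; the caught counter is a hypothetical variable used only to show which deliveries reached the handler.

```shell
caught=0

trap 'caught=$((caught + 1))' USR1   # catch: run the command list
kill -USR1 $$
echo "after catching: caught=$caught"

trap '' USR1                         # ignore: the signal has no effect
kill -USR1 $$
echo "after ignoring: caught=$caught"

trap - USR1                          # reset: back to the default action
echo "USR1 now has its default action again"
```

No signal is sent after the reset, because the default action for SIGUSR1 is to terminate the process.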
Let's now look at a bare-bones script that catches SIGINT and SIGQUIT. The script contained in Listing 3 below is a counter iteration script. When the user hits Ctrl-C or Ctrl-\ on the keyboard, the trap command traps the signal and echoes a message that the script has terminated. The termination is accomplished by the exit command at the end of the command list. If this is not done, the script does not terminate and continues processing. In this example, we want it to terminate. There may be occasions when this would not be the case and processing should continue.
Listing 3. Trap1
#!/bin/bash
# trap1
# the message is double-quoted so that the backslash in Ctrl-\ is printed
trap 'echo "you hit Ctrl-C/Ctrl-\, now exiting.."; exit' SIGINT SIGQUIT
count=0
while :
do
  sleep 1
  count=$(expr $count + 1)
  echo $count
done

$ trap1
1
2
3
^Cyou hit Ctrl-C/Ctrl-\, now exiting..
You can also use a function in place of the command as demonstrated in Listing 4 below:
Listing 4. Trap1a
#!/bin/bash
# trap1a
trap 'my_exit; exit' SIGINT SIGQUIT
count=0
my_exit()
{
echo "you hit Ctrl-C/Ctrl-\, now exiting.."
# cleanup commands here if any
}
while :
do
sleep 1
count=$(expr $count + 1)
echo $count
done
Signals can also be caught when a script is running in the background. Listing 5 below contains a simple counter as in the previous examples. In the following example, I have again chosen to exit the script upon catching the signal. If this were a file processing script, temporary files created would be deleted first.
The script is submitted into the background using:
$ /home/dxtans/trapbg &
[1] 708790
$ 1
2
3
Now from another terminal, send a signal SIGHUP to kill it.
$ ps -ef |grep trapbg
dxtans 708790 2457860 11:49:39 pts/0 0:00 /bin/bash /home/dxtans/trapbg
$ kill -1 708790
Now back on the terminal where the script was submitted, the following is displayed:
$ /home/dxtans/trapbg &
[1] 708790
$ 1
2
3
Going down on a SIGHUP - signal 1, now exiting..
[1]+ Done /home/dxtans/trapbg
Listing 5. trapbg
#!/bin/bash
# trapbg
trap 'echo Going down on a SIGHUP - signal 1, now exiting..; exit' SIGHUP
count=0
while :
do
  sleep 10
  count=$(expr $count + 1)
  echo $count
done
One of the most common tasks when dealing with signals is cleaning up temporary files. Typically, these are files created in /tmp with the PID of the script process appended to the file name. Assume the temp files are in this form:
hold1.$$ hold2.$$
A common command to remove these files is:
rm /tmp/hold*.$$
The following piece of code traps SIGHUP, SIGINT, SIGQUIT and SIGTERM, then removes the files:
trap 'rm /tmp/hold*.$$; exit' SIGHUP SIGINT SIGQUIT SIGTERM
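The sketch below puts that cleanup trap into a small worker script and checks that the temporary files really are gone afterwards. Several things here are assumptions made for the demonstration: the worker script and its cleanup function are hypothetical, a scratch directory from mktemp stands in for /tmp, and the worker sends itself SIGTERM in place of an operator running kill from another terminal.

```shell
demo_dir=$(mktemp -d)    # stands in for /tmp in this demonstration

# Write a hypothetical worker script that cleans up when signalled
cat > "$demo_dir/worker" <<'EOF'
#!/bin/bash
dir=$1
cleanup() {
  rm -f "$dir"/hold*.$$
}
trap 'cleanup; exit 1' HUP INT QUIT TERM
touch "$dir/hold1.$$" "$dir/hold2.$$"
echo "temporary files created"
kill -TERM $$            # stand-in for an external kill
echo "not reached"
EOF

status=0
bash "$demo_dir/worker" "$demo_dir" || status=$?

# Check whether any hold files were left behind
if ls "$demo_dir"/hold* >/dev/null 2>&1
then
  leftover=yes
else
  leftover=no
fi
echo "worker exit status: $status, temporary files left behind: $leftover"
```

Using rm -f in the handler avoids an error message if the signal arrives before any temporary file has been created.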
Earlier in this article, I demonstrated that using set -e causes a script to terminate when a command returns a non-zero exit status. The trap command offers something similar. It is not really a signal as such, but it fires under the same conditions that set -e acts on: it traps a non-zero exit status from a command, using the ERR keyword. ERR goes in the signal list of the trap command. In the following example, a non-existent file is copied, which invokes an error:
#!/bin/bash
# trap1b
trap 'echo I have error in my script..' ERR
cp /home/dxtans/afile /tmp
When executed, the output is:
$ trap1b
cp: /home/dxtans/afile: A file or directory in the path name does not exist.
I have error in my script..
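One difference from set -e is worth seeing in action: an ERR trap on its own reports the failure but does not end the script, unless the trap's command list calls exit. The sketch below runs a short script in a child bash (ERR is a bash feature) and uses the false command as a stand-in for any failing command.

```shell
out=$(bash <<'EOF'
# Report the failing status; no exit, so the script carries on
trap 'echo "trap: a command returned status $?"' ERR
false
echo "still running after the ERR trap"
EOF
)
echo "$out"
```

Adding exit to the command list, or combining the trap with set -e, gives a report followed by termination.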
There are two variables that come in handy when dealing with traps to give you more information on the script termination: LINENO and BASH_COMMAND. BASH_COMMAND is exclusive to bash. These report, or attempt to report, the line number that the script is currently executing and the command that is currently running. The following example, Listing 6 below, demonstrates this. The script executes a list of echo and sleep commands. When the script is sent a SIGHUP, SIGINT or SIGQUIT, the script terminates. A message displays containing the line number and command when the signal was caught; the script then exits (from the exit command on the trap command list). Notice that the trap calls the function my_exit to display the information. By passing the parameters $1 (LINENO) and $2 (BASH_COMMAND), it also logs a message to /var/adm/messages of the event. Because $BASH_COMMAND is not quoted in the trap, only the first word of the command reaches $2. Other clean up commands would be put in this function, if required.
Listing 6. trap4
#!/bin/bash
# trap4
trap 'my_exit $LINENO $BASH_COMMAND; exit' SIGHUP SIGINT SIGQUIT
my_exit()
{
echo "$(basename $0) caught error on line : $1 command was: $2"
logger -p notice "script: $(basename $0) was terminated: line: $1, command was $2"
# cleanup commands here if any
}
echo 1
sleep 1
echo 2
sleep 1
echo 3
Running this script a couple of times, and then interrupting at different intervals, produces the following output.
$ trap4
1
2
^Ctrap4 caught error on line : 15 command was: sleep
$ trap4
1
^Ctrap4 caught error on line : 13 command was: sleep
In /var/adm/messages, we have an entry for the script termination:
Apr 6 12:12:46 rs6000 user:notice dxtans: script: trap4 was terminated: line: 13, command was sleep
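The messages above show only the command name, sleep, because the unquoted $BASH_COMMAND is split into words before my_exit receives it. If the arguments are wanted as well, the variable can be double-quoted inside the trap string. The following sketch is a variation on the listing, not the original: it runs in a child bash and signals itself with SIGTERM so that it needs no keyboard input.

```shell
out=$(bash <<'EOF'
my_exit() {
  echo "caught signal on line: $1 command was: $2"
}
# Double quotes keep the whole command text together as one argument
trap 'my_exit "$LINENO" "$BASH_COMMAND"; exit 1' TERM
kill -TERM $$     # stand-in for an external kill
echo "not reached"
EOF
) || true
echo "$out"
```

The reported command now includes the arguments as they were written in the script, which is usually more useful in a log entry.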
There are occasions when you will want to ignore certain signals. Perhaps you wish to prevent someone hitting Ctrl-C or Ctrl-\ on the keyboard by mistake when your script is doing some processing on large files, and you wish it to complete, without user interruption. The following segment of code achieves this:
trap '' SIGINT SIGQUIT
You can also ignore certain signals during a portion of your script, then reinstate them later on when you do wish to catch the signals so you can take some form of action. The script contained in Listing 7 below ignores the signals SIGINT and SIGQUIT until after the first sleep command has finished. Then, when the next sleep command starts, trap takes action if the signals are sent and terminates the script. As in the previous examples, you can assume the sleep commands represent some form of processing.
Listing 7. trapoff_on
#!/bin/bash
# trapoff_on
trap '' SIGINT SIGQUIT
echo "you cannot terminate using ctrl-c or ctrl-\, "
# heavy processing goes on here, cannot interrupt !
sleep 10
trap 'echo terminated; exit' SIGINT SIGQUIT
# user can now interrupt
echo "ok you can now terminate me using those keystrokes"
sleep 10
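A signal that arrives while it is being ignored is discarded, so the script never learns that someone tried to stop it. An alternative, sketched below under stated assumptions, is to record the request in a variable during the protected section and act on it afterwards. The stop_requested variable is hypothetical, and the script sends itself SIGTERM as a stand-in for a user pressing Ctrl-C mid-processing.

```shell
stop_requested=no
# Record the request instead of ignoring it
trap 'stop_requested=yes' INT QUIT TERM

echo "heavy processing started"
kill -TERM $$          # stand-in for a signal arriving mid-processing
echo "heavy processing finished"

trap - INT QUIT TERM   # back to the default actions
if [ "$stop_requested" = yes ]
then
  echo "a stop was requested during processing, finishing up now"
fi
```

This keeps the processing uninterrupted while still honouring the request at a safe point.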
Scripts that contain child processes also need to be addressed. Assuming you wish to terminate any child processes, you need to kill these as well. This is accomplished using the trap command, as demonstrated in Listing 8 below. In this example, two sleep commands are used as the child processes. These are put into the background; as each process is started, its PID is appended to the variable $pid. This variable holds the two PIDs of the child (sleep) processes.
To kill the main script, a SIGHUP, SIGINT, SIGQUIT or SIGTERM is sent. Upon catching this signal, a kill command is issued to the PIDs of the child processes contained in the variable $pid. Once completed, the script exits. The wait at the end of the script waits for the child processes to terminate or complete. Further signal traps may be required within the child scripts to do further cleaning up before exit. Clearly, this depends on your type of processing.
The following example kills the children when the parent is sent one of the signals.
Listing 8. trapchild
#!/bin/bash
# trapchild
sleep 120 &
pid="$!"
sleep 120 &
pid="$pid $!"
echo "my process pid is: $$"
echo "my child pid list is: $pid"
trap 'echo I am going down, so killing off my processes..; kill $pid; exit' SIGHUP SIGINT SIGQUIT SIGTERM
wait
Upon execution of the script, the following displays:
$ /home/dxtans/trap/trapchild
my process pid is: 6553626
my child pid list is: 5767380 6488072
Check from the terminal that the processes are running, along with the child processes (the two sleep commands).
$ ps -ef |grep trapchild
root 6553626 5439516 0 20:51:32 pts/1 0:00 /bin/bash /home/dxtans/trap/trapchild
$ ps -ef |grep sleep
root 5767380 6553626 0 20:51:32 pts/1 0:00 sleep 120
root 6488072 6553626 0 20:51:32 pts/1 0:00 sleep 120
Let's now send a SIGTERM to the parent process. The script
terminates and terminates the child processes.
$ kill -15 6553626
The script then terminates with the following output:
$ /home/dxtans/trap/trapchild
my process pid is: 6553626
my child pid list is: 5767380 6488072
I am going down, so killing off my processes..
Check that nothing is returned after the termination:
# ps -ef |grep sleep
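The same check can be done from inside a script with kill -0, which tests whether a PID still exists without sending a signal. The sketch below is a self-contained variant of the child-killing trap: it differs from the listing in that the handler does not exit, so the script can verify the result, and the script sends SIGTERM to itself in place of the kill -15 issued from another terminal. The alive counter is a hypothetical variable.

```shell
sleep 30 &
pid="$!"
sleep 30 &
pid="$pid $!"

trap 'echo "killing children: $pid"; kill $pid' TERM
kill -TERM $$     # stand-in for kill -15 sent from another terminal
wait              # returns once both children have gone
trap - TERM

# Count any children that are still present
alive=0
for p in $pid
do
  if kill -0 "$p" 2>/dev/null
  then
    alive=$((alive + 1))
  fi
done
echo "children still running: $alive"
```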
Using traps within your scripts requires a little extra effort. The result can be that when a trappable signal is inbound to your script, you will be in a good position to take action.





