Level: Intermediate
Martin Streicher (martin.streicher@gmail.com), Chief Technology Officer, McClatchy Interactive
25 Sep 2007
This month, discover ten more secrets of the UNIX® command-line wizards.
This is the thirteenth installment of the "Speaking UNIX" series: an ominous
number, I thought, until I scanned the Internet for the origin of the number's
nefariousness. As it turns out, thirteen is "good" and "bad" in roughly equal
proportions (see Resources).
The good: Thirteen is the atomic number of aluminum, the container of choice for
countless libations; basketball pro Wilt Chamberlain wore number thirteen (and you
all know how "lucky" Wilt was); and in a kind of taboo transform, thirteen is the
seventh prime number, and seven is very lucky.
The bad: There are (evidently) thirteen steps to the gallows; party crashers Loki
and Judas were thirteenth to arrive; and no matter how you cut it—by two,
three, four, or six—a table of thirteen is going to be hard to seat in a
restaurant, which might explain why Loki and Judas are remembered as outsiders.
At best, the jury is hung on thirteen. So, unless you're reading this on
Friday the 13th, on the thirteenth floor of an office building erected at 1313
Mockingbird Lane (a spot of land with its own history), it's time to celebrate.
"Speaking UNIX" is now a pimply-faced teenager. Here are ten command-line
concoctions and shell snookers to celebrate its passage into puberty. Mazel tov!
Set an environment variable temporarily
Environment variables, such as EDITOR and TZ, influence the results of commands.
(The former chooses what program to launch to edit text; the latter specifies your
time zone.) You typically set environment variables in your shell startup files to
affect your shell session as a whole, and you can change the value for a shell
session at any time with a command like export TZ=GMT.
Additionally, you can temporarily alter the value of an environment variable for
a single command. Simply place the assignment at the start of the command line,
before the command you want to run. For example, to change your preferred editor
for a single command, preface it with EDITOR=editor, as in:
$ printenv
...
EDITOR=vi
...
$ EDITOR="pico" less bigfile
This combination pages bigfile in less. If you type v in less to edit the file,
pico is launched instead of vi.
Here's another practical use:
$ date
Sun Aug 5 16:14:17 EDT 2007
$ TZ="Japan" date
Mon Aug 6 05:14:06 JST 2007
The temporary change to TZ affects how the immediate instance of
date interprets the current date and time of the
system.
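To confirm that such an assignment is scoped to one command, a minimal sketch along these lines can help. GREETING is a made-up variable used only for illustration; any exported variable should behave the same way.

```shell
# GREETING is a hypothetical variable, not one that any real program reads.
GREETING="hello"
export GREETING

# The prefix assignment is visible only to the child process started on this line.
during=$(GREETING="goodbye" sh -c 'printf "%s" "$GREETING"')

# Back in the current shell, the original value is untouched.
after=$GREETING

echo "child saw: $during"
echo "shell has: $after"
```

The child process sees the temporary value, while the current shell keeps the exported one.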
Discover what you're really running
A great number of shell features affect how the command name you type is
interpreted. Each shell has an assortment of built-in commands; the PATH
environment variable specifies the list and order of directories to search; and
each alias acts as shorthand. With so many ways to run a program, how do you know
what you're actually executing? Use the built-in
type command of the shell to reveal the truth.
Say that you have these shell settings:
PATH=/bin:/usr/local/bin:/usr/bin
alias vi=pico
You can find copies of Perl in both /usr/bin and /usr/local/bin. To find
which Perl you're using, type type perl.
$ perl -v
This is perl, v5.8.7 built for darwin-2level
$ type perl
perl is /usr/local/bin/perl
$ type -a perl
perl is /usr/local/bin/perl
perl is /usr/bin/perl
$ type -a -w perl
perl: command
perl: command
The type perl command reveals how the
perl command is interpreted on the command line. Here,
/usr/local/bin/perl is the expansion. The type -a
command reveals all instances of Perl that the shell is aware of, which depends
largely on the PATH variable.
Try type with some other commands you typically use:
$ type -a vi
vi is an alias for pico
vi is /usr/bin/vi
$ type -a cd
cd is a shell builtin
cd is /usr/bin/cd
The type command reveals that
vi is actually an alias for pico. The
type command also shows that
cd is a built-in command and is duplicated
externally as /usr/bin/cd.
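In a script, the same kind of check can be made before relying on a command. The sketch below assumes a POSIX-style shell. The exact wording that type prints differs between shells, so the example matches only on the word "builtin".

```shell
# Ask the shell how it would resolve the name cd.
cd_kind=$(type cd)
echo "$cd_kind"

# command -v prints the path (or alias or builtin name) the shell would use.
sh_path=$(command -v sh)
echo "sh resolves to: $sh_path"

case "$cd_kind" in
    *builtin*) echo "cd is handled by the shell itself" ;;
    *)         echo "cd comes from somewhere else" ;;
esac
```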
Make find more portable
You've seen many, many uses of find over the past
year, but I omitted one option that makes the find
command lines portable to other operating systems.
By convention, it's unusual to find file names with spaces on a UNIX®
system. However, lengthier, more descriptive file names are common in Mac OS X and
Microsoft® Windows®, and they are becoming more commonplace on UNIX, as
the operating system accumulates more desktop features. After all, saving a report
as 2007 Business Plan is much more obvious than bizplan07.ooo.
The find command enumerates long file names with
embedded special characters but, if you want to combine
find with another command, it's safest to separate the
individual file names in the list with a NUL character rather than a space. Let's
see the difference.
Let's say you have three files, each with one or more spaces in the name:
$ ls -1
Business Plan 2007
Expense Report
Pictures from Spain
If you run find on such a batch of files and pass the
list of results to xargs, the spaces in the file names
cause errors:
$ find . -type f -print | xargs ls -1
ls: ./Business: No such file or directory
ls: ./Expense: No such file or directory
ls: ./Pictures: No such file or directory
ls: 2007: No such file or directory
ls: Plan: No such file or directory
ls: Report: No such file or directory
ls: Spain: No such file or directory
ls: from: No such file or directory
The find command prints one path per line: ./Business Plan 2007,
./Expense Report, and ./Pictures from Spain. By default, however, xargs splits
its input on any blank or newline to produce a list of files to operate on. Here,
because the file names embed spaces, the rule produces the wrong list, as
evidenced above.
The proper, portable technique is to use find -print0,
combined with xargs -0, to delimit file names with the
NUL character. Here's the favored approach:
$ find . -type f -print0 | xargs -0 ls -1
./Business Plan 2007
./Expense Report
./Pictures from Spain
By the way, if you want to preview the commands that
xargs produces, add the option
-p or -t. The
-p option displays each fabricated command and prompts
you for authorization. Type upper- or lowercase y to
run the command and anything else to reject it. The -t
option echoes each command to stderr before each
command is executed.
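The effect of -print0 and -0 can be checked without touching real data. The following is a minimal sketch that counts how many arguments xargs hands to a command in each mode. It assumes the scratch directory returned by mktemp has no spaces in its own path.

```shell
# Build a scratch directory with two file names that contain spaces.
workdir=$(mktemp -d)
touch "$workdir/Business Plan 2007" "$workdir/Expense Report"

# Newline-delimited output: xargs splits each name at every space.
unsafe=$(find "$workdir" -type f -print | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-delimited output: each name arrives as a single argument.
safe=$(find "$workdir" -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "arguments without -print0: $unsafe"
echo "arguments with -print0:    $safe"

rm -rf "$workdir"
```

With two files on disk, only the NUL-delimited pipeline should report exactly two arguments.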
Have even more fun with find
While find is infinitely useful, it has two implicit
settings that limit its results (and might lead you to scratch your head):
-name matching is case-sensitive and file system
traversals do not follow symbolic links.
Hence, a command that begins find -name '*plan*' omits
files with the string Plan anywhere in the name, and it fails to catalog your
music when your home directory has a symbolic link named music that points to
your terabyte-scale media store mounted on /media/music.
You can override case-sensitive matches with -iname,
and you can traverse symbolic links with -follow.
Here's an example that applies both options:
$ alias ls='ls -aF'
$ ls -1
bin/
lib/
src/
tomb/
tunes@
$ find . -name '*music*' -type f -print
$ find . -iname '*music*' -type f -print
$ find . -name '*music*' -type f -follow -print
$ find . -iname '*music*' -type f -follow -print
./tunes/Muse/Origin Of Symmetry/04 Hyper Music.m4a
./tunes/Radiohead/OK Computer/04 Exit Music (For A Film).mp3
As indicated by the @ sign annotation produced by the -F option, tunes is a
symbolic link. To find all songs with any variant of the string "music" in the
name, you must use -iname '*music*'. To traverse into the hierarchy that
tunes points to, you must use -follow.
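To make find even more portable and akin to the search features of Spotlight,
say, use -follow -iname pattern -print0. The -print0 action goes last because
find evaluates its expression from left to right, so an action placed before
the tests prints every file it visits.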
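The two defaults are easy to reproduce in a scratch directory. This sketch builds a small tree with a symbolic link (the directory and file names are invented for the example) and compares a plain search against one that uses -follow and -iname.

```shell
# Scratch layout: home/tunes is a symbolic link into a separate media tree.
root=$(mktemp -d)
mkdir -p "$root/media/Muse" "$root/home"
touch "$root/media/Muse/04 Hyper Music.m4a"
ln -s "$root/media" "$root/home/tunes"

# Case-sensitive, links not followed: nothing matches.
plain=$(find "$root/home" -name '*music*' -type f -print | wc -l | tr -d ' ')

# Links followed and case ignored: the song is found.
both=$(find "$root/home" -follow -iname '*music*' -type f -print | wc -l | tr -d ' ')

echo "plain search: $plain, with -follow and -iname: $both"
rm -rf "$root"
```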
Collect the output of many commands the easy way
You can easily capture the output of a command line by using the > output and
>> output modifiers, where the former creates or overwrites the file output and
the latter appends to output. You can combine the two modifiers to generate a
transcript of a series of commands, which is useful if you're trying to snapshot
system state, for example:
$ ps > state.`date '+%F'`
$ w >> state.`date '+%F'`
The back tick or back quote operator
(``) expands commands in place. A command between back
ticks runs as the shell interprets the command line, and the output of the command
is used in the final expansion. Here, the single quotation marks around the
argument keep it intact, preventing the shell from interpreting
+ and %.
After the two commands, the file state.YYYY-MM-DD, such as
state.2007-08-05, is created with contents similar to
this:
PID TTY TIME CMD
9997 pts/1 00:00:00 zsh
10351 pts/1 00:00:00 ps
17:56:04 up 21 days, 2:53, 2 users, load average: 0.89, 0.94, 0.91
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
adamgood pts/0 c-67-169-182-255 Sat17 0.00s 0.37s 0.36s pine
mstreich pts/1 cpe-071-065-224- 17:17 0.00s 0.01s 0.00s w
Typing the back tick operation each time is a hassle, though. You could replace
the sequence with this:
$ file=state.`date '+%F'`
$ ps > $file
$ w >> $file
But that's only a little more efficient and still error prone, because it's
rather easy to use > instead of
>> in the second or subsequent command.
The easiest way to capture the output of a series of commands is to combine them
within braces ({ }).
$ { ps; w; } > state.`date '+%F'`
The ps command runs (listing the user's current
processes), followed by w (which shows who is using the
machine), and the collected output is captured in a file.
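As a small, self-contained illustration of the same grouping, the sketch below substitutes two echo commands for ps and w so that the result is predictable. The scratch file name comes from mktemp and is an assumption of the example.

```shell
# snapshot is a hypothetical scratch file created just for this example.
snapshot=$(mktemp)

# Both commands write to the file through a single redirection.
{ echo "first command"; echo "second command"; } > "$snapshot"

lines=$(wc -l < "$snapshot" | tr -d ' ')
echo "captured $lines lines"
rm -f "$snapshot"
```

Because the redirection applies to the whole group, there is no second > or >> to mistype.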
Note: You can also embed a sequence of commands in parentheses to achieve
the same result; however, there is one important difference. The series of commands collected
in parentheses runs in a subshell and does not affect the state of the
current shell.
For example, you might expect the sequence:
$ { cd $HOME; ls -1; }; pwd
to produce the same output as:
$ (cd $HOME; ls -1); pwd
The two differ, however. The commands in braces change the working directory of the current
shell. The latter technique is inert. Whether to use a combination or a subshell
depends on your intentions, although the subshell is much more powerful, as
described next.
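The difference can be observed directly by recording the working directory after each form. This is a minimal sketch that uses / as the target directory so that it runs anywhere.

```shell
start=$(pwd)

# Parentheses: cd happens in a subshell, so the current shell stays put.
(cd / && ls > /dev/null)
after_parens=$(pwd)

# Braces: cd happens in the current shell, so the working directory changes.
{ cd / && ls > /dev/null; }
after_braces=$(pwd)

echo "after ( ): $after_parens"
echo "after { }: $after_braces"

# Return to where the example started.
cd "$start"
```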
Subshells to the rescue!
While it's common to run a subshell to pipe aggregated output to a single
command, you can also use a subshell to expand a command in place, just like back
ticks. Better yet, a subshell can contain another subshell, so expansions can be
nested, too.
Let's start simply.
$ { ps; w; } > state.$(date '+%F')
This command is identical to { ps; w; } > state.`date '+%F'`.
The $( ) notation runs the commands within the
parentheses, and then replaces itself with the output. In other words,
$() expands in place, just like back ticks. However,
unlike back ticks, $( ) can be very complex and can
even include other $( ) expansions. Here are some
examples:
$ (cd $(grep strike /etc/passwd | cut -f6 -d':'); ls)
This command searches the password file for the entry for user strike, clips the
home directory field (field six in the password file, counting from one),
changes to that directory, and lists its contents. The output of
grep strike /etc/passwd | cut -f6 -d':' is expanded
in place before any other operation.
Here's another example, this time with the user name taken from the environment
using whoami:
$ (cd $(grep $(whoami) /etc/passwd | cut -f6 -d':'); ls)
Because the subshell has so many uses, you might prefer to use it always instead of
a combination or the back tick operators.
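To see why nesting favors $( ), compare the two notations on a pair of standard utilities. The path below is only a string handed to dirname and basename; it does not need to exist.

```shell
# The inner $( ) runs first and its output becomes an argument to basename.
parent=$(basename "$(dirname /usr/local/bin/perl)")
echo "$parent"

# The same nesting with back ticks requires escaping the inner pair.
parent_bt=`basename \`dirname /usr/local/bin/perl\``
echo "$parent_bt"
```

Both lines print bin, but the back tick version becomes harder to read with every added level.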
Stop typing long path names
Features, such as the PATH and MANPATH environment variables, conserve typing. Both
variables define a series of directories to search for executables and man pages,
respectively.
The shell supports another search path: CDPATH. As its name implies, CDPATH
enumerates a list of directories to search for a named directory. Let's see how it
works.
Assume that you have three directories—tomb, current, and
personal—in your home directory. The tomb directory contains old work
projects; current contains things you actively work on; and personal contains
files and the like for your interests. Performing
ls -R tomb current personal reveals something like
this:
$ ls -R tomb current personal
current:
./ ../ einstein/ herbie/
personal:
./ ../ fishing/ novel/
tomb:
./ ../ mariner/ marvin/ voyager/
Given this structure, and without CDPATH, changing to any directory requires that
you remember where a folder is located and type its fully qualified (or relative)
path name:
$ cd ~/tomb/mariner
$ cd ~/personal/novel
$ cd ~/current/einstein
To simplify this work, set CDPATH to the list of directories you'd like to search
for a named directory:
$ export CDPATH=.:~/:..:../..:
This is the minimum setting for CDPATH. It searches, in order, the current
directory (., or "dot"), your home directory (~/), the
parent directory (.., or "dot dot"), and the
grandparent directory (../..). The minimum setting
tends to prefer local directories and relatively close directories.
With this CDPATH set, you can quickly change to any of your topmost directories:
$ pwd
/tmp
$ cd current
/home/strike/current
$ cd personal/fishing
/home/strike/personal/fishing
$ cd novel
/home/strike/personal/novel
$ cd /tmp
$ cd personal/novel
/home/strike/personal/novel
$ cd /tmp
$ cd novel
cd: no such file or directory: novel
In each but the last cd command, the argument matched a directory found
through the CDPATH. The last command fails because novel lives in the personal
directory, which is not yet in the CDPATH, and /tmp is not close enough for a
relative entry to reach it.
If you want to search the personal directory and the other two directories, add
them after the last colon or in whatever order you prefer to search. Add the three
directories, assuming that the previous export command
is in your shell startup file:
$ export CDPATH=$CDPATH:~/current:~/tomb:~/personal
Now, you can simply type the name of the directory you want to switch to:
$ cd current
/home/strike/current
$ cd /tmp
$ cd einstein
/home/strike/current/einstein
$ cd fishing
/home/strike/personal/fishing
$ cd personal/novel
/home/strike/personal/novel
As with PATH and MANPATH, if more than one entry in the CDPATH contains a match,
searching ends at the first match. For example, if you add a directory named
novel to tomb, a cd novel command yields
~/tomb/novel.
$ mkdir ~/tomb/novel
$ cd /tmp
$ cd novel
/home/strike/tomb/novel
$ cd personal/novel
/home/strike/personal/novel
CDPATH works best when its entries contain unique directory names. Otherwise,
type enough of the path to differentiate, as was done with personal/novel.
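If you'd rather experiment before editing a startup file, a throwaway tree works well. This sketch recreates the personal and tomb directories under a scratch location from mktemp (an assumption of the example) and sets CDPATH only for the current shell.

```shell
# Scratch tree that mimics the personal and tomb directories from the text.
base=$(mktemp -d)
mkdir -p "$base/personal/novel" "$base/tomb/mariner"
start=$(pwd)

# Search the current directory first, then the two scratch directories.
CDPATH=".:$base/personal:$base/tomb"

cd "$base"
# cd prints the directory it picked from CDPATH; discard that here.
cd novel > /dev/null
landed=$(pwd)
echo "cd novel landed in: $landed"

# Clean up: leave the scratch tree, drop CDPATH, remove the tree.
cd "$start"
unset CDPATH
rm -rf "$base"
```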
Make less work more
You've seen many, many examples of how extensively text files are used in a UNIX
system. Most system startup files are text files, as are shell scripts,
configuration files and, of course, data files. After a text editor, the
next most useful utility is a pager, an application that lets you browse
text files page by page.
The application less is one of the most popular pagers, and it offers a
raft of options to tweak its behavior. In fact, you can set the LESS environment
variable to a list of options to control how less works by default. Here's a
collection of useful options:
- -N displays line numbers.
- -m displays the current position in the file as a percentage.
- -s "squeezes," or reduces, multiple blank lines into a single blank line.
- -x4 sets a tab stop every four spaces.
Spend some time with the less man page to find the options most helpful to you.
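As a starting point, the four options listed above could be combined as follows. Which startup file to place the lines in (for example, ~/.profile) depends on your shell and is left as an assumption.

```shell
# Combine the options described above into the default set for every less session.
LESS="-N -m -s -x4"
export LESS
```

Options given on the command line are processed after those in LESS, so you can still override a default for a single invocation.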
Read a file from bottom to top
Many files on a UNIX system grow and grow until truncated or archived. For
instance, most important system processes, such as e-mail transport and remote
access, continuously log activity, appending each new entry to the end of the
file. And it's the end of the log file that's most interesting. If a
service crashes, the events that occurred at the very end provide the most clues.
There are two ways to display the lines in a file in reverse order:
tac (the reverse of cat) and
the command tail -r.
$ cat smallfile
a
b
c
$ tac smallfile
c
b
a
$ tail -r smallfile
c
b
a
You might find tac more practical, because it emits the
entire file, unlike tail, which truncates the output to
some number of lines. For instance, you can combine tac
and less to create an alias that pages files in
reverse:
$ alias rless="LESSOPEN='|tac %s' less"
$ rless smallfile
c
b
a
The rless alias temporarily sets LESSOPEN, an
environment variable specific to less, to |tac %s. This
forces each file (the %s is a placeholder for the file
name) to be pre-processed (hence the pipe, |) by
tac.
Here's another variation of the same trick, but one that leverages
perl instead of tac, which
might not be available on your system:
$ LESSOPEN="|perl -e 'print reverse (<>)' %s" less small
The line of perl says, "Read all input lines into an anonymous array
(<>), reverse the order of the elements,
and print the new array."
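Because tac is not installed everywhere, the two approaches can be wrapped in one helper. The function name reverse_lines is invented for this sketch, which assumes that at least one of tac or perl is available.

```shell
# reverse_lines is a hypothetical helper: prefer tac, fall back to perl.
reverse_lines() {
    if command -v tac > /dev/null 2>&1; then
        tac "$@"
    else
        perl -e 'print reverse <>' "$@"
    fi
}

# With no file arguments, both tools read standard input.
reversed=$(printf 'a\nb\nc\n' | reverse_lines)
echo "$reversed"
```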
Do new math
If you need to calculate a result, there's no need to jump to a new application.
You can stay comfortably at the command line. You can use
dc, a reverse Polish notation calculator, or
bc, an entire scripting language for math. Or, if you
just need an answer fast, use the command line and the
$(( )) operator.
$ echo $(( 100 / 10 ))
10
$ echo $(( 10 ** 2 ))
100
The shell doesn't have a large collection of arithmetic operators, but the set is
sufficient for most programming tasks and includes bitwise shifts, remainders,
and comparisons.
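A few of those operators in action, as a quick sketch with integer operands:

```shell
remainder=$(( 17 % 5 ))      # remainder of integer division
shifted=$(( 1 << 4 ))        # bitwise left shift
quotient=$(( 7 / 2 ))        # division of two integers truncates
is_greater=$(( 3 > 2 ))      # comparisons yield 1 for true, 0 for false
echo "$remainder $shifted $quotient $is_greater"
```

This prints 2 16 3 1. For fractional results, bc or dc remains the better tool.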
Plenty of room to grow
"Speaking UNIX" might be thirteen, but there's still a lot to be experienced. There are
more commands and tricks to learn, a vast array of concepts to explore, not to
mention an enormous universe of open source software to boost your productivity.
Oh, and lest I forget, the braces have to come off. There's the ritual hazing by
upperclassmen, some really embarrassing moments, and going steady. Or perhaps I'm
showing my age . . . kids still go steady, right?
Thanks for reading. I hope you've enjoyed the column so far.
Resources
Learn
- Speaking UNIX: Check out other parts in this series.
- Triskaidekaphobia: Read through the origins of the number thirteen.
- Check out other articles and tutorials written by Martin Streicher.
- Popular content: See what AIX® and UNIX content your peers find interesting.
- Search the AIX and UNIX library by topic.
- AIX and UNIX: The AIX and UNIX developerWorks zone provides a wealth of information relating to all aspects of AIX systems administration and expanding your UNIX skills.
- New to AIX and UNIX?: Visit the "New to AIX and UNIX" page to learn more about AIX and UNIX.
- AIX 6 Wiki: Discover a collaborative environment for technical information related to AIX.
- Safari bookstore: Visit this e-reference library to find specific technical resources.
- developerWorks technical events and webcasts: Stay current with developerWorks technical events and webcasts.
- Podcasts: Tune in and catch up with IBM technical experts.
Get products and technologies
- IBM trial software: Build your next development project with software for download directly from developerWorks.
Discuss
- Participate in the developerWorks blogs and get involved in the developerWorks community.
- Participate in the AIX and UNIX forums.
About the author
Martin Streicher is the Chief Technology Officer of McClatchy Interactive and the Editor-in-Chief of Linux Magazine. Martin holds a Masters of Science degree in computer science from Purdue University and has been programming UNIX-like systems since 1986. You can reach Martin at martin.streicher@gmail.com.