Speaking UNIX, Part 12: Do-it-yourself projects

Build software from source code

If your UNIX® system lacks a tool you need, chances are you can find an apt solution in the enormous inventory of software available online. This month, learn how to build software from source code.

Martin Streicher (martin.streicher@gmail.com), Chief Technology Officer, McClatchy Interactive

Martin Streicher is a freelance Ruby on Rails developer and the former Editor-in-Chief of Linux Magazine. Martin holds a Master of Science degree in computer science from Purdue University and has programmed UNIX-like systems since 1986. He collects art and toys. You can reach Martin at martin.streicher@gmail.com.



21 August 2007


UNIX® systems have hundreds of utility applications or commands. Some commands manipulate the file system, while others query and control the operating system itself. A healthy number of commands provide connectivity, and an even larger set of commands can generate, permute, modify, filter, and analyze data. Given the long and rich history of UNIX, chances are your system has just the right tool for the task at hand.

Moreover, when a single utility doesn't suffice, you can combine any number of UNIX utilities in a variety of ways to create your own tool. As you've seen previously, you can leverage pipes, redirection, and conditionals to build an impromptu tool immediately on the command line, and shell scripts combine the power of a small, easy-to-learn programming language with the UNIX commands to build a tool you can reuse over and over again.

Of course, there are times when neither the command line nor a shell script is adequate. For example, if you must deploy a new daemon to provide a new network service, you might switch to a rich language, such as C or Python, to write the application yourself. And because so many applications are freely available on the Internet—freely meaning no cost, licensed under liberal terms, or both—you can also download, build, and install a suitable, working solution to meet your requirements.

Many versions of UNIX (and Linux®) provide a special tool called a package manager to add, remove, and maintain software on the system. A package manager typically maintains an inventory of all software installed locally, as well as a catalog of all software available in one or more remote repositories. You can use the package manager to search the repositories for the software you need. If the repository contains what you're looking for, all it takes is one command or a few clicks of the mouse to install a new package on your system.

A package manager is invaluable. With it, you can remove entire packages, update existing packages, and automatically detect and fulfill any prerequisites for any package. For example, if you choose software to manipulate images, such as the stalwart ImageMagick, but your system lacks the library to process JPEG images, the package manager detects and installs what is missing before it installs your package.

Yet, there are also instances where the software you need is available but is not (yet) part of any repository. Given the predominance of package management, most software comes bundled in a form you can download and install using the package manager. However, because any number of versions and flavors of UNIX are available, it can be difficult to offer every application in each package manager format for each particular variation. If your UNIX installation is mainstream and enjoys a large, popular following, chances are better that you'll find the software prebuilt and ready to use. Otherwise, it's time to roll up your sleeves and prepare to build the software yourself.

Yes, young Jedi, it's time to use the source code.

Like lifting an X-wing fighter from a swamp, building software from source might seem intimidating at first, especially if you're not a software developer. In fact, in most cases, the entire process takes but a handful of commands, and the rest is automated.

To be sure, some programs are complex to build—or take hours to build—and require manual intervention along the way. However, even these programs are typically constructed from smaller pieces that are simple to build. It's the number of dependencies and the sequence of construction that complicate the build process. Some programs also have oodles of features that you might or might not want. For instance, you can build PHP to interoperate with the new Internet Protocol version 6 (IPv6) Internet addressing scheme. If your network has yet to adopt IPv6, there's no need to include that feature. Vetting a plethora of options adds effort to the build process.

This month, examine how to build a typical UNIX software application. Before you proceed, make sure that your system has a C compiler, such as the GNU Compiler Collection, and the suite of common UNIX software development tools, including make, m4, pkg-config, and awk. In addition, ensure that all the development tools are in your PATH environment variable.
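A quick way to confirm that the tools are reachable is to ask the shell to locate each one. The following is a minimal sketch, not a definitive check: the function name missing_tools is made up for this example, and cc is assumed to be the name under which your C compiler is installed.

```shell
# Print the name of each requested tool that is not on the PATH.
# missing_tools is an illustrative name, not a standard command.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can locate the tool
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools named in the text; cc is assumed to be the C compiler's name.
missing=$(missing_tools cc make m4 pkg-config awk)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All development tools found"
fi
```

If anything is reported missing, install it with your package manager or adjust PATH before you continue.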

Good things come in software packages

As an illustrative and representative example, let's configure, build, and install SQLite—a small library that implements a Structured Query Language (SQL) database engine. SQLite requires no configuration to use and can be embedded in its entirety in any application, and databases are contained in a single file. Many programming languages can call SQLite to persist data. SQLite also includes a command-line utility aptly named sqlite3 that manages SQLite databases.

To begin, download SQLite (see Resources). Pick the most current source code bundle, and download it to your machine. (As of this writing, the most recent version of SQLite was version 3.3.17, released on 25 April 2007.) This example uses the file stored as http://www.sqlite.org/sqlite-3.3.17.tar.gz.

When you have the file, unpack it. The .tar.gz extension reflects how the archive was constructed. In this case, it's a gzipped, tar archive. The latter extension, .gz, stands for gzip (compression); the former extension, .tar, stands for tar (an archive format). To extract the contents of the archive, reverse the steps used to build it. First decompress the file with gunzip, and then unpack the resulting archive with tar:

$ gunzip sqlite-3.3.17.tar.gz
$ tar xvf sqlite-3.3.17.tar

These two commands create a replica of the original source code in a new directory named sqlite-3.3.17. By the way, the .tar.gz file format is quite common (it's called a tarball), and you can unpack a tarball using the tar command directly:

$ tar xzvf sqlite-3.3.17.tar.gz

This single command is equivalent to the two previous commands.
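Not every tarball unpacks into a single, tidy directory, so it can be worth listing an archive before you extract it. The sketch below builds a tiny stand-in tarball in a scratch directory (demo-1.0 is a made-up package name, not a real download), lists it with the t option, and then extracts it with the x option.

```shell
# Work in a scratch directory so nothing outside it is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# Build a tiny stand-in tarball (demo-1.0 is a made-up package name).
mkdir demo-1.0
echo "Read me first" > demo-1.0/README
tar czf demo-1.0.tar.gz demo-1.0
rm -r demo-1.0

# t lists the contents of the archive instead of extracting it.
listing=$(tar tzf demo-1.0.tar.gz)
echo "$listing"

# x extracts; the directory reappears with its contents intact.
tar xzf demo-1.0.tar.gz
cat demo-1.0/README
```

With the SQLite download, the equivalent listing command would be tar tzf sqlite-3.3.17.tar.gz.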

Next, change the directory to sqlite-3.3.17, and use ls to list the contents. You should see a manifest like Listing 1:

Listing 1. A manifest of the SQLite package
$ ls
Makefile.in             contrib                 publish.sh
Makefile.linux-gcc      doc                     spec.template
README                  ext                     sqlite.pc.in
VERSION                 install-sh              sqlite3.1
aclocal.m4              ltmain.sh               sqlite3.pc.in
addopcodes.awk          main.mk                 src
art                     mkdll.sh                tclinstaller.tcl
config.guess            mkopcodec.awk           test
config.sub              mkopcodeh.awk           tool
configure               mkso.sh                 www
configure.ac            notes

The source code and supplemental files for SQLite are well organized and model how most software projects distribute source code:

  • A README file describes the project and usually explains how to build the software. (The README file also details what terms of use, or license, applies. A significant number of projects license code according to the terms of the GNU General Public License version 2, a so-called "copyleft" license. If you have any questions about conflicts between the license and how you intend to use the software, it's best to consult competent counsel.)
  • The src directory contains the code.
  • The test directory contains a suite of tests to validate the proper operation of the software. Running the tests after the initial build or after any modification provides confidence in the software.
  • The contrib directory contains additional software that the core SQLite development team didn't provide. For a library such as SQLite, contrib might contain programming interfaces for popular languages such as C, Perl, PHP, and Python. It might also include graphical user interface (GUI) wrappers and more.
  • Among the other files, Makefile.in, configure, configure.ac, and aclocal.m4 are used to generate the scripts and rules to build the SQLite software on your flavor of UNIX. If the software is simple enough, a quick compile command might be all that's required to build the code. But because so many variations of UNIX exist (Mac OS X, Solaris, Linux, IBM® AIX®, and HP-UX, among others), it's necessary to investigate the host machine to determine both its capabilities and its implementations. For example, a mail reader application might attempt to determine how the local system stores mailboxes and include support for the format.

Concentrate. Concentrate. Feel the source flow through you.

The next step is to probe the system and configure the software to build properly. (You can think of this step as tailoring a suit: The garment is largely the right size but needs some alteration to fit stylishly.) You customize and prepare for the build with the ./configure local script. At the command-line prompt, type:

$ ./configure

The configure script conducts several tests to qualify your system. For instance, running ./configure on an Apple MacBook computer (which runs a variation of FreeBSD® UNIX) produces the following (see Listing 2):

Listing 2. The result of running ./configure on Mac OS X
checking build system type... i386-apple-darwin8.9.1
checking host system type... i386-apple-darwin8.9.1
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for a sed that does not truncate output... /usr/bin/sed
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ld used by gcc... /usr/bin/ld
...

Here, ./configure determines the build and host system type (which can differ if you're cross-compiling), confirms that the GNU C Compiler (GCC) is installed, and finds the paths to utilities the rest of the build process might require. If you scan through the rest of your output, you'll see a long list of diagnostics that characterize your system to the extent needed to construct SQLite successfully.

Note: The ./configure command can fail, especially if a prerequisite—a system library or critical system utility, say—cannot be found.

Scan the output of ./configure, looking for anomalies, such as specialized or local versions of commands, that might not be appropriate to build a general application such as SQLite. As an example, if your systems administrator installed an alpha version of GCC and the configure tool prefers to use it, you might choose to manually override the choice. To see a list (often long) of options you can override, type ./configure --help, as shown in Listing 3:

Listing 3. General options for the ./configure script
$ ./configure --help
...
By default, `make install' will install all the files in
`/usr/local/bin', `/usr/local/lib' etc. You can specify
an installation prefix other than `/usr/local' using `--prefix',
for instance `--prefix=$HOME'.

For better control, use the options below.

Fine tuning of the installation directories:
  --bindir=DIR           user executables [EPREFIX/bin]
  --sbindir=DIR          system admin executables [EPREFIX/sbin]
  --libexecdir=DIR       program executables [EPREFIX/libexec]
...

The output of ./configure --help includes general options used with the configuration system and specific options pertinent only to the software you're building. To see the latter (shorter) list, type ./configure --help=short (see Listing 4):

Listing 4. Package-specific options for the software to build
$ ./configure --help=short
Optional Features:
  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
  --enable-shared[=PKGS]  build shared libraries [default=yes]
  --enable-static[=PKGS]  build static libraries [default=yes]
  --enable-fast-install[=PKGS]
                          optimize for fast installation [default=yes]
  --disable-libtool-lock  avoid locking (might break parallel builds)
  --enable-threadsafe     Support threadsafe operation
  --enable-cross-thread-connections
                          Allow connection sharing across threads
  --enable-threads-override-locks
                          Threads can override each others locks
  --enable-releasemode    Support libtool link to release mode
  --enable-tempstore      Use an in-ram database for temporary tables
                          (never,no,yes,always)
  --disable-tcl           do not build TCL extension
  --disable-readline      disable readline support [default=detect]
  --enable-debug          enable debugging & verbose explain

Returning to ./configure --help, the output at the very top indicates that the default installation directory for executables is /usr/local/bin, the default installation directory for libraries is /usr/local/lib, and so on. Many systems use an alternate hierarchy to store non-core software.

For example, many systems administrators choose to use /opt instead of /usr/local as the locus of locally added or locally modified software. If you want to install SQLite in a directory other than the default, specify the directory with the --prefix= option. One possible use—and a common one if you're the only person using a package or if you don't have root access to install the software globally—is to install the software in your own hierarchy within your home directory:

$ ./configure --prefix=$HOME/sw

Using this command, the install portion of the build would recreate the hierarchy of the software in $HOME/sw, as in $HOME/sw/bin, $HOME/sw/lib, $HOME/sw/etc, $HOME/sw/man, and others as needed. For simplicity, this example installs its code in the default targets.
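If you do install under a private prefix such as $HOME/sw, the shell won't find the new programs until $HOME/sw/bin is listed in your PATH. The following is one possible sketch; add_to_path is a hypothetical helper name, and you would typically place lines like these in your shell's startup file.

```shell
# Prepend a directory to PATH unless it is already listed.
# add_to_path is a hypothetical helper name, not a standard command.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, do nothing
        *) PATH="$1:$PATH"; export PATH ;;    # otherwise put it first
    esac
}

add_to_path "$HOME/sw/bin"
echo "$PATH"
```

Wrapping PATH in colons before matching lets the same pattern catch the directory whether it appears first, last, or in the middle, and the check keeps PATH from growing each time the startup file is read.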

Compile the code

The result of ./configure is a Makefile compatible with your version of UNIX. The development utility named make uses the Makefile to execute the steps required to compile and link the code into an executable. You can open the Makefile to examine it, but don't edit it, because any modifications you make will be lost if you run ./configure again.

The Makefile contains a list of source files to build, and it also includes constants that enable, disable, or select certain snippets of code in the SQLite package. For instance, code specific to 64-bit processors might be enabled if the configure tool detected a suitable chip within your system. The Makefile also expresses dependencies among source files, so a change in an all-important header (.h) file might cause recompilation of all the C source code.

Your next step is to run make to build the software (see Listing 5):

Listing 5. Running make
$ make 
sed -e s/--VERS--/3.3.17/ ./src/sqlite.h.in | \
  sed -e s/--VERSION-NUMBER--/3003017/ >sqlite3.h

gcc -g -O2 -o lemon ./tool/lemon.c

cp ./tool/lempar.c .

cp ./src/parse.y .

./lemon  parse.y

mv parse.h parse.h.temp

awk -f ./addopcodes.awk parse.h.temp >parse.h

cat parse.h ./src/vdbe.c | awk -f ./mkopcodeh.awk >opcodes.h

./libtool --mode=compile --tag=CC gcc -g -O2 -I. -I./src \
  -DNDEBUG  -I/System/Library/Frameworks/Tcl.framework/Versions/8.4/Headers \
  -DTHREADSAFE=0 -DSQLITE_THREAD_OVERRIDE_LOCK=-1 \
  -DSQLITE_OMIT_LOAD_EXTENSION=1 -c ./src/alter.c

mkdir .libs

gcc -g -O2 -I. -I./src -DNDEBUG \
  -I/System/Library/Frameworks/Tcl.framework/Versions/8.4/Headers \
  -DTHREADSAFE=0 -DSQLITE_THREAD_OVERRIDE_LOCK=-1 \
  -DSQLITE_OMIT_LOAD_EXTENSION=1 -c ./src/alter.c  -fno-common \
  -DPIC -o .libs/alter.o
...
ranlib .libs/libtclsqlite3.a
creating libtclsqlite3.la

Note: In the output above, blank lines have been added to better highlight each step that make initiates.

The make utility compares the modification dates of files (header files, source code, data files, and object files) and rebuilds only the targets that are older than the files they depend on. Initially, make builds everything, because no object files or build targets exist. As you can see, the rules to build the targets include intermediate steps, too, that use tools, such as sed and awk, to produce header files that are used in later steps.
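To make the date comparison concrete, here is a rough shell imitation of the decision make takes for a single target. It is only a sketch: needs_rebuild is an illustrative name, the file names and timestamps are invented, and real make handles many cases this does not. The -nt ("newer than") test is assumed to be available, as it is in common shells such as bash, dash, and ksh.

```shell
# Return success (0) if the target must be rebuilt: it is missing,
# or at least one prerequisite is newer than it.
# needs_rebuild is an illustrative name; make does this internally.
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for prereq in "$@"; do
        [ "$prereq" -nt "$target" ] && return 0
    done
    return 1
}

work=$(mktemp -d)
touch -t 200704250900 "$work/alter.c"    # source file, older
touch -t 200704251000 "$work/alter.o"    # object file, newer
needs_rebuild "$work/alter.o" "$work/alter.c" || echo "alter.o is up to date"

touch -t 200704251100 "$work/alter.c"    # simulate editing the source
needs_rebuild "$work/alter.o" "$work/alter.c" && echo "alter.o must be rebuilt"
```

The first check reports that the object file is current; after the source file's date moves forward, the second check reports that a rebuild is due.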

The result of the make command is a finished library and the sqlite3 utility.

Although a test suite is neither mandatory nor provided in every package, it's a good idea to test the software you just built. A successful build doesn't necessarily mean that the software functions properly.

To test your software, run make again with the test target (see Listing 6):

Listing 6. Testing the software
$ make test
...
alter-1.1... Ok
alter-1.2... Ok
alter-1.3... Ok
alter-1.3.1... Ok
alter-1.4... Ok
...
Thread-specific data deallocated properly
0 errors out of 28093 tests
Failures on these tests:

Success! The software built fine and works correctly. If one or more test cases did fail, the summary at the bottom (here, it's blank) would report which test or tests require investigation.
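With tens of thousands of test lines scrolling by, you might prefer to let the shell pull out the summary. The sketch below assumes the summary line has exactly the form shown in Listing 6 ("N errors out of M tests"); other packages, and other SQLite releases, may print something different. The function name error_count is made up for this example, and the sample text stands in for real output.

```shell
# Extract the error count from a summary line such as
# "0 errors out of 28093 tests". Prints nothing if no such line exists.
error_count() {
    sed -n 's/^\([0-9][0-9]*\) errors out of [0-9][0-9]* tests.*$/\1/p'
}

# Sample text copied from Listing 6 stands in for real output.
sample='alter-1.4... Ok
Thread-specific data deallocated properly
0 errors out of 28093 tests'

errors=$(printf '%s\n' "$sample" | error_count)
if [ "$errors" = "0" ]; then
    echo "all tests passed"
else
    echo "test run needs investigation"
fi
```

In practice, you could save the real output with make test > test.log 2>&1 and then run error_count < test.log.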

A finished product

If your software works properly, the final step is to install it on your system. Once again, use make and specify the install target. Adding software to /usr/local usually requires superuser (root) privileges provided by sudo (see Listing 7):

Listing 7. Installing the software on your local system
$ sudo make install 
tclsh ./tclinstaller.tcl 3.3
/usr/bin/install -c -d /usr/local/lib
./libtool --mode=install /usr/bin/install \
    -c libsqlite3.la /usr/local/lib
/usr/bin/install \
    -c .libs/libsqlite3.0.8.6.dylib /usr/local/lib/libsqlite3.0.8.6.dylib
...
/usr/bin/install -c .libs/libsqlite3.lai /usr/local/lib/libsqlite3.la
/usr/bin/install -c .libs/libsqlite3.a /usr/local/lib/libsqlite3.a
chmod 644 /usr/local/lib/libsqlite3.a
ranlib /usr/local/lib/libsqlite3.a
...
/usr/bin/install -c -d /usr/local/bin
./libtool --mode=install /usr/bin/install -c sqlite3 /usr/local/bin
/usr/bin/install -c .libs/sqlite3 /usr/local/bin/sqlite3
/usr/bin/install -c -d /usr/local/include
/usr/bin/install -c -m 0644 sqlite3.h /usr/local/include
/usr/bin/install -c -m 0644 ./src/sqlite3ext.h /usr/local/include
/usr/bin/install -c -d /usr/local/lib/pkgconfig; 
/usr/bin/install -c -m 0644 sqlite3.pc /usr/local/lib/pkgconfig;

The make install process creates the necessary directories (if each doesn't exist), copies the files to the destinations, and runs ranlib to prepare the library for use by applications. It also copies the sqlite3 utility to /usr/local/bin, copies header files that developers require to build software against the SQLite library, and copies the documentation to the proper place in the hierarchy.

Assuming that /usr/local/bin is in your PATH variable, you can now run sqlite3 (see Listing 8):

Listing 8. SQLite, ready to use
$ which sqlite3
/usr/local/bin/sqlite3
$ sqlite3
SQLite version 3.3.17
Enter ".help" for instructions
sqlite>

Advice for the apprentice?

A fair majority of software packages build as readily as SQLite. Indeed, you can often configure, build, and install the software with one command:

$ ./configure && make && sudo make install

The && operator runs the latter command only if the former command works without error. So, the command above says, "Run ./configure, and if that works, run make, and if that works, run sudo make install." This one command builds a package unattended. Just kick it off and go get coffee, a sandwich, or a prix fixe meal, depending on the size and complexity of the package you're building.
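You can watch the short-circuit behavior of && without building anything. In this sketch, three shell functions with invented names stand in for ./configure, make, and sudo make install; the middle one is written to fail on purpose.

```shell
# Hypothetical stand-ins for ./configure, make, and sudo make install.
configure_step() { echo "configure ok"; return 0; }
build_step()     { echo "build failed"; return 1; }   # fails on purpose
install_step()   { echo "install ok"; return 0; }

# The chain stops at the first failure, so install_step never runs.
output=$(configure_step && build_step && install_step)
status=$?

echo "$output"
echo "exit status: $status"
```

The output shows the first two steps only, and the exit status of the whole chain is that of the step that failed, which is what lets a calling script detect that the build did not complete.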

Here are some other helpful tips for building software from source code:

  • If the software package you're building requires more than the typical ./configure && make && sudo make install, keep a journal of the steps you followed to build the code. If you must rebuild the same code or build a newer version of the code, you can refer to your journal to refresh your memory. Store the journal in the same directory as the package's README file. You might even adopt a convention for the journal's file name, which makes it easy to recognize what you've built previously.
  • Better yet, if the steps required to build the software are repeatable without manual intervention, capture the process in a shell script. Later, if you must rebuild the same code, simply run the shell script. If a newer version of the code becomes available, you can modify the script as needed to add, change, or remove steps.
  • You can reclaim disk space after you've installed the software by using make clean. This rule usually removes the target files and any intermediate files, and it leaves the files required to restart the process intact. Another rule, make distclean, removes the Makefile and other generated files.
  • Keep the source of differing versions of the same code separate. This regimen allows you to compare one release to another, but it also allows you to recover a specific version of the software. Organize the source code into a local repository, say $HOME/src or /usr/local/src, depending on your scope of use (personal or global) and your local conventions.
  • Further, you might choose to prevent accidental removal or overwrites by making the source code globally read-only. Change to the directory of the source code you want to protect, and run the chmod -R a-w * command (run chmod recursively, turning off all write permissions).
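The journal and the build script can be combined: a script that records each command as it runs it leaves a journal behind automatically. What follows is a minimal sketch under that idea. The names run_logged and BUILD_JOURNAL are hypothetical, the journal is written to a temporary file for the demonstration, and harmless echo commands stand in for the real build steps.

```shell
# Append each command to a journal file, then run it.
# run_logged and BUILD_JOURNAL are hypothetical names for this sketch.
BUILD_JOURNAL=$(mktemp)

run_logged() {
    printf '%s  %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$BUILD_JOURNAL"
    "$@"
}

# Harmless stand-ins; a real script would run ./configure, make, and so on.
run_logged echo configuring &&
run_logged echo building &&
run_logged echo installing

cat "$BUILD_JOURNAL"
```

In a real build script, you might point BUILD_JOURNAL at a file next to the package's README so that the record stays with the source.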

Finally, there will be instances when source code simply won't build on your system. As mentioned above, the most frequent obstacle encountered is missing prerequisites. Read the error message or messages carefully—it might be obvious what has gone wrong.

If you cannot deduce the reason, type the exact error message and the name of the package you're trying to build into Google. Chances are very good that someone else has encountered and solved the same issue. (In fact, searching the Internet for error messages can be quite illuminating—although you might have to dig a little to find a gem.)

If you get stumped, check the software's home page for links to resources such as an IRC channel, a newsgroup, a mailing list, or an FAQ. Your local systems administrator is an invaluable font of experience, too.

The source is strong with this one

If your system lacks a tool you need, you can ad-lib one on the command line, write a shell script, write your own program, or borrow from the enormous pool of code found online. You'll be well on your way to practicing Jedi mind tricks just like me.

"This is the best article I've ever read."

Resources

Get products and technologies

  • About SQLite: You can download SQLite from here.
  • IBM trial software: Build your next development project with software for download directly from developerWorks.
