Windows-to-Linux roadmap
Part 9. Installing software
Using pre-compiled RPMs and compiling applications from source
One of the first things you notice when you install Linux is that there are so many packages available with your distribution. Most distributions come with the Linux operating system, installation tools, and administration tools. Then they include Internet tools, development tools, office tools, games, and some things that you haven't even heard of. It is not uncommon for a Linux distribution to come with thousands of available packages. If you didn't select "install everything," then some subset of these packages were installed.
Now you may be wondering "How do I remove packages I don't want? How do I install things I missed? Can I use software that didn't come with my distribution?"
RPMs
As Linux installed, you probably noticed a lot of information about RPMs being installed. RPM stands for Red Hat Package Manager, a contribution by Red Hat that has become a standard for managing software on Red Hat and UnitedLinux as well as on many other distributions.
Essentially, an RPM is a package, containing software for Linux ready to install and run on a particular machine architecture. For example, we installed the webmin package from an RPM in "Part 3. Introduction to Webmin." All of the software initially loaded in your distribution was installed from an RPM.
Anatomy of an RPM
An RPM is a package of files. It includes a .spec file, which provides information about the package, its function, and its dependencies (what packages must be in place before it can run). The .spec also contains a manifest of files in the package, where they must be loaded on the system, and what their initial permissions will be. The RPM also contains a pre-install script, which is written by the package developer. Then the RPM contains the compiled binary files. Finally, the RPM contains a post-install script.
RPM layout
- .spec
- pre-install script
- binary file
- binary file
- ...
- binary file
- post-install script
When an RPM is installed, the system first looks to see if the dependencies for the package are satisfied. If not, then the installation terminates unless you specify options to force an install anyway.
If all is clear, the pre-install script runs. This script can do anything. Normally it creates users and directories. However, it can do many types of dynamic configuration, even custom-compile source code for the running system.
If the pre-install script completes successfully, then the binary files are copied onto the system according to the manifest. Once all of the files have been copied and their permissions are set, then the post-install script is run. Again, this script can do almost anything.
Once all of that is completed, the information about the package is added to the RPM database, and the installation is complete. With this simple system, it is possible to perform all of the functions that could be done with a more elaborate commercial installer.
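Before installing, you can look at each of these pieces in a package file. Here is a minimal sketch, assuming rpm is on your path; the function name inspect_rpm and the file name used with it are only illustrations, not part of rpm itself.

```shell
#!/bin/sh
# inspect_rpm FILE: show what an RPM file would do before you install it.
# inspect_rpm is a hypothetical helper name chosen for this sketch.
inspect_rpm() {
    pkg=$1
    if [ ! -f "$pkg" ]; then
        echo "inspect_rpm: no such file: $pkg" >&2
        return 1
    fi
    if ! command -v rpm >/dev/null 2>&1; then
        echo "inspect_rpm: rpm command not found" >&2
        return 1
    fi
    echo "== Dependencies =="
    rpm -qpR "$pkg"            # -p queries a package file, -R lists what it requires
    echo "== Install scripts =="
    rpm -qp --scripts "$pkg"   # the pre- and post-install (and uninstall) scripts
    echo "== File manifest =="
    rpm -qplv "$pkg"           # -l lists the files, -v adds permissions and owners
}
```

For example, inspect_rpm MyPackage-1.0.0.i386.rpm would print the dependencies, scripts, and manifest without changing anything on the system.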
The RPM database
The piece of RPM that adds elegance is the RPM database. This database typically lives in the /var/lib/rpm directory and holds information about every RPM installed on the system. The database knows the dependency relationships between packages and will warn if removing a package could cause other packages to break. The database knows about every file that was originally installed with a package and its original state on the system. It also knows the locations of the documentation and configuration files for each package. This may sound like a lot of information, and it is. But it isn't bloated and bulky. On a system containing 1,066 packages, comprising 203,272 files, the database files are only 45 MB! RPM uses the database to check dependencies when packages are installed and removed. Users can also query the database for information on packages.
Using RPM
The program to work with RPM packages is appropriately named rpm. rpm runs in several different modes, but the most common tasks are install, upgrade, query, verify, and erase.
rpm -i (install)
When you install a package for the first time, you will use the -i or install mode. Simply point rpm at a binary package file and run it; the package will be installed on your system. Installation normally takes seconds. Often when installing a package, I add the -v (verbose) switch to provide more information about the process, and the -h (hash bar) switch to provide progress updates via hash (#) marks printed on the console as the package is installed. Here's an example of installing a package:
Listing 1. Installing MyPackage
$ rpm -ivh MyPackage-1.0.0.i386.rpm
Preparing...                ########################################### [100%]
   1:MyPackage              ########################################### [100%]
That's it! MyPackage is now installed and ready to use.
rpm -e (erase)
To remove an installed package, use the -e switch to erase it. rpm will use the database to remove all of the files for the package. If other installed packages depend on the one you are removing, rpm will abort. To force the erase, you must add the --nodeps switch. (--nodeps can also be used to force an installation.) Be very careful when using this switch to force an install or erase. Removing packages that others depend on can have unfortunate results. Here is the command to remove the package we installed above:
$ rpm -e MyPackage
Notice that the full version of the package is not necessary to remove it. The full name was required at installation because we were pointing to a file name. Installed packages are referenced by their name only. The package's name is everything up to the version number.
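The rule that the name is everything up to the version number can be sketched in a few lines of shell. This is only a heuristic that cuts at the first hyphen followed by a digit, so it will misjudge any package whose name itself contains such a sequence; pkg_name_from_file is a name I made up for the illustration.

```shell
#!/bin/sh
# pkg_name_from_file FILE: print the package-name part of an RPM file name.
# Heuristic (an assumption of this sketch): the name ends at the first
# "-" that is followed by a digit, which is where the version starts.
pkg_name_from_file() {
    base=${1##*/}                     # drop any leading directory
    printf '%s\n' "${base%%-[0-9]*}"  # strip "-<digit>..." through the end
}
```

Running pkg_name_from_file MyPackage-1.0.0.i386.rpm prints MyPackage, which is the name to pass to rpm -e. When rpm is available, asking the package file itself with rpm -qp --queryformat '%{NAME}\n' is more reliable than guessing from the file name.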
rpm -V (verify)
The verify switch is very useful. It compares the current state of a package's files to their original state upon installation. Differences are shown using a code:
Results of verifying files
Code | Meaning |
---|---|
S | File Size differs |
M | Mode differs (includes permissions and file type) |
5 | MD5 sum differs |
D | Device major/minor number mis-match |
L | readLink(2) path mis-match |
U | User ownership differs |
G | Group ownership differs |
T | mTime differs |
If you were to run rpm -V on a package and discover that the size had changed for an executable, that would be a possible sign of a security breach.
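The codes appear as a short string at the start of each line that rpm -V prints, with a dot in each position where a test passed. As a rough sketch, the helper below (explain_verify_flags, a hypothetical name) translates such a string using the table above; any character not in the table is simply ignored.

```shell
#!/bin/sh
# explain_verify_flags FLAGS: translate a verify code string such as
# "S.5....T" into the meanings listed in the table above.
explain_verify_flags() {
    flags=$1
    while [ -n "$flags" ]; do
        rest=${flags#?}          # everything after the first character
        c=${flags%"$rest"}       # the first character itself
        case $c in
            S) echo "S: file size differs" ;;
            M) echo "M: mode differs (permissions or file type)" ;;
            5) echo "5: MD5 sum differs" ;;
            D) echo "D: device major/minor number mismatch" ;;
            L) echo "L: readlink path mismatch" ;;
            U) echo "U: user ownership differs" ;;
            G) echo "G: group ownership differs" ;;
            T) echo "T: mtime differs" ;;
        esac
        flags=$rest
    done
}
```

For instance, explain_verify_flags S.5....T reports that the size, MD5 sum, and modification time of a file have changed, which is what you would expect to see for an edited configuration file.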
rpm -U (upgrade)
Once a package has been installed, any attempt to install a package with the same name will result in a message that the package is already installed. If you want to update a package to a later version, use the -U switch to upgrade. Upgrade has another effect. When upgrade is run on multiple package names, it will try to put the packages in order of dependencies. In other words, required packages will be installed first. The upgrade switch can be used whether or not a package is already installed, so many people use it for installs as well as upgrades instead of using the -i switch. Here is an example of using the upgrade switch to load several rpm packages:
Listing 2. Interactive upgrade
$ rpm -Uvh My*.rpm
Preparing...                ########################################### [100%]
   1:bMyPackageDep          ########################################### [ 50%]
   1:aMyPackageNew          ########################################### [100%]
In the case above, bMyPackageDep was a prerequisite for aMyPackageNew, so even though the file names sorted in reverse order, rpm ordered them correctly.
rpm -q (query)
Several pieces of useful information can be queried from the rpm database. Queries can be run by any user who has read access to the rpm database. By default, all users have read access. To run a query, use the -q switch with the name of the package to query. This will return the version of the package.
$ rpm -q MyPackage
MyPackage-1.0.0
The name of the package must be exactly correct. Wild cards are not allowed. However, if you cannot remember the full name of a package, you can use the grep tool to help find it. Use the -qa switch to query all installed packages and pipe the information through grep with the text you can remember. For example:
$ rpm -qa | grep IBM
IBMWSAppDev-Product-5.0-0
IBMWSSiteDevExp-Core-5.0-0
IBMWSSiteDev-Core-5.0-0
IBMWSTools-WAS-BASE-V5-5.0-0
IBMJava118-SDK-1.1.8-5.0
IBMWSWB-samples-5.0-0
IBMWSWB-5.0-0
IBMWSAppDev-Core-5.0-0
IBMWSAppDev-5.0-0
IBMWSTools-5.0-0
Besides version numbers, rpm -q can provide other useful information about a package. Here are some examples:
Getting information with an rpm query
Query switch | Information |
---|---|
rpm -q --changelog | Shows the development change history for the package |
rpm -qc | Shows the configuration files for the package |
rpm -qd | Shows the documentation files for the package |
rpm -qi | Shows the package description |
rpm -ql | Shows a list of the package's files |
rpm -qR | Shows the dependencies for the package |
The query also has another interesting command which is run on files rather than packages.
rpm -q --whatprovides <filename>
The above command will identify the package that is associated with the filename given. The filename must include the absolute path to the file, since that is how the information is stored in the rpm database.
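Because the absolute path is required, it helps to let the shell find it for you. The sketch below is one possible approach, assuming rpm is installed; it uses the related -qf switch, which queries the package that owns a file, and the names abs_path_of and owning_package are hypothetical.

```shell
#!/bin/sh
# abs_path_of CMD: print the absolute path of a command found in $PATH.
abs_path_of() {
    cmd_path=$(command -v "$1") || return 1
    case $cmd_path in
        /*) printf '%s\n' "$cmd_path" ;;
        *)  return 1 ;;          # builtins, functions and aliases have no file path
    esac
}

# owning_package CMD: ask the rpm database which package installed CMD.
owning_package() {
    file=$(abs_path_of "$1") || {
        echo "owning_package: $1 is not a file in PATH" >&2
        return 1
    }
    rpm -qf "$file"              # -f queries the package that owns this file
}
```

With this in place, owning_package ls asks the database about the full path of ls rather than the bare command name.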
RPM front ends
Working with rpm from the console is easy, but sometimes it is more convenient to work with a graphical interface. In typical Linux style, there are front-end programs which provide an interface to the rpm program. Each distribution has a front end, which will vary. Consult your distribution documentation for information about the package management tool provided.
Webmin software packages
Webmin also provides a simple Web-based front end for dealing with RPM packages.
Figure 1. Webmin RPM interface

Software can be easily installed, erased, and queried from here. Software can also be installed directly from a URL. If you have rpm enhancement tools installed, such as apt or the Red Hat Network, Webmin will pick them up and provide an interface to them.
Source code
Since Linux is an open source operating system, it comes with all of the development tools required to compile software. While most of the packages that you work with will be provided as binary RPMs, you are not limited to only those packages. If you wish, you can download the raw source code and custom-compile it for your system.
You should be cautious about compiling from source on a production system as it may cause problems or void your support for commercial software which you are using on the system, such as IBM DB2. However, being familiar with compiling from source will allow you to apply patches to software and work with packages ported from other environments. Once you have compiled the code successfully, it is even possible to create your own RPM!
Corewars source demonstration
To demonstrate how simple it can be to compile from source, we will compile a simulation game called Corewars (see Related topics for a link). Here's a note about Corewars from their Web site: "Corewars is a simulation game where a number of warriors try to crash each other while they are running in a virtual computer. The warriors can be written in one of two assembler-like languages called Corewars and Redcode. Corewars is the default language and is easier to learn and understand. Redcode provides more advanced and powerful instructions but also requires more time to learn."
The first step to compiling from source is to download the source code package from the Web site.
Once the code is downloaded, I expand the package.
tar -xvzf corewars-0.9.13.tar.gz
The file is expanded into my current directory. The standard approach is for the source code to be contained in a directory which matches the product name. In this case, it's in a directory called corewars-0.9.13.
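If you want to check where a tarball will put its files before you expand it, you can list its contents first. Here is a small sketch; tarball_topdirs is a made-up helper name, and it assumes a gzip-compressed tarball like the one above.

```shell
#!/bin/sh
# tarball_topdirs FILE: print the top-level entries of a .tar.gz file
# without extracting anything.
tarball_topdirs() {
    # -t lists the archive, -z handles gzip, -f names the file;
    # cut keeps the first path component and sort -u removes repeats.
    tar -tzf "$1" | cut -d/ -f1 | sort -u
}
```

A well-behaved package prints a single name, such as corewars-0.9.13. If you see many entries instead, the files would be scattered into your current directory, so it is safer to expand the tarball inside a new, empty directory.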
I enter into that directory and find the source code, some documentation, configure scripts and README files. Most source packages will come with a file called INSTALL and one called README. You should read these materials before you compile the software. They will usually save you a lot of pain by identifying problems before you have them and advising you of the correct procedures for compiling and installation. Most problems I have had compiling from source were simply because I didn't follow the directions.
The most common next step is to run the configure script. The configure script is generated by Autoconf, which is included with the development tools of your Linux distribution. Quoting Autoconf's package description: "GNU's Autoconf is a tool for configuring source code and Makefiles. Using Autoconf, programmers can create portable and configurable packages, since the person building the package is allowed to specify various configuration options."
The configure script runs a series of tests on the system to determine the best way to compile the package for your distribution and architecture. It then creates a custom Makefile for your system. If there are problems with compiling on your system, configure tells you. configure will usually let you customize the features to be included in the compile, or let you provide parameters about locations of libraries or other needed files so that the package can be compiled successfully. Here we execute configure with no additional parameters:
./configure
Several tests run on the system ultimately end with success. Now build the program using:
make
If the compile has errors, you will need to determine the problems and fix them. This can be non-trivial and may require a good deal of knowledge about your environment and programming in general. If all goes well, we typically install the software with:
make install
The files are copied into the correct areas of the system, file permissions are updated, configuration files are copied and documentation is added to the manual pages.
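The three steps can be chained so that each one runs only if the previous one succeeded. The sketch below is one way to do that; build_from_source is a hypothetical helper, and it passes the standard Autoconf --prefix option so you can install into a directory of your choosing rather than the package's default location.

```shell
#!/bin/sh
# build_from_source SRCDIR PREFIX: configure, compile and install a
# source package, stopping at the first step that fails.
build_from_source() {
    srcdir=$1
    prefix=$2
    if [ ! -x "$srcdir/configure" ]; then
        echo "build_from_source: no configure script in $srcdir" >&2
        return 1
    fi
    (
        # The subshell keeps the cd from changing the caller's directory.
        cd "$srcdir" &&
        ./configure --prefix="$prefix" &&  # test the system, generate a Makefile
        make &&                            # compile
        make install                       # copy the results under $prefix
    )
}
```

For example, build_from_source corewars-0.9.13 "$HOME/corewars" would keep the whole installation under your home directory, which makes it easy to remove later. The program then lands in the bin directory under that prefix, so it will only be on your executable path if you add it.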
Now let's test our handiwork by executing the program. It is a graphical program, so you will need to have X running when you start it. The make install which we did above should put the program in our executable path.
corewars
A graphical screen should appear to reward us.
Figure 2. Success!

The topic of corewars rules is outside of the scope of this article, but you will find documentation about this interesting simulation game in the man page (man corewars).
The corewars compile was a typical scenario. There are many possible variations including using switches on the configure script to adjust the features that are compiled into the program, using different commands from the Makefile to adjust how the compile is done, and others.
Since this program was not installed using rpm, there are no entries in the rpm database. If a program doesn't work out after it's been installed, most Makefiles include an uninstall target to remove the software:
make uninstall
Bear in mind that working with raw source code does not enter anything into the RPM database. Software installed in this way is unmanaged, so it should be done with care.
Source RPMs
When an RPM is created, there is an artifact called a Source RPM. This is a SPEC file combined with source code designed to build on one or more architectures. This is the best of both worlds! With a source RPM, you can custom compile the software on your system, but the finished product will be an installable RPM rather than the raw binaries. Most packages that are available as a pre-compiled RPM are also available as a SRPM. This can be a simple way to move software across platforms in Linux. When you have success recompiling onto a different platform, consider sharing your finished RPMs with the community.
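On current versions of RPM, rebuilding a source RPM is usually done with the rpmbuild command, which may come in a separate package from rpm itself. The sketch below assumes rpmbuild is installed; rebuild_srpm is a hypothetical wrapper that only adds a few sanity checks.

```shell
#!/bin/sh
# rebuild_srpm FILE: compile a source RPM into an installable binary RPM.
rebuild_srpm() {
    srpm=$1
    case $srpm in
        *.src.rpm) ;;
        *)  echo "rebuild_srpm: expected a .src.rpm file, got: $srpm" >&2
            return 2 ;;
    esac
    if [ ! -f "$srpm" ]; then
        echo "rebuild_srpm: no such file: $srpm" >&2
        return 1
    fi
    if ! command -v rpmbuild >/dev/null 2>&1; then
        echo "rebuild_srpm: rpmbuild command not found" >&2
        return 1
    fi
    rpmbuild --rebuild "$srpm"   # unpack the source, compile it, package the result
}
```

rpmbuild prints the location of the finished binary RPM when it is done, and you can then install that file with rpm -Uvh so that it is tracked in the RPM database. Older versions of RPM accepted the same --rebuild switch on the rpm command itself.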
May the source be with you
If you are new to Linux, installing software has a different approach than what you are used to. However, the RPM approach to installation is elegant and provides new power which you will soon grow to appreciate.
You should become familiar with the options for working with rpms from the console, but for daily use there are front-end interface options to make rpms easier to manage. One was provided by your distribution, and others are available, such as the one in Webmin.
You are not limited by pre-compiled rpms, though. You can take advantage of the open source nature of Linux to compile applications directly from the source code. Compiling is generally easy for a mature project. Remember that code installed from source code will not have an entry in your rpm database. When working with source, consider using source rpms, which combine the power of compiled source with the manageability of rpms.
Related topics
- Check out the other parts in the Windows-to-Linux roadmap series (developerWorks, November 2003).
- The IBM developerWorks tutorials LPI certification exam prep are very useful. Part 1 covers Red Hat and Debian package managers as well as compiling programs from sources and managing shared libraries. Part 2 covers important kernel configuration and in-kernel PCI and USB support.
- Learn how to create your own packages with the IBM developerWorks feature Packaging software with RPM.
- SourceForge is an excellent resource for open source projects; you can add your own projects as well.
- The Corewars project is just one of many posted at SourceForge.