One of the first things you notice when you install Linux is that there are so many packages available with your distribution. Most distributions come with the Linux operating system, installation tools, and administration tools. Then they include Internet tools, development tools, office tools, games, and some things that you haven't even heard of. It is not uncommon for a Linux distribution to come with thousands of available packages. If you didn't select "install everything," then some subset of these packages were installed.
Now you may be wondering "How do I remove packages I don't want? How do I install things I missed? Can I use software that didn't come with my distribution?"
As Linux installed, you probably noticed a lot of information about RPMs being installed. RPM stands for Red Hat Package Manager, a contribution by Red Hat that has become a standard for managing software on Red Hat and UnitedLinux as well as on many other distributions.
Essentially, an RPM is a package, containing software for Linux ready to install and run on a particular machine architecture. For example, we installed the webmin package from an RPM in "Part 3. Introduction to Webmin." All of the software initially loaded in your distribution was installed from an RPM.
An RPM is a package of files. It includes a .spec file, which provides information about the package, its function, and its dependencies (what packages must be in place before it can run). The .spec also contains a manifest of files in the package, where they must be loaded on the system, and what their initial permissions will be. The RPM also contains a pre-install script, which is written by the package developer. Then the RPM contains the compiled binary files. Finally, the RPM contains a post-install script.
| .spec | pre-install script | binary file | binary file | ... | binary file | post-install script |
When an RPM is installed, the system first looks to see if the dependencies for the package are satisfied. If not, then the installation terminates unless you specify options to force an install anyway.
If all is clear, the pre-install script runs. This script can do anything. Normally it creates users and directories. However, it can do many types of dynamic configuration, even custom-compile source code for the running system.
If the pre-install script completes successfully, then the binary files are copied onto the system according to the manifest. Once all of the files have been copied and their permissions are set, then the post-install script is run. Again, this script can do almost anything.
Once all of that is completed, the information about the package is added to the RPM database, and the installation is complete. With this simple system, it is possible to perform all of the functions that could be done with a more elaborate commercial installer.
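Because the pre-install and post-install scripts can do almost anything, it can be worth looking inside a package file before installing it. The following is a minimal sketch, not a definitive tool: inspect_rpm is a hypothetical helper name, and it assumes the rpm program is available. It uses the -p switch to query a package file rather than the database.

```shell
#!/bin/sh
# Sketch: preview what an RPM file would do before installing it.
# inspect_rpm is a hypothetical helper name, not part of rpm itself.
inspect_rpm() {
    pkg_file=$1
    if [ ! -f "$pkg_file" ]; then
        echo "inspect_rpm: no such file: $pkg_file" >&2
        return 1
    fi
    if ! command -v rpm >/dev/null 2>&1; then
        echo "inspect_rpm: rpm is not installed on this system" >&2
        return 1
    fi
    echo "== Dependencies =="
    rpm -qpR "$pkg_file"            # -p: query a package file, -R: list requirements
    echo "== Files (manifest) =="
    rpm -qpl "$pkg_file"            # -l: list the files the package would install
    echo "== Install and uninstall scripts =="
    rpm -qp --scripts "$pkg_file"   # print the pre- and post-install scripts
}

# Example (commented out; the file name is only a placeholder):
# inspect_rpm MyPackage-1.0.0.i386.rpm
```

Nothing is installed by these queries, so they are safe to run as an ordinary user.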
The piece of the RPM that adds elegance is the RPM database. This database typically lives in the /var/lib/rpm directory and holds information about every RPM installed on the system. The database knows the dependency relationships between packages and will warn if removing a package could cause other packages to break. The database knows about every file that was originally installed with a package and its original state on the system. It also knows the locations of the documentation and configuration files for each package. This may sound like a lot of information, and it is. But it isn't bloated and bulky. On a system containing 1,066 packages, comprised of 203,272 files, the database files are only 45 MB! RPM uses the database to check dependencies when packages are loaded and unloaded. Users can also query the database for information on packages.
The program to work with RPM packages is appropriately named rpm. rpm runs
in several different modes, but the most common tasks are install,
upgrade, query, verify, and erase.
When you install a package for the first time, you will use the -i
or install mode. Simply point rpm at a binary package file and run the command.
The package will be installed on your system. Installation normally takes
seconds. Often when installing a package, I will add the -v
(verbose) switch to provide more information about the process, and the -h (hash bar) switch to provide progress updates via hash (#) marks printed on the console as the package is installed. Here's an example of installing a package:
Listing 1. Installing MyPackage
$ rpm -ivh MyPackage-1.0.0.i386.rpm
Preparing...                ########################################### [100%]
   1:MyPackage              ########################################### [100%]
That's it! MyPackage is now installed and ready to use.
To remove an installed package, use the -e switch to erase it.
rpm will use the database to remove all of the files for the
package. If there are other packages installed that depend on the one you
are removing, rpm will abort. You will have to force the erase with the
--nodeps switch. (--nodeps can also be used to force an
installation.) Be very careful when using this switch to force an
install or erase. Removing packages that others are dependent on can have
unfortunate results. Here is the command to remove the package we
installed above:
$ rpm -e MyPackage
Notice that the full file name of the package is not needed to remove it. The full file name was required at installation because we were pointing to a file. Installed packages are referenced by their name only. The package's name is everything up to the version number.
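The naming rule can be sketched in a few lines of shell. This is only a heuristic under stated assumptions: rpm_name_from_file is a hypothetical helper, and it assumes the file name ends in an architecture and .rpm and that the version starts with a digit. A package whose name itself contains a hyphen followed by a digit would be cut too short.

```shell
#!/bin/sh
# Heuristic sketch: derive the package name from an RPM file name.
# rpm_name_from_file is a hypothetical helper, not an rpm feature.
rpm_name_from_file() {
    # 1. basename drops any directory part
    # 2. strip the trailing ".rpm"
    # 3. strip the architecture (the last dot-separated field)
    # 4. strip every trailing "-<digit>..." field (version and release)
    basename "$1" | sed -E 's/\.rpm$//; s/\.[^.]+$//; s/(-[0-9][^-]*)+$//'
}

rpm_name_from_file MyPackage-1.0.0.i386.rpm   # prints: MyPackage
```

When the file is at hand and rpm is installed, asking rpm itself is more reliable, for example with rpm -qp --queryformat '%{NAME}\n' on the package file.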
The verify switch (-V) is very useful. It compares the current state of a package's files to their original state upon installation. Differences are shown using a code:
S | File Size differs |
M | Mode differs (includes permissions and file type) |
5 | MD5 sum differs |
D | Device major/minor number mis-match |
L | readLink(2) path mis-match |
U | User ownership differs |
G | Group ownership differs |
T | mTime differs |
If you were to run rpm -V on a package and
discover that the size had changed for an executable, that would be a
possible sign of a security breach.
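The codes are compact, so a small decoder can help when reading verify output. The sketch below is an illustration only: decode_verify_flags is a hypothetical helper, and it covers just the eight codes listed above. Dots (test passed) and any other characters are skipped, since some rpm versions may print additional columns.

```shell
#!/bin/sh
# Sketch: translate an rpm -V flag string into readable text.
# decode_verify_flags is a hypothetical helper name.
decode_verify_flags() {
    dvf_flags=$1
    while [ -n "$dvf_flags" ]; do
        dvf_rest=${dvf_flags#?}              # everything after the first character
        dvf_char=${dvf_flags%"$dvf_rest"}    # the first character itself
        case $dvf_char in
            S) echo "S: file size differs" ;;
            M) echo "M: mode differs (permissions or file type)" ;;
            5) echo "5: MD5 sum differs" ;;
            D) echo "D: device major/minor number mismatch" ;;
            L) echo "L: readlink path mismatch" ;;
            U) echo "U: user ownership differs" ;;
            G) echo "G: group ownership differs" ;;
            T) echo "T: mTime differs" ;;
        esac
        dvf_flags=$dvf_rest
    done
}

decode_verify_flags "S.5....T"
# prints:
# S: file size differs
# 5: MD5 sum differs
# T: mTime differs
```

The flag string is the first column of each line that rpm -V prints for a changed file.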
Once a package has been installed, any attempt to install a package with
the same name will result in a message that the package is already
installed. If you want to update a package to a later version, use the
-U switch to upgrade. Upgrade has another effect. When upgrade is
run on multiple package names, it will try to put the packages in order of
dependencies. In other words, required packages will be installed first.
The upgrade switch can be used whether or not a package is already
installed, so many people use it for installs as well as upgrades instead of using the -i switch. Here is an example of using the upgrade switch to load several
rpm packages:
Listing 2. Interactive upgrade
$ rpm -Uvh My*.rpm
Preparing...                ########################################### [100%]
   1:bMyPackageDep          ########################################### [ 50%]
   1:aMyPackageNew          ########################################### [100%]
In the case above, bMyPackageDep was a prerequisite for aMyPackageNew, so
even though the file names sorted in reverse order, rpm ordered them
correctly.
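Before upgrading several packages at once, it can be reassuring to rehearse the operation. The sketch below assumes rpm is installed; upgrade_dry_run is a hypothetical wrapper name. It relies on rpm's --test option, which checks dependencies and conflicts and reports problems without installing anything.

```shell
#!/bin/sh
# Sketch: rehearse an upgrade without changing the system.
# upgrade_dry_run is a hypothetical wrapper name.
upgrade_dry_run() {
    if [ "$#" -eq 0 ]; then
        echo "usage: upgrade_dry_run PACKAGE_FILE..." >&2
        return 1
    fi
    if ! command -v rpm >/dev/null 2>&1; then
        echo "upgrade_dry_run: rpm is not installed on this system" >&2
        return 1
    fi
    # --test: perform the dependency and conflict checks only
    rpm -Uvh --test "$@"
}

# Example (commented out; the file names are placeholders):
# upgrade_dry_run My*.rpm
```

If the rehearsal reports no problems, the same command without --test performs the real upgrade.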
Several pieces of useful information can be queried from the rpm
database. Queries can be run by any user who has read access to the rpm
database. By default, all users have read access. To run a query, use the
-q switch with the name of the package to query. This will return
the version of the package.
$ rpm -q MyPackage
MyPackage-1.0.0
The name of the package must be exactly correct. Wild cards are not
allowed. However, if you cannot remember the full name of a package, you
can use the grep tool to help find it. Use the
-qa switch to query all installed packages and pipe the information
through grep with the text you can remember. For example:
$ rpm -qa | grep IBM
IBMWSAppDev-Product-5.0-0
IBMWSSiteDevExp-Core-5.0-0
IBMWSSiteDev-Core-5.0-0
IBMWSTools-WAS-BASE-V5-5.0-0
IBMJava118-SDK-1.1.8-5.0
IBMWSWB-samples-5.0-0
IBMWSWB-5.0-0
IBMWSAppDev-Core-5.0-0
IBMWSAppDev-5.0-0
IBMWSTools-5.0-0
Besides version numbers, rpm -q can provide
other useful information about a package. Here are some examples:
Getting information with an rpm query
rpm -q --changelog | Shows the development change history for the package |
rpm -qc | Shows the configuration files for the package |
rpm -qd | Shows the documentation files for the package |
rpm -qi | Shows the package description |
rpm -ql | Shows a list of the package's files |
rpm -qR | Shows the dependencies for the package |
The query also has another interesting command which is run on files rather than packages.
rpm -q --whatprovides <filename>
The above command will identify the package that is associated with the filename given. The filename must include the absolute path to the file, since that is how the information is stored in the rpm database.
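Since the query needs an absolute path, a small wrapper can look the path up first. This is a sketch under assumptions: owner_of_command is a hypothetical helper, and the rpm step only runs when rpm is present. It resolves a command name with the shell's command -v and then passes the resulting path to the query.

```shell
#!/bin/sh
# Sketch: find which package owns a command found on the PATH.
# owner_of_command is a hypothetical helper name.
owner_of_command() {
    ooc_path=$(command -v "$1") || {
        echo "owner_of_command: not found: $1" >&2
        return 1
    }
    case $ooc_path in
        /*) ;;   # an absolute path, which is what the rpm database stores
        *)  echo "owner_of_command: $1 is a builtin, function, or alias" >&2
            return 1 ;;
    esac
    echo "$ooc_path"
    if command -v rpm >/dev/null 2>&1; then
        # rpm -qf is a shorter way to ask which package owns a file
        rpm -q --whatprovides "$ooc_path"
    fi
}

# Example (commented out; the result depends on your system):
# owner_of_command tar
```

The first line of output is the resolved path; on an RPM-based system the second line names the owning package.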
Working with rpm from the console is easy, but
sometimes it is more convenient to work with a graphical interface. In
typical Linux style, there are front-end programs which provide an
interface to the rpm program. Each distribution has a front end, which
will vary. Consult your distribution documentation for information about
the package management tool provided.
Webmin also provides a simple Web-based front end for dealing with RPM packages.
Figure 1. Webmin RPM interface
Software can be easily installed, erased, and queried from here. Software
can also be installed directly from a URL. If you have rpm enhancement
tools installed, such as apt or the Red Hat
Network, Webmin will pick them up and provide an interface to them.
Since Linux is an open source operating system, it comes with all of the development tools required to compile software. While most of the packages that you work with will be provided as binary RPMs, you are not limited to only those packages. If you wish, you can download the raw source code and custom-compile it for your system.
You should be cautious about compiling from source on a production system as it may cause problems or void your support for commercial software which you are using on the system, such as IBM DB2. However, being familiar with compiling from source will allow you to apply patches to software and work with packages ported from other environments. Once you have compiled the code successfully, it is even possible to create your own RPM!
To demonstrate how simple it can be to compile from source, we will compile a simulation game called Corewars (see Resources for a link). Here's a note about Corewars from their Web site: "Corewars is a simulation game where a number of warriors try to crash each other while they are running in a virtual computer. The warriors can be written in one of two assembler-like languages called Corewars and Redcode. Corewars is the default language and is easier to learn and understand. Redcode provides more advanced and powerful instructions but also requires more time to learn."
The first step to compiling from source is to download the source code package from the Web site.
Once the code is downloaded, I expand the package.
tar -xvzf corewars-0.9.13.tar.gz
The file is expanded into my current directory. The standard approach is for the source code to be contained in a directory which matches the product name. In this case, it's in a directory called corewars-0.9.13.
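Not every archive follows that convention, so it can be useful to check where the files will land before expanding. The following sketch assumes a gzip-compressed tar archive; tarball_top_dirs is a hypothetical helper name. It lists the archive with tar -t instead of extracting it and reports the distinct top-level entries.

```shell
#!/bin/sh
# Sketch: show the top-level entries of a .tar.gz without extracting it.
# tarball_top_dirs is a hypothetical helper name.
tarball_top_dirs() {
    # -t lists the contents, -z handles gzip, -f names the archive;
    # cut keeps the first path component, sort -u removes repeats
    tar -tzf "$1" | cut -d/ -f1 | sort -u
}

# Example (commented out; assumes the archive is in the current directory):
# tarball_top_dirs corewars-0.9.13.tar.gz
```

A single line of output means everything unpacks into one directory; many lines mean the archive would scatter files into the current directory.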
I enter into that directory and find the source code, some documentation, configure scripts and README files. Most source packages will come with a file called INSTALL and one called README. You should read these materials before you compile the software. They will usually save you a lot of pain by identifying problems before you have them and advising you of the correct procedures for compiling and installation. Most problems I have had compiling from source were simply because I didn't follow the directions.
The most common next step is to run the configure script.
configure scripts are generated with Autoconf, which is included with the
development tools of your Linux distribution. Quoting Autoconf's package
description: "GNU's Autoconf is a tool for configuring source code and Makefiles.
Using Autoconf, programmers can create portable and configurable packages,
since the person building the package is allowed to specify various
configuration options."
The configure script runs a series of tests on the system to
determine the best way to compile the package for your distribution and
architecture. It then creates a custom Makefile for your
system. If there are problems with compiling on your system, configure
tells you. configure will usually let you customize the features to be
included in the compile, or let you provide parameters about locations of
libraries or other needed files so that the package can be compiled
successfully. Here we execute configure
with no additional parameters:
./configure
Several tests run on the system and ultimately end with success. Now build the program using:
make
If the compile has errors, I will need to determine the problems and fix them. This can be non-trivial and may require a good deal of knowledge about your environment and programming in general. If all goes well, we typically install the software with:
make install
The files are copied into the correct areas of the system, file permissions are updated, configuration files are copied and documentation is added to the manual pages.
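The three steps can be chained so that each one runs only if the previous one succeeded. This is a minimal sketch, not the project's own procedure: build_from_source is a hypothetical helper, and the optional second argument assumes the configure script accepts the --prefix option, as Autoconf-generated scripts normally do.

```shell
#!/bin/sh
# Sketch: configure, compile, and install a source tree, stopping at the
# first failure. build_from_source is a hypothetical helper name.
build_from_source() {
    bfs_src=$1
    bfs_prefix=${2:-}    # optional install location; empty means the default
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "$bfs_src" 2>/dev/null || {
            echo "build_from_source: cannot enter $bfs_src" >&2
            exit 1
        }
        if [ ! -x ./configure ]; then
            echo "build_from_source: no configure script; check INSTALL and README" >&2
            exit 1
        fi
        # Step 1: test the system and generate a Makefile
        if [ -n "$bfs_prefix" ]; then
            ./configure --prefix="$bfs_prefix" || exit 1
        else
            ./configure || exit 1
        fi
        # Step 2: compile
        make || exit 1
        # Step 3: copy the results into place
        make install
    )
}

# Example (commented out; the directory comes from the tar step above):
# build_from_source corewars-0.9.13
```

Installing into the default system locations usually requires root privileges. Passing a prefix inside your home directory avoids that, but then the program is only found if that prefix's bin directory is on your PATH.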
Now let's test our handiwork by executing the program. It is a graphical
program, so you will need to have X running when you start it. The make install which we did above should put the
program in our executable path.
corewars
A graphical screen should appear to reward us.
Figure 2. Success!
The topic of corewars rules is outside of the scope of this article, but
you will find documentation about this interesting simulation game in the
man page (man corewars).
The corewars compile was a typical scenario. There are many
possible variations including using switches on the configure script to
adjust the features that are compiled into the program, using different
commands from the Makefile to adjust how the compile is done, and others.
Since this program was not installed using rpm, there are no entries in the rpm database. If a program doesn't work out after it's been installed, many Makefiles include an uninstall target to remove the software:
make uninstall
Bear in mind that working with raw source code does not enter anything into the RPM database. Software installed in this way is unmanaged, so it should be done with care.
When an RPM is created, there is an artifact called a Source RPM. This is a SPEC file combined with source code designed to build on one or more architectures. This is the best of both worlds! With a source RPM, you can custom compile the software on your system, but the finished product will be an installable RPM rather than the raw binaries. Most packages that are available as a pre-compiled RPM are also available as a SRPM. This can be a simple way to move software across platforms in Linux. When you have success recompiling onto a different platform, consider sharing your finished RPMs with the community.
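Rebuilding a source RPM can be sketched as follows. The assumptions are that the rpmbuild program is installed and that you have a source package file at hand; rebuild_srpm is a hypothetical wrapper name. Very old rpm versions offered the same function as rpm --rebuild.

```shell
#!/bin/sh
# Sketch: compile a source RPM into an installable binary RPM.
# rebuild_srpm is a hypothetical wrapper name.
rebuild_srpm() {
    case $1 in
        *.src.rpm) ;;
        *)  echo "rebuild_srpm: expected a .src.rpm file, got: $1" >&2
            return 1 ;;
    esac
    if [ ! -f "$1" ]; then
        echo "rebuild_srpm: no such file: $1" >&2
        return 1
    fi
    if ! command -v rpmbuild >/dev/null 2>&1; then
        echo "rebuild_srpm: rpmbuild is not installed on this system" >&2
        return 1
    fi
    # --rebuild unpacks the sources, compiles them, and writes binary RPMs
    rpmbuild --rebuild "$1"
}

# Example (commented out; the file name is a placeholder):
# rebuild_srpm MyPackage-1.0.0.src.rpm
```

The finished packages are typically written to an RPMS directory in the build tree, and rpmbuild prints their paths when it finishes. They can then be installed with rpm -ivh like any other binary package, which means they are tracked in the RPM database.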
If you are new to Linux, installing software has a different approach than what you are used to. However, the RPM approach to installation is elegant and provides new power which you will soon grow to appreciate.
You should become familiar with the options for working with rpms from the console, but for daily use there are front-end interface options to make rpms easier to manage. One was provided by your distribution, and others are available, such as the one in Webmin.
You are not limited by pre-compiled rpms, though. You can take advantage of the open source nature of Linux to compile applications directly from the source code. Compiling is generally easy for a mature project. Remember that code installed from source code will not have an entry in your rpm database. When working with source, consider using source rpms, which combine the power of compiled source with the manageability of rpms.
- Check out the other parts in the Windows-to-Linux roadmap series (developerWorks, November 2003).
- The IBM developerWorks tutorials LPI certification exam prep are very useful. Part 1 covers Red Hat and Debian package managers as well as compiling programs from sources and managing shared libraries. Part 2 covers important kernel configuration and in-kernel PCI and USB support.
- The IBM developerWorks tutorial Compiling and installing software from sources goes into more depth on unpacking, inspecting, configuring and installing software packages from source code.
- Learn more about compiling and installing source code on Linux with the IBM developerWorks feature, Compiling and installing software from sources.
- Learn how to create your own packages with the IBM developerWorks features Create Debian Linux packages and Packaging software with RPM.
- Learn about an alternative to RPM and Debian's package management solutions in the IBM developerWorks feature Manage packages using Stow.
- Regularly upgrading software is an important part of security. Learn how to keep your system up to date quickly and easily with the IBM developerWorks Tip on Upgrading applications from sources.
- Learn more about United Linux in this IBM developerWorks feature.
- SourceForge is an excellent resource for open source projects; you can add your own projects as well.
- The Corewars project is just one of many posted at SourceForge.
- The developerWorks Open source zone hosts open-source projects at IBM.
- IBM's alphaWorks offers early access to emerging IBM technologies including things like the Web services and Grid toolboxes.
- The IBM Speed-start your Linux app 2003 initiative offers a free Linux Software Evaluation Kit (SEK) which includes DB2 Universal Database, WebSphere Application Server, WebSphere Studio Site Developer, WebSphere MQ, Lotus Domino, Tivoli Access Manager -- and more!
Chris Walden is an e-business Architect for IBM Developer Relations Technical Consulting in Austin, Texas, providing education, enablement, and consulting to IBM Business Partners. He is the official Linux fanatic on his hallway and does his best to spread the good news to all who will hear it. In addition to his architect duties, he manages the area's all-Linux infrastructure servers, which include file, print, and other application services in a mixed-platform user environment. Chris has ten years of experience in the computer industry ranging from field support to Web application development and consulting.




