Windows-to-Linux roadmap: Part 9. Installing software

Using pre-compiled RPMs and compiling applications from source

IBM e-business architect Chris Walden is your guide through a nine-part developerWorks series on moving your operational skills from a Windows to a Linux environment. He covers everything from logging to networking, and from the command-line to help systems -- even compiling packages from available source code. In this final part, we download and compile a software package, discuss the pros and cons of automated package management, and get to know the RPM system.

Chris Walden, e-business Architect, IBM

Chris Walden is an e-business Architect for IBM Developer Relations Technical Consulting in Austin, Texas, providing education, enablement, and consulting to IBM Business Partners. He is the official Linux fanatic on his hallway and does his best to spread the good news to all who will hear it. In addition to his architect duties, he manages the area's all-Linux infrastructure servers, which include file, print, and other application services in a mixed-platform user environment. Chris has ten years of experience in the computer industry ranging from field support to Web application development and consulting.



11 November 2003


One of the first things you notice when you install Linux is that there are so many packages available with your distribution. Most distributions come with the Linux operating system, installation tools, and administration tools. Then they include Internet tools, development tools, office tools, games, and some things that you haven't even heard of. It is not uncommon for a Linux distribution to come with thousands of available packages. If you didn't select "install everything," then only a subset of these packages was installed.

Now you may be wondering "How do I remove packages I don't want? How do I install things I missed? Can I use software that didn't come with my distribution?"

RPMs

As Linux installed, you probably noticed a lot of information about RPMs being installed. RPM stands for Red Hat Package Manager, a contribution by Red Hat that has become a standard for managing software on Red Hat and UnitedLinux as well as on many other distributions.

Essentially, an RPM is a package, containing software for Linux ready to install and run on a particular machine architecture. For example, we installed the webmin package from an RPM in "Part 3. Introduction to Webmin." All of the software initially loaded in your distribution was installed from an RPM.

Anatomy of an RPM

An RPM is a package of files. It includes a .spec file, which provides information about the package, its function, and its dependencies (what packages must be in place before it can run). The .spec also contains a manifest of files in the package, where they must be loaded on the system, and what their initial permissions will be. The RPM also contains a pre-install script, which is written by the package developer. Then the RPM contains the compiled binary files. Finally, the RPM contains a post-install script.

RPM layout

  • .spec
  • pre-install script
  • binary file
  • binary file
  • ...
  • binary file
  • post-install script

When an RPM is installed, the system first looks to see if the dependencies for the package are satisfied. If not, then the installation terminates unless you specify options to force an install anyway.

If all is clear, the pre-install script runs. This script can do anything. Normally it creates users and directories. However, it can do many types of dynamic configuration, even custom-compile source code for the running system.

Know where your RPMs have been

When RPMs install, they copy files onto your system and execute scripts. Since RPM is run as root, all of these functions are performed as root. It is therefore important that you know the origin of an RPM before you install it on your system. Just as with Windows software, malicious code can be contained inside an RPM as easily as any other package. RPMs from the manufacturer are generally safe, but be cautious about randomly downloading and installing things from unknown sources.

If the pre-install script completes successfully, then the binary files are copied onto the system according to the manifest. Once all of the files have been copied and their permissions are set, then the post-install script is run. Again, this script can do almost anything.

Once all of that is completed, the information about the package is added to the RPM database, and the installation is complete. With this simple system, it is possible to perform all of the functions that could be done with a more elaborate commercial installer.
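Because the scripts in an RPM run as root, it can be worth looking inside a package file before installing it. The following is a minimal sketch rather than a complete vetting procedure. The function name inspect_rpm is hypothetical, and MyPackage-1.0.0.i386.rpm is only a placeholder file name. The sketch relies on rpm's package-file query mode (-p) and its signature check (-K).

```shell
# inspect_rpm: show what a package file would do before it is installed.
# The function name is illustrative; the rpm switches are standard ones.
inspect_rpm() {
    pkg_file=$1

    if [ ! -f "$pkg_file" ]; then
        echo "inspect_rpm: no such file: $pkg_file" >&2
        return 1
    fi
    if ! command -v rpm >/dev/null 2>&1; then
        echo "inspect_rpm: rpm is not installed on this system" >&2
        return 2
    fi

    echo "== Signature and digest check =="
    rpm -K "$pkg_file"

    echo "== Dependencies (what must already be installed) =="
    rpm -qpR "$pkg_file"

    echo "== Pre-install and post-install scripts =="
    rpm -qp --scripts "$pkg_file"

    echo "== Files the package will copy onto the system =="
    rpm -qpl "$pkg_file"
}

# Usage (placeholder file name):
#   inspect_rpm MyPackage-1.0.0.i386.rpm
```

None of these queries installs anything, so they can be run as an ordinary user.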

The RPM database

The piece of the RPM system that adds elegance is the RPM database. This database typically lives in the /var/lib/rpm directory and holds information about every RPM installed on the system. The database knows the dependency relationships between packages and will warn if removing a package could cause other packages to break. The database knows about every file that was originally installed with a package and its original state on the system. It also knows the locations of the documentation and configuration files for each package. This may sound like a lot of information, and it is. But it isn't bloated and bulky. On a system containing 1,066 packages, comprising 203,272 files, the database files are only 45 MB! RPM uses the database to check dependencies when packages are installed and removed. Users can also query the database for information on packages.


Using RPM

The program to work with RPM packages is appropriately named rpm. rpm runs in several different modes, but the most common tasks are install, upgrade, query, verify, and erase.

rpm -i (install)

When you install a package for the first time, you will use the -i or install mode. Simply point rpm at a binary package file and run it. The package will be installed on your system. Installation normally takes seconds. Often when installing a package, I will add the -v (verbose) switch to provide more information about the process, and the -h (hash bar) switch to provide progress updates via hash (#) marks printed on the console as the package is installed. Here's an example of installing a package:

Listing 1. Installing MyPackage
$ rpm -ivh MyPackage-1.0.0.i386.rpm
Preparing...                ########################################### [100%]
   1:MyPackage              ########################################### [100%]

That's it! MyPackage is now installed and ready to use.

rpm must be run as root

rpm installs and erases must be done as root, because they require write access to the file system and the rpm database.

rpm -e (erase)

To remove an installed package, use the -e switch to erase it. rpm will use the database to remove all of the files for the package. If there are other packages installed that depend on the one you are removing, rpm will abort. You will have to force the erase with the --nodeps switch. (--nodeps can also be used to force an installation.) Be very careful when using this switch to force an install or erase. Removing packages that others are dependent on can have unfortunate results. Here is the command to remove the package we installed above:

$ rpm -e MyPackage

Notice that the full file name of the package is not necessary to remove it. The full name was required at installation because we were pointing to a file name. Installed packages are referenced by their name only. The package's name is everything up to the version number.
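The naming rule above can be sketched as a small shell helper. This is a heuristic only, assuming the version is the first part of the file name that begins with a hyphen followed by a digit; the function name package_name_from_file is hypothetical. A package whose own name contains a hyphen followed by a digit would be cut too early.

```shell
# package_name_from_file: guess the package name from an RPM file name by
# cutting everything from the first "-<digit>" onward (a heuristic only).
package_name_from_file() {
    file_name=${1##*/}    # drop any leading directory
    printf '%s\n' "$file_name" | sed 's/-[0-9].*$//'
}

# Usage (placeholder file name):
#   package_name_from_file MyPackage-1.0.0.i386.rpm
```

When the package file is at hand, asking rpm itself is more reliable than guessing: rpm -qp --queryformat '%{NAME}\n' followed by the file name prints the name recorded inside the package.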

rpm -V (verify)

The verify switch is very useful. It compares the current state of a package's files to their original state upon installation. Differences are shown using a code:

Results of verifying files

Code  Meaning
S     File size differs
M     Mode differs (includes permissions and file type)
5     MD5 sum differs
D     Device major/minor number mis-match
L     readLink(2) path mis-match
U     User ownership differs
G     Group ownership differs
T     mTime differs

If you were to run rpm -V on a package and discover that the size had changed for an executable, that would be a possible sign of a security breach.
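For each changed file, rpm -V prints a short string of these codes, with a dot in every position where the test passed. As a rough illustration, the sketch below turns such a string into the meanings listed in the table above. The function name explain_verify_flags and the sample string S.5....T are assumptions made for the example, not output captured from a real system.

```shell
# explain_verify_flags: print the meaning of each code in an rpm -V flag
# string, one per line. Dots (tests that passed) are skipped silently.
explain_verify_flags() {
    flags=$1
    while [ -n "$flags" ]; do
        rest=${flags#?}          # everything after the first character
        code=${flags%"$rest"}    # the first character itself
        case $code in
            S) echo "S: file size differs" ;;
            M) echo "M: mode differs (permissions or file type)" ;;
            5) echo "5: MD5 sum differs" ;;
            D) echo "D: device major/minor number mis-match" ;;
            L) echo "L: readLink(2) path mis-match" ;;
            U) echo "U: user ownership differs" ;;
            G) echo "G: group ownership differs" ;;
            T) echo "T: mTime differs" ;;
        esac
        flags=$rest
    done
}

# Usage (sample flag string):
#   explain_verify_flags 'S.5....T'
```

A configuration file that you edited by hand will normally show size, checksum, and time differences, so not every reported change is a cause for alarm.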

rpm -U (upgrade)

Once a package has been installed, any attempt to install a package with the same name will result in a message that the package is already installed. If you want to update a package to a later version, use the -U switch to upgrade. Upgrade has another effect. When upgrade is run on multiple package names, it will try to put the packages in order of dependencies. In other words, required packages will be installed first. The upgrade switch can be used whether or not a package is already installed, so many people use it for installs as well as upgrades instead of using the -i switch. Here is an example of using the upgrade switch to load several rpm packages:

Listing 2. Interactive upgrade
$ rpm -Uvh My*.rpm
Preparing...                ########################################### [100%]
   1:bMyPackageDep          ########################################### [ 50%]
   2:aMyPackageNew          ########################################### [100%]

In the case above, bMyPackageDep was a prerequisite for aMyPackageNew, so even though the file names sorted in reverse order, rpm ordered them correctly.

rpm -q (query)

Several pieces of useful information can be queried from the rpm database. Queries can be run by any user who has read access to the rpm database. By default, all users have read access. To run a query, use the -q switch with the name of the package to query. This will return the version of the package.

$ rpm -q MyPackage
MyPackage-1.0.0

The name of the package must be exactly correct. Wild cards are not allowed. However, if you cannot remember the full name of a package, you can use the grep tool to help find it. Use the -qa switch to query all installed packages and pipe the information through grep with the text you can remember. For example:

$ rpm -qa | grep IBM
IBMWSAppDev-Product-5.0-0
IBMWSSiteDevExp-Core-5.0-0
IBMWSSiteDev-Core-5.0-0
IBMWSTools-WAS-BASE-V5-5.0-0
IBMJava118-SDK-1.1.8-5.0
IBMWSWB-samples-5.0-0
IBMWSWB-5.0-0
IBMWSAppDev-Core-5.0-0
IBMWSAppDev-5.0-0
IBMWSTools-5.0-0

The joy of grep

grep is a text search tool that has a wide variety of uses. By default, grep will search files to show you lines that contain the text you indicate. In our example, we searched for "IBM." grep is a powerful tool in your scripting and console work.
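The rpm -qa | grep search shown here is case-sensitive, so it only helps if you remember the capitalization. A small variation, sketched below, ignores case and sorts the matches. The function name filter_packages is hypothetical; it reads the package list from standard input so that it can sit at the end of the same pipeline.

```shell
# filter_packages: case-insensitive search of a package list read from
# standard input, with the matches sorted for easier reading.
filter_packages() {
    grep -i -- "$1" | sort
}

# Usage:
#   rpm -qa | filter_packages ibm
```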

Besides version numbers, rpm -q can provide other useful information about a package. Here are some examples:

Getting information with an rpm query

Query switch         Information
rpm -q --changelog   Shows the development change history for the package
rpm -qc              Shows the configuration files for the package
rpm -qd              Shows the documentation files for the package
rpm -qi              Shows the package description
rpm -ql              Shows a list of the package's files
rpm -qR              Shows the dependencies for the package

The query also has another interesting command which is run on files rather than packages.

rpm -q --whatprovides <filename>

The above command will identify the package that is associated with the filename given. The filename must include the absolute path to the file, since that is how the information is stored in the rpm database.
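Since the query needs an absolute path, it is convenient to let the shell look the path up first. The sketch below does that for a command name and then asks the database which package owns the file, using rpm's -qf (query file) switch, which serves the same purpose. Both function names are hypothetical.

```shell
# resolve_command: print the absolute path of a command found on the PATH.
resolve_command() {
    cmd_path=$(command -v "$1") || return 1
    case $cmd_path in
        /*) printf '%s\n' "$cmd_path" ;;
        *)  return 1 ;;    # shell builtin, alias, or function: no file to query
    esac
}

# package_owning_command: ask the rpm database which package owns a command.
package_owning_command() {
    cmd_path=$(resolve_command "$1") || return 1
    rpm -qf "$cmd_path"
}

# Usage:
#   package_owning_command ls
```

If the file was not installed from an RPM, for example because it was compiled from source, rpm reports that no package owns it.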


RPM front ends

Working with rpm from the console is easy, but sometimes it is more convenient to work with a graphical interface. In typical Linux style, there are front-end programs which provide an interface to the rpm program. Each distribution has a front end, which will vary. Consult your distribution documentation for information about the package management tool provided.


Webmin software packages

Webmin also provides a simple Web-based front end for dealing with RPM packages.

Figure 1. Webmin RPM interface

Software can be easily installed, erased, and queried from here. Software can also be installed directly from a URL. If you have rpm enhancement tools installed, such as apt or the Red Hat Network, Webmin will pick them up and provide an interface to them.


Source code

Since Linux is an open source operating system, it comes with all of the development tools required to compile software. While most of the packages that you work with will be provided as binary RPMs, you are not limited to only those packages. If you wish, you can download the raw source code and custom-compile it for your system.

You should be cautious about compiling from source on a production system as it may cause problems or void your support for commercial software which you are using on the system, such as IBM DB2. However, being familiar with compiling from source will allow you to apply patches to software and work with packages ported from other environments. Once you have compiled the code successfully, it is even possible to create your own RPM!


Corewars source demonstration

To demonstrate how simple it can be to compile from source, we will compile a simulation game called Corewars (see Resources for a link). Here's a note about Corewars from their Web site: "Corewars is a simulation game where a number of warriors try to crash each other while they are running in a virtual computer. The warriors can be written in one of two assembler-like languages called Corewars and Redcode. Corewars is the default language and is easier to learn and understand. Redcode provides more advanced and powerful instructions but also requires more time to learn."

The first step to compiling from source is to download the source code package from the Web site.

Once the code is downloaded, I expand the package.

tar -xvzf corewars-0.9.13.tar.gz

The file is expanded into my current directory. The standard approach is for the source code to be contained in a directory which matches the product name. In this case, it's in a directory called corewars-0.9.13.
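Not every archive follows that convention, and one that does not will scatter its files across your current directory. As a precaution, you can list the archive before expanding it. The sketch below prints the top-level names inside a gzip-compressed tar file; the function name archive_top_level is hypothetical.

```shell
# archive_top_level: list the distinct top-level names inside a .tar.gz
# archive without extracting anything.
archive_top_level() {
    tar -tzf "$1" | cut -d/ -f1 | sort -u
}

# Usage:
#   archive_top_level corewars-0.9.13.tar.gz
```

A single line of output means everything will land in one directory. Several lines suggest creating a new directory and expanding the archive inside it.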

I enter into that directory and find the source code, some documentation, configure scripts and README files. Most source packages will come with a file called INSTALL and one called README. You should read these materials before you compile the software. They will usually save you a lot of pain by identifying problems before you have them and advising you of the correct procedures for compiling and installation. Most problems I have had compiling from source were simply because I didn't follow the directions.

The most common next step is to run the configure script. This script is typically generated by the package's developers with GNU Autoconf and shipped with the source code, so you do not need Autoconf itself to run it. Quoting Autoconf's package description: "GNU's Autoconf is a tool for configuring source code and Makefiles. Using Autoconf, programmers can create portable and configurable packages, since the person building the package is allowed to specify various configuration options."

The configure script runs a series of tests on the system to determine the best way to compile the package for your distribution and architecture. It then creates a custom Makefile for your system. If there are problems with compiling on your system, configure tells you. configure will usually let you customize the features to be included in the compile, or let you provide parameters about locations of libraries or other needed files so that the package can be compiled successfully. Here we execute configure with no additional parameters:

./configure

Several tests run on the system and, in this case, end with success. Now build the program using:

make

If the compile has errors, you will need to determine the problems and fix them. This can be non-trivial and may require a good deal of knowledge about your environment and programming in general. If all goes well, we typically install the software (as root, when installing into system directories) with:

make install

The files are copied into the correct areas of the system, file permissions are updated, configuration files are copied and documentation is added to the manual pages.
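The three steps above can be collected into one small function. This is a sketch under assumptions, not the procedure from the Corewars documentation: the function name build_from_source and the example prefix are hypothetical. It uses --prefix, a standard option of Autoconf-generated configure scripts, to choose where make install places the files. Pointing the prefix at a directory you own lets you try a package without root and keeps it apart from RPM-managed files.

```shell
# build_from_source: configure, compile, and install a source tree under a
# chosen prefix. Stops at the first step that fails.
build_from_source() {
    src_dir=$1
    prefix=$2
    (
        # Work in a subshell so the caller's current directory is unchanged.
        cd "$src_dir" || exit 1
        ./configure --prefix="$prefix" &&
        make &&
        make install
    )
}

# Usage (the prefix is only an example location):
#   build_from_source corewars-0.9.13 "$HOME/opt/corewars"
```

With a prefix like this, the program usually ends up in the bin directory under the prefix, which you would then add to your PATH. Running ./configure --help lists the other options a particular package supports.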

Now let's test our handiwork by executing the program. It is a graphical program, so you will need to have X running when you start it. The make install which we did above should put the program in our executable path.

corewars

A graphical screen should appear to reward us.

Figure 2. Success! The Corewars game

The topic of corewars rules is outside of the scope of this article, but you will find documentation about this interesting simulation game in the man page (man corewars).

The corewars compile was a typical scenario. There are many possible variations including using switches on the configure script to adjust the features that are compiled into the program, using different commands from the Makefile to adjust how the compile is done, and others.

Since this program was not installed using rpm, there are no entries in the rpm database. If a program doesn't work out after it's been installed, many Makefiles include an uninstall target to remove the software:

make uninstall

Bear in mind that working with raw source code does not enter anything into the RPM database. Software installed in this way is unmanaged, so it should be done with care.

Source RPMs

When an RPM is built, a companion artifact called a source RPM (SRPM) is also produced. This is a SPEC file combined with source code designed to build on one or more architectures. This is the best of both worlds! With a source RPM, you can custom compile the software on your system, but the finished product will be an installable RPM rather than the raw binaries. Most packages that are available as a pre-compiled RPM are also available as an SRPM. This can be a simple way to move software across platforms in Linux. When you have success recompiling onto a different platform, consider sharing your finished RPMs with the community.
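As a rough sketch of that workflow, the function below rebuilds a binary RPM from a source RPM with rpmbuild --rebuild. It assumes a system where the build commands are provided by the separate rpmbuild program, which may need to be installed from its own package. The function name rebuild_srpm and the file name MyPackage-1.0.0.src.rpm are hypothetical.

```shell
# rebuild_srpm: compile a source RPM into an installable binary RPM.
rebuild_srpm() {
    srpm=$1

    if [ ! -f "$srpm" ]; then
        echo "rebuild_srpm: no such file: $srpm" >&2
        return 1
    fi
    if ! command -v rpmbuild >/dev/null 2>&1; then
        echo "rebuild_srpm: rpmbuild is not installed on this system" >&2
        return 2
    fi

    # Unpacks the sources, runs the build described in the .spec file,
    # and writes the finished binary RPM to the build area.
    rpmbuild --rebuild "$srpm"
}

# Usage (placeholder file name):
#   rebuild_srpm MyPackage-1.0.0.src.rpm
```

The directory that receives the finished RPM depends on your distribution and configuration; rpmbuild prints the path when the build completes. The result can then be installed with rpm -Uvh like any other package, so it is tracked in the RPM database.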


May the source be with you

If you are new to Linux, installing software has a different approach than what you are used to. However, the RPM approach to installation is elegant and provides new power which you will soon grow to appreciate.

You should become familiar with the options for working with rpms from the console, but for daily use there are front-end interface options to make rpms easier to manage. One was provided by your distribution, and others are available, such as the one in Webmin.

You are not limited by pre-compiled rpms, though. You can take advantage of the open source nature of Linux to compile applications directly from the source code. Compiling is generally easy for a mature project. Remember that code installed from source code will not have an entry in your rpm database. When working with source, consider using source rpms, which combine the power of compiled source with the manageability of rpms.

Resources

  • Check out the other parts in the Windows-to-Linux roadmap series (developerWorks, November 2003).
  • The IBM developerWorks tutorials LPI certification exam prep are very useful. Part 1 covers Red Hat and Debian package managers as well as compiling programs from sources and managing shared libraries. Part 2 covers important kernel configuration and in-kernel PCI and USB support.
  • Learn how to create your own packages with the IBM developerWorks feature Packaging software with RPM.
  • Regularly upgrading software is an important part of security. Learn how to keep your system up to date quickly and easily with the IBM developerWorks Tip on Upgrading applications from sources.
  • SourceForge is an excellent resource for open source projects; you can add your own projects as well.
  • The Corewars project is just one of many posted at SourceForge.
  • The developerWorks Open source zone hosts open-source projects at IBM.
