Charming Python: Hatch Python eggs with setuptools

A PEAK at improved installation and package management

David Mertz, Ph.D. (mertz@gnosis.cx), Developer, Gnosis Software, Inc.
David Mertz has been writing the developerWorks columns Charming Python and XML Matters since 2000. Check out his book Text Processing in Python. For more on David, see his personal Web page.

Summary:  David takes a look at the setuptools framework, a side project of the Python Enterprise Application Kit (PEAK). setuptools replaces the standard distutils library and adds versioned package and dependency management to Python. Perl users will be familiar with CPAN, and Ruby users with Gems; the tool ez_setup that bootstraps setuptools and the expanded easy_install that comes with it act in conjunction with "Cheeseshop" (the Python Package Index, also called "PyPI") to achieve the same thing. Moreover, setuptools lets you package your libraries in a single-file archive called an "egg," which is a lot like a Java™ JAR file, but for Python.

Date:  24 Oct 2006
Level:  Intermediate

The basics of the Python Enterprise Application Kit (PEAK) are covered in two previous installments of this column, "The Python Enterprise Application Kit" and "Scaling a new PEAK." In short, PEAK is a powerful framework for rapid component development and code reuse in Python.

This installment covers the setuptools framework, a PEAK side project that provides easier package management and distribution than distutils.

Getting started

The setuptools module does a really good job of "getting out of the way." For example, if you download a package that was built using setuptools rather than distutils, installation should work just as you expect: the usual dance of python setup.py install. In order to accomplish this, a package bundled using setuptools includes the small bootstrap module ez_setup.py in the archive. The only caveat here is that ez_setup.py tries to download and install the necessary setuptools package in the background -- which depends, of course, on having a networked machine. If setuptools is already installed on the local machine, this background step is not necessary; but if it needs to be installed manually, much of the transparency is lost. Still, most systems nowadays have an Internet connection; taking a few special steps for non-networked machines is not especially burdensome.

The real benefit of setuptools is not in doing roughly what distutils does -- even though it does enhance the capabilities of distutils and simplify what goes into a setup.py script. The greatest gain is setuptools' enhancement of package management capabilities. In a rather transparent way, you can find, download, and install dependencies; you can switch between multiple versions of a package, all of which are installed on the same system; you can declare requirements for specific versions of packages; and you can update to the latest versions of packages you use with a simple command. Perhaps the most impressive part of all this is that you can even use packages whose developers have done nothing whatsoever to consider setuptools compatibility.

Let's take a closer look.


Bootstrapping

The utility ez_setup.py is a simple script that bootstraps the rest of setuptools. Slightly confusingly, the easy_install script that comes with the full setuptools package does the same thing as ez_setup.py. The difference is that easy_install assumes setuptools is already installed, so it skips the behind-the-scenes installation. Both scripts accept the same arguments and switches.

The first step in the process is simply downloading the small script ez_setup.py:


Listing 1. Downloading the bootstrap script

% wget -q http://peak.telecommunity.com/dist/ez_setup.py

From there, you can run the script without any arguments to install the rest of setuptools (if you do not do this as a separate step, it will still get done the first time you install some other package). You should see something similar to this (depending, of course, on the version you are using):


Listing 2. Bootstrapping setuptools

% python ez_setup.py
Downloading http://cheeseshop.python.org/packages/2.4/s/
  setuptools/setuptools-0.6b1-py2.4.egg#md5=b79a8a403e4502fbb85ee3f1941735cb
Processing setuptools-0.6b1-py2.4.egg
creating /sw/lib/python2.4/site-packages/setuptools-0.6b1-py2.4.egg
Extracting setuptools-0.6b1-py2.4.egg to /sw/lib/python2.4/site-packages
Removing setuptools 0.6a11 from easy-install.pth file
Adding setuptools 0.6b1 to easy-install.pth file
Installing easy_install script to /sw/bin
Installing easy_install-2.4 script to /sw/bin

Installed /sw/lib/python2.4/site-packages/setuptools-0.6b1-py2.4.egg
Processing dependencies for setuptools

All done. That's all you need to do to make sure setuptools is installed on your system.


Installing packages

For many Python packages, all you need to do to install them is pass their name as a parameter to ez_setup.py or easy_install. Now that you've bootstrapped setuptools, you might as well use the internally simpler easy_install (though in practice it makes little difference which you choose).

For example, let's say you want to install the package SQLObject. This can be as simple as Listing 3. Notice in the messages that SQLObject turned out to depend on a package called FormEncode; luckily, it is all taken care of for us:


Listing 3. Installing a typical package

% easy_install SQLObject
Searching for SQLObject
Reading http://www.python.org/pypi/SQLObject/
Reading http://sqlobject.org
Best match: SQLObject 0.7.0
Downloading http://cheeseshop.python.org/packages/2.4/S/
  SQLObject/SQLObject-0.7.0-py2.4.egg#md5=71830b26083afc6ea7c53b99478e1b6a
Processing SQLObject-0.7.0-py2.4.egg
creating /sw/lib/python2.4/site-packages/SQLObject-0.7.0-py2.4.egg
Extracting SQLObject-0.7.0-py2.4.egg to /sw/lib/python2.4/site-packages
Adding SQLObject 0.7.0 to easy-install.pth file
Installing sqlobject-admin script to /sw/bin

Installed /sw/lib/python2.4/site-packages/SQLObject-0.7.0-py2.4.egg
Processing dependencies for SQLObject
Searching for FormEncode>=0.2.2
Reading http://www.python.org/pypi/FormEncode/
Reading http://formencode.org
Best match: FormEncode 0.5.1
Downloading http://cheeseshop.python.org/packages/2.4/F/
  FormEncode/FormEncode-0.5.1-py2.4.egg#md5=f8a19cbe95d0ed1b9d1759b033b7760d
Processing FormEncode-0.5.1-py2.4.egg
creating /sw/lib/python2.4/site-packages/FormEncode-0.5.1-py2.4.egg
Extracting FormEncode-0.5.1-py2.4.egg to /sw/lib/python2.4/site-packages
Adding FormEncode 0.5.1 to easy-install.pth file

Installed /sw/lib/python2.4/site-packages/FormEncode-0.5.1-py2.4.egg

As you can see from the messages, easy_install looks for metadata information about the package at www.python.org/pypi/, then finds the location for the actual download (in this case the egg archive lives right at cheeseshop.python.org; more on eggs soon).
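
As a rough illustration of what those download lines carry, the sketch below pulls apart the SQLObject URL from Listing 3 (rejoined from its two wrapped lines). It assumes the name-version-pyX.Y filename layout visible in the listings and a fragment of the form md5=...; the helper names parse_egg_url and md5_matches are invented for this example and are not part of setuptools. It is written for a current Python 3 interpreter rather than the 2.4 used in the transcripts.

```python
import hashlib
from urllib.parse import urlsplit


def parse_egg_url(url):
    """Split an egg download URL into filename fields and its md5 fragment.

    Hypothetical helper: assumes the name-version-pyX.Y.egg layout
    seen in the listings, not every filename easy_install accepts.
    """
    parts = urlsplit(url)
    filename = parts.path.rsplit("/", 1)[-1]
    stem = filename[:-len(".egg")] if filename.endswith(".egg") else filename
    fields = stem.split("-")
    # The URL fragment looks like "md5=<hex digest>".
    key, _, value = parts.fragment.partition("=")
    return {
        "filename": filename,
        "name": fields[0],
        "version": fields[1] if len(fields) > 1 else None,
        "python": fields[2] if len(fields) > 2 else None,
        "md5": value if key == "md5" else None,
    }


def md5_matches(data, expected_md5):
    """Return True if the bytes in `data` hash to the expected md5 digest."""
    return hashlib.md5(data).hexdigest() == expected_md5


url = ("http://cheeseshop.python.org/packages/2.4/S/SQLObject/"
       "SQLObject-0.7.0-py2.4.egg#md5=71830b26083afc6ea7c53b99478e1b6a")
info = parse_egg_url(url)
print(info["name"], info["version"], info["python"])
```

The checksum in the fragment is what lets a downloader confirm that the bytes it received are the ones the index advertised; md5_matches shows the comparison in its simplest form.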

You can do more than just install the latest version of a package, as is the default. If you like, you can give easy_install specific version requirements. Let's try to install a post-beta version of SQLObject:


Listing 4. Installing a minimum version of a package

% easy_install 'SQLObject>=1.0'
Searching for SQLObject>=1.0
Reading http://www.python.org/pypi/SQLObject/
Reading http://sqlobject.org
No local packages or download links found for SQLObject>=1.0
error: Could not find suitable distribution for
  Requirement.parse('SQLObject>=1.0')

If (as is the case at the time of this writing) the latest version of SQLObject is less than 1.0, there is nothing to install.
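
To make the outcome of Listing 4 concrete, here is a minimal sketch of matching a requirement string against a list of available versions. It is a deliberate simplification and not setuptools' own algorithm: it handles only one comparison per requirement and purely numeric dotted versions, and the function names (version_key, parse_requirement, best_match) are made up for the example. It targets a current Python 3 interpreter.

```python
import operator
import re

# Comparison operators this sketch understands.
_OPS = {"==": operator.eq, "!=": operator.ne, ">=": operator.ge,
        "<=": operator.le, ">": operator.gt, "<": operator.lt}

_REQUIREMENT = re.compile(
    r"^\s*([A-Za-z0-9_.\-]+?)\s*(?:(==|!=|>=|<=|>|<)\s*(\d+(?:\.\d+)*))?\s*$")


def version_key(version):
    """Turn '0.7.0' into a comparable tuple; trailing zeros are ignored."""
    key = [int(part) for part in version.split(".")]
    while key and key[-1] == 0:
        key.pop()
    return tuple(key)


def parse_requirement(text):
    """Split 'Name>=1.0' into (name, operator, version); the last two may be None."""
    match = _REQUIREMENT.match(text)
    if match is None:
        raise ValueError("unsupported requirement: %r" % text)
    return match.groups()


def best_match(requirement, available):
    """Return the highest available version satisfying the requirement, or None."""
    _name, op, wanted = parse_requirement(requirement)
    candidates = [v for v in available
                  if op is None or _OPS[op](version_key(v), version_key(wanted))]
    return max(candidates, key=version_key) if candidates else None


print(best_match("SQLObject", ["0.7.0"]))        # no constraint: newest wins
print(best_match("SQLObject>=1.0", ["0.7.0"]))   # nothing satisfies >=1.0
```

With only version 0.7.0 on offer, the second call finds no candidate, which corresponds to the "Could not find suitable distribution" error in the listing.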


Installing "naive" packages

The package SQLObject is already "setuptools aware"; but what if you want to install a package whose author has not given thought to setuptools? For example, before this article, I never used setuptools with my "Gnosis Utilities." Still, let's try installing the package, knowing only the HTTP (or FTP, SVN, CVS) location where it lives (setuptools knows all these protocols). My download Web site has archives of the various Gnosis Utilities versions, named in the usual versioned fashion:


Listing 5. Installing a setuptools-unaware package

% easy_install -f http://gnosis.cx/download/Gnosis_Utils.More/ Gnosis_Utils
Searching for Gnosis-Utils
Reading http://gnosis.cx/download/Gnosis_Utils.More/
Best match: Gnosis-Utils 1.2.1
Downloading http://gnosis.cx/download/Gnosis_Utils.More/
  Gnosis_Utils-1.2.1.zip
Processing Gnosis_Utils-1.2.1.zip
Running Gnosis_Utils-1.2.1/setup.py -q bdist_egg --dist-dir
  /tmp/easy_install-CCrXEs/Gnosis_Utils-1.2.1/egg-dist-tmp-Sh4DW1
zip_safe flag not set; analyzing archive contents...
gnosis.__init__: module references __file__
gnosis.magic.__init__: module references __file__
gnosis.xml.objectify.doc.__init__: module references __file__
gnosis.xml.pickle.doc.__init__: module references __file__
gnosis.xml.pickle.test.test_zdump: module references __file__
Adding Gnosis-Utils 1.2.1 to easy-install.pth file

Installed /sw/lib/python2.4/site-packages/Gnosis_Utils-1.2.1-py2.4.egg
Processing dependencies for Gnosis-Utils

Happily for us, easy_install figured everything out. It looked in the given download directory, identified the highest available version number, unpacked the archive, and repackaged it as an "egg" that was then installed. Importing gnosis now works fine in a script. But suppose you now need to test a script against a specific earlier version of Gnosis Utilities? Easy enough:


Listing 6. Installing a particular version of a "naive" package

% easy_install -f http://gnosis.cx/download/Gnosis_Utils.More/ \
  "Gnosis_Utils==1.2.0"
Searching for Gnosis-Utils==1.2.0
Reading http://gnosis.cx/download/Gnosis_Utils.More/
Best match: Gnosis-Utils 1.2.0
Downloading http://gnosis.cx/download/Gnosis_Utils.More/
  Gnosis_Utils-1.2.0.zip
[...]
Removing Gnosis-Utils 1.2.1 from easy-install.pth file
Adding Gnosis-Utils 1.2.0 to easy-install.pth file

Installed /sw/lib/python2.4/site-packages/Gnosis_Utils-1.2.0-py2.4.egg
Processing dependencies for Gnosis-Utils==1.2.0

There are actually two versions of Gnosis Utilities installed now, with 1.2.0 the active version. Switching the active version back to 1.2.1 is also easy:


Listing 7. Changing the "active" version system-wide

% easy_install "Gnosis_Utils==1.2.1"
Searching for Gnosis-Utils==1.2.1
Best match: Gnosis-Utils 1.2.1
Processing Gnosis_Utils-1.2.1-py2.4.egg
Removing Gnosis-Utils 1.2.0 from easy-install.pth file
Adding Gnosis-Utils 1.2.1 to easy-install.pth file

Using /sw/lib/python2.4/site-packages/Gnosis_Utils-1.2.1-py2.4.egg
Processing dependencies for Gnosis-Utils==1.2.1

Of course, this makes only one version active at a time. But, by putting two lines at the top of an individual script like so, you can let the script choose the version it wants to use:


Listing 8. Using a package version within a script

from pkg_resources import require
require("Gnosis_Utils==1.2.0")

With this stated requirement, setuptools adds the specific version (or the latest available, if a greater-than comparison is specified) to sys.path, so that the import statements that follow find it.
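
A script that states a requirement may also want to fail gracefully when the requirement cannot be met. The following is a hedged sketch of one way to do that: the wrapper name activate is invented for this example, while require, DistributionNotFound, and VersionConflict are names exported by pkg_resources. Note the assumption that pkg_resources is importable at all; it ships with setuptools, and on recent installations it may be deprecated or absent, which the sketch treats as "requirement not met."

```python
def activate(requirement):
    """Try to activate a distribution matching `requirement`.

    Returns True on success and False if the requirement cannot be
    satisfied (or if pkg_resources itself is unavailable).
    """
    try:
        import pkg_resources
    except ImportError:
        # setuptools (and with it pkg_resources) is not installed.
        return False
    try:
        pkg_resources.require(requirement)
    except (pkg_resources.DistributionNotFound,
            pkg_resources.VersionConflict):
        # Nothing suitable is installed, or a different version is
        # already active in this process.
        return False
    return True


# The distribution name below is hypothetical and should not exist.
if not activate("Hypothetical_Missing_Dist_For_Demo==0.0.1"):
    print("requirement not available; falling back")
```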


Making a package more aware of setuptools

I might like to let users install Gnosis Utilities without even knowing its download directory. This almost works, simply because Gnosis Utilities has an information listing at the Python Cheeseshop. Unfortunately, not having considered setuptools, I had created a slight "impedance mismatch" in my entry for Gnosis Utilities at python.org, http://www.python.org/pypi/Gnosis%20Utilities/1.2.1. Specifically, the archives are named on a pattern like Gnosis_Utils-N.N.N.tar.gz. (The utilities are also archived as .zip and .tar.bz2, and the last few versions as win32.exe installers, all of which setuptools is equally happy with.) But the project name on Cheeseshop is spelled slightly differently, as "Gnosis Utilities." Oh well, a quick administrative version change at Cheeseshop created http://www.python.org/pypi/Gnosis_Utils/1.2.1-a as a post-release version. Nothing was changed in the distribution archives themselves, just a little bit of metadata at Cheeseshop. With the slight tweak, we might use an even simpler install (note that for testing purposes, I ran an intervening easy_install -m to remove the installed package):


Listing 9. Easy addition of setuptools awareness

% easy_install Gnosis_Utils
Searching for Gnosis-Utils
Reading http://www.python.org/pypi/Gnosis_Utils/
Reading http://www.gnosis.cx/download/Gnosis_Utils.ANNOUNCE
Reading http://gnosis.cx/download/Gnosis_Utils.More/
Best match: Gnosis-Utils 1.2.1
Downloading [...]

I omit the completion of the process, since it's identical to what you've already seen. The only change is that easy_install looks on Cheeseshop (in other words, www.python.org/pypi/) for metadata about a package matching the name specified, and uses that to look for an actual download location. In this case, the listed .ANNOUNCE file does not contain anything helpful, but easy_install is happy to keep looking at the other listed URL as well, which proves to be a download directory.


All about eggs

An egg is a bundle that contains all the package data. In the ideal case, an egg is a zip-compressed file with all the necessary package files. But in some cases, setuptools decides (or is told by switches) that a package should not be zip-compressed. In those cases, an egg is simply an uncompressed subdirectory, but with the same contents. The single file version is handy for transporting, and saves a little bit of disk space, but an egg directory is functionally and organizationally identical. Java™ technology users who have worked with JAR files will find eggs very familiar.

You may use an egg simply by pointing PYTHONPATH or sys.path at it and importing as you normally would, thanks to the import hook changes in recent versions of Python (you need 2.3.5+ or 2.4). If you wish to take this approach, you do not need to bother with setuptools or ez_setup.py at all. For example, I put an egg for the PyYAML package in a working directory that I used for this article. I can use the package as easily as this:


Listing 10. Eggs on the PYTHONPATH

% export PYTHONPATH=~/work/dW/PyYAML-3.01-py2.4.egg
% python -c 'import yaml; print yaml.dump({"foo":"bar",1:[2,3]})'
1: [2, 3]
foo: bar
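
The same mechanism can be tried without downloading anything. The sketch below builds a tiny zip archive containing one module, puts the archive itself on sys.path, and imports from it. The module name hatchdemo and the archive name are invented for the example, and the archive is only egg-like: a real egg also carries an EGG-INFO metadata directory, which is left out here. The code is written for a current Python 3 interpreter.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_demo_archive(directory):
    """Write a one-module zip archive and return its path."""
    path = os.path.join(directory, "hatchdemo-0.1.egg")
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("hatchdemo.py",
                         "GREETING = 'hello from inside the archive'\n")
    return path


workdir = tempfile.mkdtemp()
egg_path = build_demo_archive(workdir)

# Putting the archive (not its parent directory) on sys.path is the
# in-process equivalent of the PYTHONPATH export in Listing 10.
sys.path.insert(0, egg_path)
importlib.invalidate_caches()

hatchdemo = importlib.import_module("hatchdemo")
print(hatchdemo.GREETING)
print(hatchdemo.__file__)   # a path that points inside the archive
```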

However, this sort of manipulation of the PYTHONPATH (or of sys.path within a script or Python shell session) is a bit fragile. Discovery of eggs is probably best handled within some newish magic .pth files. Any .pth files found in site-packages/ or on the PYTHONPATH are parsed for additional imports to perform, in a very similar manner to the way directories in those locations that might contain packages are examined. If you handle package management with setuptools, a file called easy-install.pth is modified when packages are installed, upgraded, removed, etc. But you may call your .pth files whatever you like (as long as they have the .pth extension). For example, here is my easy-install.pth:


Listing 11. .pth files as configuration of egg locations

% cat /sw/lib/python2.4/site-packages/easy-install.pth
import sys; sys.__plen = len(sys.path)
setuptools-0.6b1-py2.4.egg
SQLObject-0.7.0-py2.4.egg
FormEncode-0.5.1-py2.4.egg
Gnosis_Utils-1.2.1-py2.4.egg
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:];
  p=getattr(sys,'__egginsert',0); sys.path[p:p]=new;
  sys.__egginsert = p+len(new)

The format is a bit peculiar: it is almost, but not quite, a Python script. Suffice it to say that you may add additional listed eggs in there; or better yet, easy_install will do it for you when it runs. You may also create as many other .pth files as you like under site-packages/, and each may simply list which eggs to make available.
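
To see a .pth file at work in isolation, the sketch below sets up a scratch directory containing an unzipped egg-style subdirectory and a .pth file that names it, then asks the standard library's site.addsitedir() to process that directory. The directory, module, and file names (pthdemo-0.1.egg, pthdemo, demo.pth) are hypothetical, and the example assumes a current Python 3 interpreter.

```python
import os
import site
import sys
import tempfile

# A scratch directory standing in for site-packages/.
site_dir = tempfile.mkdtemp()

# An unzipped, egg-style directory holding one module.
egg_dir = os.path.join(site_dir, "pthdemo-0.1.egg")
os.mkdir(egg_dir)
with open(os.path.join(egg_dir, "pthdemo.py"), "w") as handle:
    handle.write("ANSWER = 42\n")

# Each plain line in a .pth file is a path, relative to the directory
# that contains the .pth file, to be appended to sys.path.
with open(os.path.join(site_dir, "demo.pth"), "w") as handle:
    handle.write("pthdemo-0.1.egg\n")

# Process the directory the way site-packages/ is handled at startup.
site.addsitedir(site_dir)

import pthdemo
print(pthdemo.ANSWER)
```

Lines that begin with import, like the first and last entries of Listing 11, are executed rather than treated as paths, which is how easy-install.pth manages to reorder sys.path.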


Enhancing setup scripts

The above magic of installing a setuptools naive package (see Listing 6) worked only partially. That is, the package Gnosis_Utils got installed, but not quite completely. All the general functionality works, but a variety of supporting files were omitted when the egg was automatically generated -- mostly documentation files with a .txt extension and test files with .xml extensions (but also some miscellaneous README, .rnc, .rng, .xsl, and whatnot scattered around the subpackages). As it happens, all of these supporting files are "nice to have" and not strictly required. Still, we would like to include all the supporting files.

The setup.py script for Gnosis_Utils is quite complex, actually. Besides listing basic metadata, in 467 lines of code, it performs a whole bunch of testing for Python version capabilities and bugs; works around glitches in old versions of distutils; falls back to skipping installation of non-supported parts (for example, if pyexpat is not included in your Python distribution); handles OS line-ending convention conversion; creates multiple archive/installer types; and rebuilds the MANIFEST file in response to these tests. The capability to do all this work is mostly thanks to the package co-maintainer, Frank McIngvale; and it lets Gnosis_Utils successfully install as far back as Python 1.5.1, if necessary (with reduced capabilities in earlier versions). The quick moral here is that what I am about to show you does not do as much as the complex distutils script: it simply assumes that a "normal"-looking and recent version of Python is installed. That said, it is still impressive just how easy setuptools can make an installation script.

As a first try, let's create a setup.py script borrowing from the setuptools manual, and try creating an egg using it:


Listing 12. setuptools setup.py script

% cat setup.py
from setuptools import setup, find_packages
setup(
    name = "Gnosis_Utils",
    version = "1.2.2",
    packages = find_packages(),
)
% python setup.py -q bdist_egg
zip_safe flag not set; analyzing archive contents...
gnosis.__init__: module references __file__
gnosis.doc.__init__: module references __file__
gnosis.magic.__init__: module references __file__
gnosis.xml.objectify.doc.__init__: module references __file__
gnosis.xml.pickle.doc.__init__: module references __file__
gnosis.xml.pickle.test.test_zdump: module references __file__

This little effort works; or at least it sort of works. It really does create an egg with these few lines, but the egg has the same shortcoming as the version easy_install created: it lacks the support files that are not named .py. So let's try again, only a little harder:


Listing 13. Adding the missing package_data

from setuptools import setup, find_packages
setup(
    name = "Gnosis_Utils",
    version = "1.2.2",
    package_data = {'':['*.*']},
    packages = find_packages(),
)

It turns out that is all you need to do. Of course, in practice you'll often want to fine-tune this a bit. A more realistic setting might look like the following:


Listing 14. Packaging specific file types

package_data = {'doc':['*.txt'], 'xml':['*.xml', 'relax/*.rnc']}

This translates as: include the .txt files under the doc/ subpackage, all the .xml files under the xml/ subpackage, and all the .rnc files under the xml/relax/ subpackage.
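
One way to check what such patterns would pick up, before building an egg, is to expand them by hand. The sketch below imitates the selection with the standard glob module: each pattern is resolved relative to its package directory. It is an approximation for illustration, not setuptools' own code; the function name collect_package_data and the file names in the scratch tree are invented. It assumes a current Python 3 interpreter.

```python
import glob
import os
import tempfile


def collect_package_data(root, package_data):
    """Return {package: sorted files matching its patterns}.

    Paths in the result are relative to each package directory.
    """
    selected = {}
    for package, patterns in package_data.items():
        # Dotted package names map onto nested directories under root.
        package_dir = os.path.join(root, *package.split("."))
        matches = set()
        for pattern in patterns:
            for path in glob.glob(os.path.join(package_dir, pattern)):
                if os.path.isfile(path):
                    matches.add(os.path.relpath(path, package_dir))
        selected[package] = sorted(matches)
    return selected


# Build a throwaway source tree with hypothetical file names.
root = tempfile.mkdtemp()
for relative in ["doc/intro.txt", "doc/logo.png", "xml/sample.xml",
                 "xml/relax/schema.rnc", "xml/relax/notes.txt"]:
    path = os.path.join(root, *relative.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()

package_data = {'doc': ['*.txt'], 'xml': ['*.xml', 'relax/*.rnc']}
selected = collect_package_data(root, package_data)
for package, files in sorted(selected.items()):
    print(package, files)
```

In this tree, doc/logo.png and xml/relax/notes.txt are left out because no pattern names them, which is the kind of omission worth catching before a release.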


Conclusion

I really just scratched the surface of the customization you can perform with setuptools-aware distributions. For example, once you have a distribution (either in the preferred egg format or another archive type), you can automatically upload the archive and metadata to Cheeseshop with a single command. Obviously, a complete setup.py script should contain the same detailed metadata that your old distutils scripts contained; I skipped that for ease of presentation, but the argument names are compatible with distutils.

It takes a little while to get fully comfortable with setuptools' large set of capabilities, but it really makes both maintaining your own packages and installing outside packages much easier than the distutils baseline. And if all you care about is installing packages, pretty much everything you need to know is contained in this introduction; the complexity only comes with describing your own packages, and that complexity is still less than required to grok distutils.


Resources

Get products and technologies

  • At the Python Cheese Shop, get the latest version of setuptools.

  • Gnosis Utilities, David's handy set of Python libraries, are available from the Cheese Shop.
