Front and center of a successful open source project is the packaging. A key ingredient to good packaging is versioning. Because the project is open source, you will want to publish your package to realize the many benefits the open source community offers. Different platforms and languages have different mechanisms for packaging, but this article focuses specifically on Python and its packaging ecosystem. The article discusses packaging mechanics to give you a foundation on which to grow and provides enough practical examples to get you started immediately.
Why worry about packaging?
Beyond just being the right thing to do, there are three practical reasons to package your software:
- Ease of use
- Stability (with versioning)
- Distribution
It's an act of consideration to your users to make your application as
effortless as possible to install. Packaging makes your software more
accessible and easier to install. If it's easier to install, it will be
easier for users to start using your software. By publishing your package on
the Python Package Index (PyPI), you'll make it easily accessible through
tools such as pip or easy_install. (See Resources for links to more
information on these tools.)
In addition, by versioning your packages, you enable your users to "pin" the dependency in their project on your software to a particular version. For example, pinning Pinax to the 0.9a2.dev1017 version would be expressed as:
Pinax==0.9a2.dev1017
This would enforce that the project used the 0.9a2.dev1017 release of Pinax.
Versioning ensures greater stability should you release changes to your software later that might have breaking interfaces. It allows your users to know exactly what they are getting and makes it easier for them to track differences in releases. Furthermore, project developers can know exactly what they are coding against.
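To see what pinning looks like from the consuming side, here is a minimal sketch that scans a pip-style requirements list and separates pinned entries from floating ones. The helper name split_pins and the second sample entry are made up for illustration, and real requirements files support more syntax than this handles.

```python
def split_pins(requirements_text):
    """Return (pinned, floating): a dict of name -> version and a list of names."""
    pinned, floating = {}, []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip()] = version.strip()
        else:
            floating.append(line)
    return pinned, floating


# Hypothetical requirements list: one pinned entry, one floating entry.
sample = """
Pinax==0.9a2.dev1017
SomeOtherPackage
"""

pinned, floating = split_pins(sample)
print(pinned)    # {'Pinax': '0.9a2.dev1017'}
print(floating)  # ['SomeOtherPackage']
```

A floating entry installs whatever release the index currently considers the latest, which is exactly the uncertainty that pinning removes.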
A common method for publishing packages to PyPI (or your own distribution server) is to create a source distribution to upload. A source distribution is a standard way of packaging the source of your project as a distributable unit. There are ways to create binary distributions, but for the sake of open source, it makes sense also to distribute your source. Creating source distributions makes it easy for people to use tools that will look up the software on the Internet, download it, and install it all automatically. This process helps not only with local development but also with deployments of your software.
So, by making it easier for users to integrate and install your software, using good versioning that allows a reliable pinning technique, and then publishing your package for greater distribution, you will have a greater chance of your project being successful and gaining wider adoption. Wider adoption may lead to more contributors—something every open source developer surely desires.
Anatomy of a setup.py file
One of the purposes of the setup.py script is to serve as the executable you can run to package your software and upload it to distribution servers. The setup.py script can vary quite a bit in content as you browse around popular Python repositories. This article focuses on the basics. See the Resources section to explore further on your own.
You can use the setup.py file for many different tasks, but here you create one that will enable you to run the following commands:
python setup.py register
python setup.py sdist upload
The first command, register, takes the information supplied to the setup()
function within the setup.py script and creates an entry on PyPI for your
package. It won't upload anything; rather, it creates the metadata about
your project so that you can subsequently upload and host releases there.
The next two commands are chained together: sdist upload builds a source
distribution, and then uploads it to PyPI. There are a few prerequisites,
however, such as setting up your .pypirc configuration file and actually
writing the contents of setup.py.
First, configure your .pypirc file. This should reside in your home
directory, which will vary depending on your operating system. On
UNIX®, Linux®, and Mac OS X, you can get there by typing
cd ~/. The contents of the file should contain
your PyPI credentials, as shown in Listing 1.
Listing 1. A typical .pypirc file
[distutils]
index-servers =
    pypi

[pypi]
username:xxxxxxxxxxxxx
password:xxxxxxxxxxxxx
Next, go to PyPI and register for an account (don't worry: it's free). Put the same user name and password you created on PyPI in your .pypirc file, and make sure the file is named ~/.pypirc.
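Before running an upload, it can help to confirm that the file actually contains what the tools expect. The following is a rough sketch, not part of any packaging tool: the function name missing_pypirc_settings is hypothetical, and the credentials in the sample text are placeholders. It parses .pypirc-style text with the standard library's configparser and reports any server that lacks a user name or password.

```python
import configparser


def missing_pypirc_settings(pypirc_text):
    """Return a list of problems found in .pypirc-style text (empty if none)."""
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    if not parser.has_option("distutils", "index-servers"):
        return ["no index-servers entry in [distutils]"]
    problems = []
    # index-servers holds one server name per line.
    for server in parser.get("distutils", "index-servers").split():
        for key in ("username", "password"):
            if not parser.has_option(server, key):
                problems.append("[%s] is missing %s" % (server, key))
    return problems


# Placeholder content in the same shape as Listing 1.
sample = """\
[distutils]
index-servers =
    pypi

[pypi]
username:example-user
password:example-password
"""

print(missing_pypirc_settings(sample))  # []
```

To check a real file, you could pass in the text read from os.path.expanduser("~/.pypirc").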
Now, in writing your setup.py script, you have to decide what you want displayed on the PyPI index page and what you want to name your project. Start by copying a template for setup.py that I use for projects (see Listing 2). Skipping the imports and functions, look at the bottom of the template and what you need to change to suit your project. See Resources for a link to the full script.
Listing 2. The setup.py template
PACKAGE = ""
NAME = ""
DESCRIPTION = ""
AUTHOR = ""
AUTHOR_EMAIL = ""
URL = ""
VERSION = __import__(PACKAGE).__version__

setup(
    name=NAME,
    version=VERSION,
    description=DESCRIPTION,
    long_description=read("README.rst"),
    author=AUTHOR,
    author_email=AUTHOR_EMAIL,
    license="BSD",
    url=URL,
    packages=find_packages(exclude=["tests.*", "tests"]),
    package_data=find_package_data(
        PACKAGE,
        only_in_packages=False
    ),
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Environment :: Web Environment",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: BSD License",
        "Operating System :: OS Independent",
        "Programming Language :: Python",
        "Framework :: Django",
    ],
    zip_safe=False,
)
First, notice that this template expects your project to have two different
files. The first is used for the
long_description: It reads the contents of the
README.rst file that's in the same directory as setup.py and passes the
contents as a string to the long_description
parameter. This file populates the landing page on PyPI, so it's a good
idea to briefly describe the project and show examples of usage in this
file. The second file is the package's __init__.py file. It's not
explicitly mentioned here, but the line that sets the VERSION variable
imports your package; and when it does, Python needs an __init__.py file
and expects a variable defined in that module called
__version__. For now, just set it as a string:
# __init__.py
__version__ = "0.1"
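The two file dependencies can be tried out in isolation. The sketch below builds a throwaway project in a temporary directory, then reads README.rst and imports the package to fetch __version__, much as the template does. The package name dogs_sketch and both helper functions are hypothetical stand-ins: the template's own read() helper is not shown in this article, so this version, which takes the project directory as an extra argument, is only an approximation.

```python
import importlib
import os
import sys
import tempfile


def read(project_dir, filename):
    """Return the text of a file that sits next to setup.py."""
    with open(os.path.join(project_dir, filename)) as handle:
        return handle.read()


def package_version(project_dir, package):
    """Import the package from project_dir and return its __version__."""
    sys.path.insert(0, project_dir)
    try:
        importlib.invalidate_caches()
        return __import__(package).__version__
    finally:
        sys.path.remove(project_dir)


# Build a throwaway project so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as project_dir:
    package_dir = os.path.join(project_dir, "dogs_sketch")  # hypothetical package
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write('__version__ = "0.1"\n')
    with open(os.path.join(project_dir, "README.rst"), "w") as handle:
        handle.write("Dogs\n====\n\nCatches dogs.\n")

    long_description = read(project_dir, "README.rst")
    version = package_version(project_dir, "dogs_sketch")

print(version)  # 0.1
```

Because the version is obtained by importing the package, anything else that __init__.py imports must also be importable at packaging time.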
Now, let's look at the rest of the inputs:
- PACKAGE is the Python package in your project. It's the
top-level folder containing the __init__.py module that should be in
the same directory as your setup.py file—for example:
/-
  |- README.rst
  |- setup.py
  |- dogs
     |- __init__.py
     |- catcher.py
dogs would be your package here.
- NAME is usually similar to or the same as your PACKAGE name
but can be whatever you want. The NAME is what people will refer to
your software as, the name under which your software is listed in PyPI
and—more importantly—under which users will install it
(for example, pip install NAME).
- DESCRIPTION is just a short description of your project. A sentence will suffice.
- AUTHOR and AUTHOR_EMAIL are what they sound like: your name and email address. This information is optional, but it's good practice to supply an email address if people want to reach you about the project.
- URL is the URL for the project. This URL may be a project website, the GitHub repository, or whatever URL you want. Again, this information is optional.
You may want to provide the license and classifiers also. For more information on creating a setup.py file, check out the Python documentation. (See Resources.)
Versioning could easily be a topic unto itself, but it is worth mentioning in the context of packaging, as good packaging involves proper versioning. Versioning is a form of communication with your users: It allows them to build more stability and reliability into their own products. Through versioning, you are telling your users that you have changed something and are giving explicit boundaries for where those changes occurred.
You can find a standard for versioning Python packages in Python Enhancement Proposal (PEP) 386. (See Resources.) It spells out rules that are pragmatic. Even if you don't read and understand or even agree with the PEP, it would be wise to follow it, as it's what more and more Python developers are used to seeing.
In addition, versioning is not just for stable releases that you upload to
PyPI but is also useful for development releases using the devNN
suffix. It's not typically good to upload these dev versions to PyPI, but
you can still make them publicly available by setting up your own public
(or private) distribution server; then, users who want to use the
bleeding-edge version can reference that in their
pip requirements.txt file. Here are a few
examples of versioning:
1.0.1        # 1.0.1 final release
1.0.2a       # 1.0.2 Alpha (for Alpha, after Dev releases)
1.0.2a.dev5  # 1.0.2 Alpha, Dev release #5
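The ordering implied by these examples (a dev release sorts before the alpha it leads up to, and an alpha sorts before the final release) can be sketched with a small sort key. This is a simplified illustration that only understands the shapes shown above, not a full implementation of the PEP's rules. The function name version_key is hypothetical, and release numbers of different lengths (1.0 versus 1.0.0) are not normalized.

```python
import re

# Matches forms such as 1.0.1, 1.0.2a, 1.0.2a.dev5 and 0.9a2.dev1017.
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|c|rc)(\d*))?(?:\.dev(\d+))?$")
_STAGE_ORDER = {"a": 0, "b": 1, "c": 2, "rc": 2}
_FINAL = 3


def version_key(version):
    """Return a tuple that sorts version strings from oldest to newest."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release, stage, stage_number, dev = match.groups()
    release_key = tuple(int(part) for part in release.split("."))
    if stage:
        stage_key = (_STAGE_ORDER[stage], int(stage_number or 0))
    elif dev is not None:
        stage_key = (-1, 0)  # a dev release of a final sorts before any alpha
    else:
        stage_key = (_FINAL, 0)
    # Within the same stage, dev releases come first.
    dev_key = (0, int(dev)) if dev is not None else (1, 0)
    return (release_key, stage_key, dev_key)


versions = ["1.0.2a", "1.0.1", "1.0.2a.dev5"]
print(sorted(versions, key=version_key))
# ['1.0.1', '1.0.2a.dev5', '1.0.2a']
```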
People are not generally going to find and install your software without it
being published. Most of the time, you will want to publish your packages
on PyPI. After you set up your .pypirc configuration file, the
upload command you pass to setup.py transmits
your package to PyPI. Typically, you do so in conjunction with building a
source distribution:
python setup.py sdist upload
If you are using your own distribution server, add a section for authorization in your .pypirc file for this new location, and refer to it by name when uploading:
python setup.py sdist upload -r mydist
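As a rough illustration of what that extra section might contain, the sketch below generates .pypirc text with a second server entry named mydist. The repository URL and credentials are placeholders, and the helper build_pypirc is hypothetical; only the section and key names (index-servers, repository, username, password) follow the usual .pypirc layout.

```python
import configparser
import io


def build_pypirc(servers):
    """Render .pypirc text; servers maps a section name to its settings."""
    parser = configparser.RawConfigParser()
    parser.add_section("distutils")
    # One server name per line, as in Listing 1.
    parser.set("distutils", "index-servers", "\n" + "\n".join(servers))
    for name, settings in servers.items():
        parser.add_section(name)
        for key, value in settings.items():
            parser.set(name, key, value)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


pypirc_text = build_pypirc({
    "pypi": {"username": "example-user", "password": "example-password"},
    "mydist": {
        "repository": "https://dist.example.com/",  # placeholder URL
        "username": "example-user",
        "password": "example-password",
    },
})
print(pypirc_text)
```

The name passed to -r has to match the section name, which in turn has to appear in the index-servers list.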
Set up your own distribution server
The primary reason for using your own distribution server in open source is to provide a place to publish dev releases, as PyPI should really just consist of stable releases. For example, you probably want:
pip install MyPackage
... to install the latest stable release found on PyPI. However, if you add later dev releases, that command will end up installing the latest release, period, which means your dev release. It's generally good practice always to pin a release, but not all users will do this. Therefore, ensure that not specifying a version number always returns the latest stable release.
One way to have your cake (only expose stable releases for default use of
pip) and to eat it too (enable users to install
packaged dev releases) is to host your own distribution server. The Pinax
project does this for all its dev releases at
http://dist.pinaxproject.com. (See Resources.)
The distribution server is just an index served up over Hypertext Transfer Protocol (HTTP) of files on your server. It should have the following file structure:
You can then make the server private if you so desire by configuring
Basic-Auth on your web server. You may want to
add some facility to upload source distributions, as well. To do so, you
need to add code to handle the upload, parse the file name, and create the
directory paths to match the scheme above. This structure is in place for
the Pinax project, which hosts several repositories.
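The upload-handling step (parse the file name, then derive the directory path) might look something like the sketch below. The layout it produces, /index-name/package-name/file-name, is an assumption made for this example rather than a confirmed description of the Pinax server, and both function names are hypothetical.

```python
import posixpath

_SDIST_SUFFIXES = (".tar.gz", ".zip")


def parse_sdist_name(filename):
    """Split 'Name-version.tar.gz' into (name, version)."""
    for suffix in _SDIST_SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            break
    else:
        raise ValueError("not a source distribution: %r" % filename)
    # The version is whatever follows the last hyphen.
    name, dash, version = stem.rpartition("-")
    if not (name and dash and version):
        raise ValueError("cannot find a version in %r" % filename)
    return name, version


def upload_path(index_name, filename):
    """Return the server path for an uploaded file (assumed layout)."""
    name, _version = parse_sdist_name(filename)
    return posixpath.join("/", index_name, name, filename)


print(upload_path("fresh-start", "Pinax-0.9a2.dev1017.tar.gz"))
# /fresh-start/Pinax/Pinax-0.9a2.dev1017.tar.gz
```

Splitting on the last hyphen keeps project names that themselves contain hyphens intact, as long as the version string does not contain one.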
pip and virtualenv
Although this article has focused primarily on packaging, this section describes consuming packages, providing a bit of appreciation for what good packaging and versioning give your users.
pip is a tool that you can install directly, but I recommend using it as
part of virtualenv. (See Resources.) I recommend using virtualenv for
everything related to Python, as it keeps your Python environments clean.
Much as a virtual machine can allow you to run multiple operating systems
side by side, virtualenvs allow you to run multiple Python environments
side by side. I don't install anything in my system Python but rather
create a new virtualenv for each new project or utility I work on.
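The one-environment-per-project habit can also be scripted. The sketch below uses the standard library's venv module rather than virtualenv itself, so it illustrates the idea and is not the tool discussed here. The project names are made up, and pip is deliberately left out of the environments (with_pip=False) to keep the example quick.

```python
import os
import tempfile
import venv


def create_environment(env_dir):
    """Create an isolated environment and return the path of its config file."""
    # system_site_packages=False keeps system-wide packages out of the environment.
    builder = venv.EnvBuilder(system_site_packages=False, with_pip=False)
    builder.create(env_dir)
    return os.path.join(env_dir, "pyvenv.cfg")


with tempfile.TemporaryDirectory() as workspace:
    created = []
    for project in ("project-a", "project-b"):  # hypothetical project names
        marker = create_environment(os.path.join(workspace, project))
        created.append(os.path.exists(marker))

print(created)
```

Each environment gets its own directory with its own interpreter entry point and site-packages, so packages installed for one project never leak into another.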
Now that you have installed virtualenv (along with virtualenvwrapper,
which provides the mkvirtualenv command used here), you can
play for a moment:
$ mkvirtualenv --no-site-packages testing
$ pip install Pinax
$ pip freeze|grep Pinax
$ pip uninstall Pinax
$ pip install --extra-index-url=http://dist.pinaxproject.com/fresh-start/ Pinax==0.9a2.dev1017
$ pip freeze|grep Pinax
Notice that the first pip install downloaded and installed Pinax from
PyPI. pip freeze shows the versions of all packages installed in your
current virtualenv. pip uninstall does exactly what you think it does: It
removes the package from the virtualenv. Next, you install a dev version
from the fresh-start repository at http://dist.pinaxproject.com to get the
development version of Pinax, 0.9a2.dev1017.
No going to websites, downloading tarballs, and symlinking code to a site-package. (That is how I used to do it, and it caused many problems.) Your users get all this as a result of good packaging, publishing, and versioning of your project.
The bottom line is that it's well worth your while to spend some time
learning the art and science of packaging. You'll gain greater adoption by
users because of the ease of installation and stability that versioning
your packages gives them. Using the template setup.py provided in Resources and covered in this article, you
should be able to add packaging to your project quickly and easily.
Communicating to your users through proper versioning is considerate of
your users, making it easy for them to track changes from release to
release. Finally, as pip and
virtualenv gain wider adoption, reliance on
published packages—whether on PyPI or on your own distribution
servers—increases. Therefore, make sure you publish the projects
that you want to share with the world.
I hope that this article has provided enough to get you started. The Resources should provide documentation to help you dive deeper. If you have questions, don't hesitate to hop onto Freenode and find me in such chat rooms as #pinax and #django-social (with the nickname "paltman") or on Twitter (@paltman).
Resources
- Check out PyPI.
- With setuptools 0.6c11, learn how to download, build, install, upgrade, and uninstall Python packages easily.
- Check out PEP 0386 for more information about changing the version comparison module in distributions.
- Learn more about distributing Python modules.
- The Open Source developerWorks zone provides a wealth of information on open source tools and using open source technologies. See other Python-related articles we've published.
- Tune in developerWorks podcasts and listen to interesting interviews and discussions for software developers.
- Stay current through our Technical events and webcasts.
- Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.
- Watch and learn about IBM and open source technologies and product functions with the no-cost developerWorks demos.
Get products and technologies
- Learn more about
- Learn more about
- Find the setup.py template used in this article.
- Explore Pinax, an open source platform built on top of Django.
- Innovate your next open source development project with IBM trial software.