A guide to Python packaging
Front and center of a successful open source project is the packaging. A key ingredient to good packaging is versioning. Because the project is open source, you will want to publish your package to realize the many benefits the open source community offers. Different platforms and languages have different mechanisms for packaging, but this article focuses specifically on Python and its packaging ecosystem. The article discusses packaging mechanics to give you a foundation on which to grow and provides enough practical examples to get you started immediately.
Why worry about packaging?
Beyond just being the right thing to do, there are three practical reasons to package your software:
- Ease of use
- Stability (with versioning)
- Distribution
It's an act of consideration to your users to make your application as effortless as possible to install. Packaging makes your software more accessible, and the easier it is to install, the easier it is for users to start using it. By publishing your package on the Python Package Index (PyPI), you make it readily available through utilities like pip or easy_install. (See Related topics for links to more information on these tools.)
In addition, by versioning your packages, you enable your users to "pin" their project's dependency on your software to a particular version. For example, pinning Pinax to the 0.9a2.dev1017 version would be expressed as:
Pinax==0.9a2.dev1017
This ensures that the project uses the 0.9a2.dev1017 release of Pinax.
Versioning ensures greater stability if you later release changes that break existing interfaces. It lets your users know exactly what they are getting and makes it easier for them to track differences between releases. Furthermore, project developers can know exactly what they are coding against.
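To make the idea of a pin concrete, here is a minimal sketch that splits a pinned requirement line into its name and version. The parse_pin helper is a hypothetical name used only for this illustration; tools like pip parse requirement lines far more thoroughly.

```python
def parse_pin(requirement):
    """Split a pinned requirement such as "Pinax==0.9a2.dev1017" into (name, version)."""
    name, separator, version = requirement.strip().partition("==")
    if not separator or not name.strip() or not version.strip():
        raise ValueError("not a pinned requirement: %r" % requirement)
    return name.strip(), version.strip()


# The pin used as the example in this article.
name, version = parse_pin("Pinax==0.9a2.dev1017")
print(name, version)
```

A line without "==" (for example, just "Pinax") raises ValueError in this sketch, which mirrors the point above: without a pin, the user has not said which release they are coding against.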
A common method for publishing packages to PyPI (or your own distribution server) is to create a source distribution to upload. A source distribution is a standard way of packaging the source of your project as a distributable unit. There are ways to create binary distributions, but for the sake of open source, it makes sense also to distribute your source. Creating source distributions makes it easy for people to use tools that will look up the software on the Internet, download it, and install it all automatically. This process helps not only with local development but also with deployments of your software.
So, by making it easier for users to integrate and install your software, using good versioning that allows a reliable pinning technique, and then publishing your package for greater distribution, you will have a greater chance of your project being successful and gaining wider adoption. Wider adoption may lead to more contributors—something every open source developer surely desires.
Anatomy of a setup.py file
One of the purposes of the setup.py script is to serve as the executable you can run to package your software and upload it to distribution servers. The setup.py script can vary quite a bit in content as you browse around popular Python repositories. This article focuses on the basics. See the Related topics section to explore further on your own.
You can use the setup.py file for many different tasks, but here you create one that will enable you to run the following commands:
python setup.py register
python setup.py sdist upload
The first command, register, takes the information supplied in the setup() function within the setup.py script and creates an entry on PyPI for your package. It won't upload anything; rather, it creates the metadata about your project so that you can subsequently upload and host releases there. The next two commands are chained together: sdist upload builds a source distribution, and then uploads it to PyPI. There are a few prerequisites, however, such as setting up your .pypirc configuration file and actually writing the contents of setup.py.
First, configure your .pypirc file. This should reside in your home
directory, which will vary depending on your operating system. On
UNIX®, Linux®, and Mac OS X, you can get there by typing
cd ~/. The contents of the file should contain your PyPI
credentials, as shown in Listing 1.
Listing 1. A typical .pypirc file
[distutils]
index-servers =
    pypi

[pypi]
username:xxxxxxxxxxxxx
password:xxxxxxxxxxxxx
Next, go to PyPI and register for an account (don't worry: it's free). Put the same user name and password you created on PyPI in your .pypirc file, and make sure the file is named ~/.pypirc.
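If you want to sanity-check a .pypirc before uploading, the following sketch may help. It assumes the file uses the INI layout shown in Listing 1 and reads it with Python's standard configparser module; the check_pypirc function and the SAMPLE_PYPIRC text are hypothetical names for this example, not part of any packaging tool.

```python
import configparser

# Same layout as Listing 1; the credentials are placeholders.
SAMPLE_PYPIRC = """\
[distutils]
index-servers =
    pypi

[pypi]
username:xxxxxxxxxxxxx
password:xxxxxxxxxxxxx
"""


def check_pypirc(text):
    """Return (servers, incomplete): the listed index servers, and those
    among them that lack a section, a username, or a password."""
    # RawConfigParser avoids treating "%" in a password as interpolation.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    servers = parser.get("distutils", "index-servers").split()
    incomplete = []
    for server in servers:
        has_credentials = (
            parser.has_section(server)
            and parser.has_option(server, "username")
            and parser.has_option(server, "password")
        )
        if not has_credentials:
            incomplete.append(server)
    return servers, incomplete


servers, incomplete = check_pypirc(SAMPLE_PYPIRC)
print(servers, incomplete)
```

To check your real file, you could pass in the text read from the path returned by os.path.expanduser("~/.pypirc").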
Now, in writing your setup.py script, you have to decide what you want displayed on the PyPI index page and what you want to name your project. Start by copying a template for setup.py that I use for projects (see Listing 2). Skipping the imports and functions, look at the bottom of the template and at what you need to change to suit your project. See Related topics for a link to the full script.
Listing 2. The setup.py template
PACKAGE = ""
NAME = ""
DESCRIPTION = ""
AUTHOR = ""
AUTHOR_EMAIL = ""
URL = ""
VERSION = __import__(PACKAGE).__version__

setup(
    name=NAME,
    version=VERSION,
    description=DESCRIPTION,
    long_description=read("README.rst"),
    author=AUTHOR,
    author_email=AUTHOR_EMAIL,
    license="BSD",
    url=URL,
    packages=find_packages(exclude=["tests.*", "tests"]),
    package_data=find_package_data(
        PACKAGE,
        only_in_packages=False
    ),
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Environment :: Web Environment",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: BSD License",
        "Operating System :: OS Independent",
        "Programming Language :: Python",
        "Framework :: Django",
    ],
    zip_safe=False,
)
First, notice that this template expects your project to have two different files. The first is used for the long_description: It reads the contents of the README.rst file that's in the same directory as setup.py and passes the contents as a string to the long_description parameter. This file populates the landing page on PyPI, so it's a good idea to briefly describe the project and show examples of usage in this file. The second file is the package's __init__.py file. It's not explicitly mentioned here, but the line that sets the VERSION variable imports your package; when it does, Python needs an __init__.py file and expects that module to define a variable called __version__. For now, just set it as a string:
# __init__.py
__version__ = "0.1"
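To see why the VERSION line depends on __init__.py, here is a small sketch that builds a throwaway package and reads its __version__ by importing it, much as the template does with __import__(PACKAGE). The package name dogs_demo and the read_package_version helper are hypothetical, invented for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def read_package_version(project_dir, package):
    """Import `package` from `project_dir` and return its __version__ string."""
    sys.path.insert(0, str(project_dir))
    try:
        module = importlib.import_module(package)
        return module.__version__
    finally:
        # Leave sys.path as we found it.
        sys.path.remove(str(project_dir))


# Build a temporary project containing a hypothetical package named "dogs_demo".
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "dogs_demo"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text('__version__ = "0.1"\n')
    demo_version = read_package_version(tmp, "dogs_demo")

print(demo_version)
```

If __init__.py did not define __version__, the attribute lookup would raise AttributeError, which is the failure you would see when running setup.py against a package that lacks it.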
Now, let's look at the rest of the inputs:
- PACKAGE is the Python package in your project. It's the
top-level folder containing the __init__.py module that should be in
the same directory as your setup.py file—for example:
/-
  |- README.rst
  |- setup.py
  |- dogs
     |- __init__.py
     |- catcher.py
dogs would be your package here.
- NAME is usually similar to or the same as your PACKAGE name but can be whatever you want. The NAME is what people will call your software: the name under which it is listed on PyPI and, more importantly, under which users will install it (for example, pip install NAME).
- DESCRIPTION is just a short description of your project. A sentence will suffice.
- AUTHOR and AUTHOR_EMAIL are what they sound like: your name and email address. This information is optional, but it's good practice to supply an email address if people want to reach you about the project.
- URL is the URL for the project. This URL may be a project website, the GitHub repository, or whatever URL you want. Again, this information is optional.
You may want to provide the license and classifiers also. For more information on creating a setup.py file, check out the Python documentation. (See Related topics.)
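The template in Listing 2 calls a read helper to load README.rst, but its body is among the functions skipped in the listing. As a rough sketch, and assuming the helper does nothing more than return a file's text, it might look something like this; the base_dir parameter is my own addition so the example can run against a temporary directory.

```python
import os
import tempfile


def read(filename, base_dir="."):
    """Return the text of `filename`, looked up relative to `base_dir`."""
    path = os.path.join(base_dir, filename)
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Demonstrate with a temporary README.rst standing in for a real project file.
with tempfile.TemporaryDirectory() as project_dir:
    readme_path = os.path.join(project_dir, "README.rst")
    with open(readme_path, "w", encoding="utf-8") as handle:
        handle.write("dogs\n====\n\nA package for catching dogs.\n")
    long_description = read("README.rst", base_dir=project_dir)

print(long_description.splitlines()[0])
```

In a real setup.py you would typically resolve the path relative to the directory containing setup.py rather than the current working directory.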
Versioning could easily be a topic unto itself, but it is worth mentioning in the context of packaging, as good packaging involves proper versioning. Versioning is a form of communication with your users: It allows them to build more stability and reliability into their own products. Through versioning, you are telling your users that you have changed something and are giving explicit boundaries for where those changes occurred.
You can find a standard for versioning Python packages in Python Enhancement Proposal (PEP) 386. (See Related topics.) It spells out rules that are pragmatic. Even if you don't read and understand or even agree with the PEP, it would be wise to follow it, as it's what more and more Python developers are used to seeing.
In addition, versioning is not just for stable releases that you upload to
PyPI but is also useful for development releases using the devNN
suffix. It's not typically good to upload these dev versions to PyPI, but
you can still make them publicly available by setting up your own public
(or private) distribution server; then, users who want to use the
bleeding-edge version can reference that in their
requirements.txt file. Here are a few examples of versioning:
1.0.1        # 1.0.1 final release
1.0.2a       # 1.0.2 Alpha (for Alpha, after Dev releases)
1.0.2a.dev5  # 1.0.2 Alpha, Dev release #5
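The ordering implied by these examples can be sketched with a small sort key. This is a simplified illustration, not a full implementation of the PEP's rules: it assumes versions consist of a dotted release number, an optional a/b/c/rc pre-release tag, and an optional .devN suffix, and it ignores post-releases. The version_key function and VERSION_PATTERN are hypothetical names for this example.

```python
import re

# Dotted release, optional pre-release tag, optional .devN suffix.
VERSION_PATTERN = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)"
    r"(?:(?P<pre>a|b|c|rc)(?P<pre_num>\d*))?"
    r"(?:\.dev(?P<dev>\d+))?$"
)


def version_key(version):
    """Return a tuple that sorts dev releases before pre-releases before finals."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = tuple(int(part) for part in match.group("release").split("."))
    if match.group("pre"):
        pre = (0, match.group("pre"), int(match.group("pre_num") or 0))
    else:
        pre = (1, "", 0)  # a final release sorts after any pre-release
    if match.group("dev") is not None:
        dev = (0, int(match.group("dev")))
    else:
        dev = (1, 0)  # a non-dev release sorts after its own dev releases
    return (release, pre, dev)


examples = ["1.0.2a", "1.0.1", "1.0.2a.dev5"]
ordered = sorted(examples, key=version_key)
print(ordered)
```

Sorting the three examples places 1.0.1 first, then 1.0.2a.dev5, then 1.0.2a, which matches the note that the Alpha comes after its Dev releases. One limitation of this sketch is that release numbers of different lengths, such as 1.0 and 1.0.0, compare as different versions.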
People are generally not going to find and install your software unless it is published. Most of the time, you will want to publish your packages on PyPI. After you set up your .pypirc configuration file, the upload command you pass to setup.py transmits your package to PyPI. Typically, you do so in conjunction with building a source distribution:
python setup.py sdist upload
If you are using your own distribution server, add a section for authorization in your .pypirc file for this new location, and refer to it by name when uploading:
python setup.py sdist upload -r mydist
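As a sketch of what the extended .pypirc might contain, the example below embeds one in a Python string and looks up the repository URL for a named server with configparser. The server name mydist comes from the command above; the repository URL https://dist.example.com/ and the credentials are placeholders, and repository_for is a hypothetical helper written for this illustration.

```python
import configparser

# Listing 1 extended with a second, self-hosted server. All values are placeholders.
EXTENDED_PYPIRC = """\
[distutils]
index-servers =
    pypi
    mydist

[pypi]
username:xxxxxxxxxxxxx
password:xxxxxxxxxxxxx

[mydist]
repository:https://dist.example.com/
username:xxxxxxxxxxxxx
password:xxxxxxxxxxxxx
"""


def repository_for(text, server):
    """Return the repository URL configured for `server` in .pypirc text."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    listed = parser.get("distutils", "index-servers").split()
    if server not in listed:
        raise KeyError("server %r is not listed under index-servers" % server)
    return parser.get(server, "repository")


print(repository_for(EXTENDED_PYPIRC, "mydist"))
```

The name passed to -r has to match both an entry under index-servers and a section of the same name, which is what the lookup above checks.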
Set up your own distribution server
The primary reason for using your own distribution server in open source is to provide a place to publish dev releases, as PyPI should really just consist of stable releases. For example, you probably want:
pip install MyPackage
... to install the latest stable release found on PyPI. However, if you upload later dev releases to PyPI, that command installs the latest release, period, which means your dev release. It's generally good practice always to pin a release, but not all users will do so. Therefore, ensure that not specifying a version number always returns the latest stable release.
One way to have your cake (only expose stable releases for default use of
pip) and to eat it too (enable users to install packaged dev
releases) is to host your own distribution server. The Pinax project does
this for all its dev releases at http://dist.pinaxproject.com. (See Related topics.)
The distribution server is just an index served up over Hypertext Transfer Protocol (HTTP) of files on your server. It should have the following file structure:
You can then make the server private if you so desire by configuring
Basic-Auth on your web server. You may want to add some
facility to upload source distributions, as well. To do so, you need to
add code to handle the upload, parse the file name, and create the
directory paths to match the scheme above. This structure is in place for
the Pinax project, which hosts several repositories.
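The upload-handling step (parse the file name, then create matching directory paths) might be sketched as follows. This assumes a layout of index name, then package name, then the distribution file, and it assumes .tar.gz source distributions whose version contains no hyphen; the function names are hypothetical and this is not the code the Pinax server runs.

```python
from pathlib import PurePosixPath


def split_sdist_name(filename):
    """Split "Pinax-0.9a2.dev1017.tar.gz" into ("Pinax", "0.9a2.dev1017")."""
    suffix = ".tar.gz"
    if not filename.endswith(suffix):
        raise ValueError("expected a .tar.gz source distribution: %r" % filename)
    stem = filename[: -len(suffix)]
    # Assumption: the version is everything after the last hyphen.
    name, separator, version = stem.rpartition("-")
    if not separator or not name or not version:
        raise ValueError("cannot find a name and version in %r" % filename)
    return name, version


def storage_path(index_name, filename):
    """Build the relative path under which an uploaded file would be stored."""
    name, _version = split_sdist_name(filename)
    return PurePosixPath(index_name) / name / filename


print(storage_path("fresh-start", "Pinax-0.9a2.dev1017.tar.gz"))
```

A real upload handler would also need to authenticate the request and create the directories on disk; this sketch covers only the naming logic.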
pip and virtualenv
Although this article has focused primarily on packaging, this section describes consuming packages, providing a bit of appreciation for what good packaging and versioning give your users.
pip is a tool that you can install directly, but I recommend using it as part of virtualenv. (See Related topics.) I recommend using virtualenv for everything related to Python, as it keeps your Python environments clean. Much as a virtual machine allows you to run multiple operating systems side by side, virtualenv allows you to run multiple Python environments side by side. I don't install anything in my system Python but rather create a new virtualenv for each new project or utility I work on.
Now that you have installed virtualenv, you can experiment a bit:
$ mkvirtualenv --no-site-packages testing
$ pip install Pinax
$ pip freeze|grep Pinax
$ pip uninstall Pinax
$ pip install --extra-index-url=http://dist.pinaxproject.com/fresh-start/ Pinax==0.9a2.dev1017
$ pip freeze|grep Pinax
Notice that the first pip installation downloaded and installed from PyPI. pip freeze shows all versions of packages installed in your current virtualenv. pip uninstall does exactly what you think it does: it removes the package from the virtualenv. Next, you install a dev version from the fresh-start repository at http://dist.pinaxproject.com to get the development version of Pinax, 0.9a2.dev1017.
No going to websites, downloading tarballs, and symlinking code to a site-package. (That is how I used to do it, and it caused many problems.) Your users get all this as a result of good packaging, publishing, and versioning of your project.
The bottom line is that it's well worth your while to spend some time
learning the art and science of packaging. You'll gain greater adoption by
users because of the ease of installation and stability that versioning
your packages gives them. Using the template setup.py provided in Related topics and covered in this
article, you should be able to add packaging to your project quickly and
easily. Communicating to your users through proper versioning is
considerate of your users, making it easy for them to track changes from
release to release. Finally, as
pip and virtualenv gain wider adoption, reliance on published
packages—whether on PyPI or on your own distribution
servers—increases. Therefore, make sure you publish the projects
that you want to share with the world.
I hope that this article has provided enough to get you started. The Related topics should provide documentation to help you dive deeper. If you have questions, don't hesitate to hop onto Freenode and find me in such chat rooms as #pinax and #django-social (with the nickname "paltman") or on Twitter (@paltman).
Related topics
- Check out PyPI.
- With setuptools 0.6c11, learn how to download, build, install, upgrade, and uninstall Python packages easily.
- Check out PEP 386 for more information about changing the version comparison module in distributions.
- Learn more about distributing Python modules.
- The Open Source developerWorks zone provides a wealth of information on open source tools and using open source technologies. See other Python-related articles we've published.
- Learn more about and download pip and virtualenv.
- Find the setup.py template used in this article.
- Explore Pinax, an open source platform built on top of Django.