Creating a successful open source Python project involves more than just writing useful code. It's about community engagement, increasing cooperation opportunities, craftsmanship, and support. Explore best practices to help you create your own successful project.

Patrick T. Altman, VP of Engineering, Eldarion

Patrick Altman is a core developer on Pinax and has created and contributes to many other open source projects. He is currently VP of Engineering at Eldarion. Previously, he was the Principal Software Engineer at StudioNow, which was later sold to AOL. He currently resides in Nashville, Tennessee, with his wife and three children.



10 January 2012


The ecosystem for open source Python projects is both rich and diverse. This enables you to stand on the shoulders of giants in the production of your next open source project. In addition, it means that there's a set of community norms and best practices. By adhering to these conventions and applying the practices in your project, you may gain wider adoption for your software.

This article covers practices that have worked well for building large and small projects that have gained wide user communities. Treat the suggestions offered here as reasonable defaults rather than strict dogma, because results may vary.

First let's discuss how decoupling processes can lead to a stronger community, with more throughput across the spectrum of writing, maintaining, and supporting open source software.

Collaboration versus cooperation

During DjangoCon 2011, David Eaves gave a keynote address that eloquently put into words the notion that although collaboration and cooperation have similar definitions, there is a subtle difference:

"I would argue that collaboration, unlike cooperation, requires the parties involved in a project jointly to solve problems."

Eaves also devoted an entire post to how GitHub drove innovation in the way open source works, particularly in community management. In "How GitHub Saved OpenSource" (see Resources), Eaves states:

"I believe open source projects work best when contributors are able to engage in low transaction cost cooperation and high transaction cost collaboration is minimized. The genius of open source is that it does not require a group to debate every issue and work on problems collectively, quite the opposite."

He goes on to talk about the value of forking and how it reduces the high costs of collaboration by enabling low-cost cooperation among people able to take projects forward without permission. This forking pushes off the need for coordination until solutions are ready to be merged in, enabling much more rapid and dynamic experimentation.

You can shape your project in similar ways, with the same goal of increasing low-cost cooperation while minimizing expensive collaboration throughout writing, maintaining, and supporting your project.


Writing

Starting with a blank slate, you are creating something fresh and new: something innovative, or maybe just something slightly different from what already exists. There is nothing like starting a new project and sharing with the world the product of your efforts.

Unlike maintaining, when you are writing, you are creating something new rather than modifying or fixing something that exists. Writing and crafting a project is an art form in addition to a science. Others will see the implementation and make judgments about the quality of the code, and your name is on it forever.

Therefore, it's important to understand the mindset of a craftsman and approach writing software accordingly. Writing your new project also means more than generating code: The creation and crafting of your project includes writing beautifully styled code that is a pleasure to read, creating tests that validate functionality in your project where appropriate, and producing thorough and helpful documentation.

Craftsmanship

Craft generally refers to an art, trade, or occupation requiring special skill to make something by hand, usually a physical object made in small-scale production. You can stretch this definition to apply to software in the sense that a software craftsman focuses on quality rather than on volume.

For the craftsman, it is important that the product be appealing, not just functional. Specifically, in software, a craftsman works to make sure that the code is clean and aesthetically pleasing, that application programming interfaces (APIs) are beautiful, and that documentation and tests give users a sense of working with a solid product.

Working in this mindset is rewarding for the soul and the source of a lot of the enjoyment of producing open source software: You are free from answering to deadlines, clients, and other external demands. Take your time and enjoy making something beautiful.

Code style and linting

Python Enhancement Proposal (PEP) 8 (see Resources) is a detailed Python style guide on which you should base your project's code, or at least your project's own style guide. It's not important to be dogmatic about PEP 8, but the closer your work is to PEP 8, the easier it is for other Python developers to submit clean patches in the standard Python community style.
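
As a small illustration of what following PEP 8 changes in practice, here is a sketch of the same function written twice. Both versions behave identically; the function and its names are hypothetical and exist only to show the naming and whitespace conventions.

```python
# Before: runs fine, but departs from PEP 8 in naming and spacing.
def CalcTotal( items,taxRate ):
    return sum( [i for i in items] )*( 1+taxRate )


# After: lowercase_with_underscores names, no padding inside parentheses,
# spaces around binary operators, and a docstring.
def calc_total(items, tax_rate):
    """Return the sum of items with tax applied."""
    return sum(items) * (1 + tax_rate)
```

A contributor reading the second version does not have to guess which conventions a patch should follow.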

In addition to style conformance, code linting is valuable for catching errors such as missing imports and undefined variables. Several linters and style checkers will inspect your code for deviations from a default set of rules or from rules you configure. The most popular utilities are:

  • pyflakes
  • pylint
  • pep8

See Resources for links to these tools.

Whatever set of conventions you choose to adhere to, if they deviate from PEP 8, I recommend documenting them to make those who want to contribute to your project aware of the coding style you employ. Better to be explicit than implicit.

pyflakes is a particularly useful linter. It strikes a good balance, catching and highlighting real errors without fussing over minor eccentricities. Here's an example session of using pyflakes on a Python project:

$ pyflakes kaleo
kaleo/forms.py:1: 'form' imported but unused
kaleo/forms.py:4: undefined name 'forms'
kaleo/forms.py:6: undefined name 'forms'

Right away, the tool tells me that there is an import typo. Looking in kaleo/forms.py, I see:

1: from django import form
2: 
3: class InviteForm(forms.Form):
4:    email_address = forms.EmailField()

...which tells me to change line 1 to "from django import forms".
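
To make the idea of linting more concrete, here is a minimal sketch of the kind of check a linter performs, written with the standard library's ast module. It is an illustration only and not how pyflakes is implemented: the function name find_unused_imports is hypothetical, and the check ignores scoping, __all__, and many other cases a real linter handles.

```python
import ast


def find_unused_imports(source):
    """Return sorted (line, name) pairs for imported names never read."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import statement
    used = set()   # every bare name that is read somewhere in the module

    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as b" binds "b".
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = node.lineno
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            used.add(node.id)

    return sorted(
        (line, name) for name, line in imported.items() if name not in used
    )


# The source is only parsed, never executed, so Django need not be installed.
SAMPLE = '''from django import form

class InviteForm(forms.Form):
    email_address = forms.EmailField()
'''

for line, name in find_unused_imports(SAMPLE):
    print("line %d: '%s' imported but unused" % (line, name))
```

Run against the sample, the sketch reports the name form on line 1 as imported but unused, which is the same clue that points to the typo in the session above.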

Tests

It's always good to have tests in your project that validate that your code works, to prevent regressions going unnoticed, and in some cases to serve as a form of documentation, where reading the test code can inform others of how the API of your library works.

That said, I wouldn't judge a project's completeness or viability on whether it includes tests or how complete those tests are. The presence of tests doesn't guarantee the quality of the code. It might be a controversial idea, but I believe that it's better to have no tests at all than to test the wrong thing. When you do write tests, exercise each unit under test with a variety of inputs, including edge cases and invalid values.
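
As a sketch of what varying the inputs can look like, the following uses the standard library's unittest module with subTest (available in Python 3.4 and later). The function normalize_email is a hypothetical unit under test, invented here for illustration.

```python
import unittest


def normalize_email(address):
    """Hypothetical unit under test: trim whitespace, lowercase the domain."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % address)
    return local + "@" + domain.lower()


class NormalizeEmailTests(unittest.TestCase):
    def test_valid_addresses(self):
        cases = [
            ("user@example.com", "user@example.com"),
            ("  user@EXAMPLE.com  ", "user@example.com"),
            ("Mixed.Case@Example.COM", "Mixed.Case@example.com"),
        ]
        for raw, expected in cases:
            # subTest reports each failing input separately.
            with self.subTest(raw=raw):
                self.assertEqual(normalize_email(raw), expected)

    def test_invalid_addresses(self):
        for raw in ["", "no-at-sign", "@example.com", "user@"]:
            with self.subTest(raw=raw):
                with self.assertRaises(ValueError):
                    normalize_email(raw)
```

Saved in a test module, these cases can be run with "python -m unittest". Listing the inputs as data keeps it cheap to add another case whenever a bug report reveals one you missed.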

Documentation

Unlike tests, however, you can judge the quality and craftsmanship of a project based on the quality and extensiveness of its documentation. Approach authoring and maintaining your documentation the same way you approach your code. Well-written and in-depth documentation will inspire contributors to follow suit and make your project more approachable for users.

Using tools such as Sphinx and Read the Docs (see Resources), you can have published, up-to-date docs that look fantastic. Using these tools is as simple as writing the words and pushing the commits. Get into the habit of committing documentation changes together with the code changes they describe, wherever appropriate.
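
One way to keep documentation close to the code is to write it in docstrings. Here is a sketch of a docstring in the reStructuredText field style that Sphinx's autodoc extension can pull into generated pages. The function invite_url and its parameters are hypothetical; the embedded example can also be checked with the standard library's doctest module.

```python
def invite_url(base_url, token):
    """Build the URL a user follows to accept an invitation.

    :param base_url: Site root, with or without a trailing slash.
    :param token: Opaque invitation token.
    :returns: The absolute invitation URL as a string.

    >>> invite_url("https://example.com/", "abc123")
    'https://example.com/invites/abc123/'
    """
    return "%s/invites/%s/" % (base_url.rstrip("/"), token)
```

Because the docstring lives in the same file as the function, a change to the behavior and the change to its documentation naturally land in the same commit.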


Maintaining

After you have released the first version on the Python Package Index (PyPI), announced it through tweets and blog posts, and started getting some users, maintenance joins your ongoing authorship work. Users will report bugs, request features, ask questions that the documentation does not answer, and more.

Some requests you will decline, perhaps suggesting a workaround; others you will want to address, either in the documentation or in the code. Using a distributed version control system (DVCS) like git and releasing frequent developer packages can make maintenance much less of a chore.

Source control

Many DVCSs are available, including git and Mercurial (see Resources). Whichever version control system you choose, make sure it is distributed, which lets users fork your project and work on bugs themselves.

Pull requests

A pull request is a message sent from one fork of a repository to another (typically the parent) requesting that the owner of that repository pull a batch of changes, or commits. Because the pull request process increases cooperation while reducing the cost of collaboration, the innovation cycle can accelerate.

The rate at which changes are made depends on many factors, a critical one being the target audience (for example, other developers, nontechnical end users). If you're writing for developers, then encouraging bug reports or feature requests to come with pull requests can really lower the burden for the maintainer. It also increases the sense of community, as people have their contributions merged into future releases.

Dev builds

You will want to put out dev releases early and often, frequently after each new set of patches. This allows other developers who use your project in their work to run against your latest changes with greater ease. The more people use the code in different situations, the higher the quality will be when it comes time to release a new stable version.
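
One common way to label such builds, offered here as an assumption rather than something this article prescribes, is to append a .devN suffix to the version of the upcoming release. Python packaging tools sort a version of that form before the final release. The helper dev_version below is hypothetical.

```python
def dev_version(next_release, build_number):
    """Return a development version string such as '0.4.0.dev3'."""
    if build_number < 0:
        raise ValueError("build_number must be non-negative")
    return "%s.dev%d" % (next_release, build_number)


# Three successive development builds leading up to a hypothetical 0.4.0.
for n in range(1, 4):
    print(dev_version("0.4.0", n))
```

Numbering the builds this way lets developers who depend on your project pin a specific dev build while they test against your latest changes.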


Supporting

Support goes with maintenance. It is critical to engage and build a community out of your users and contributors. Empower others to help you with support, and you are increasing your project's overall cooperation factor, enabling better scalability in your project's size as well as the natural increase in ideas for solving users' problems.

To that end, be sure to offer multiple channels to increase reach and make it easier for users to engage you and the project. Channel options include IRC, mailing lists, and social media venues such as Twitter.

IRC

Setting up a channel on an IRC network such as freenode is a good idea. I set one up for my project, nashvegas; and although it's rare that there's a user other than me, my IRC client runs unobtrusively in the background. When the occasional user has a question, I have been able to respond with little transactional cost to me and in a much more dynamic way than is available through email.

Mailing list

It's standard practice for most open source projects to have a mailing list for support as well as discussing development progress amongst contributors. I recommend keeping it to one mailing list, splitting it into "users" and "dev" lists only when the volume gets so high it causes noise for one group or the other.

Twitter

Get a Twitter handle for your project where people can publicly talk to you about your work. A Twitter account can also serve as a good place to make project announcements.


Conclusion

Writing and contributing to open source software in the Python community can be a fun and rewarding experience. Focusing on reducing high-cost collaboration while increasing opportunities for low-cost cooperation can help your projects grow with active contributors. In open source, you have a lot of freedom to be a craftsman when it comes to your project: Make the most of this and enjoy it. Focus on consistent code style, solid tests, and well-written documentation to improve the adoption rate of your project by users and other developers. In addition, use a DVCS, be attentive to pull requests, and publish frequent development releases. Finally, you can further increase the adoption and growth of your project by providing multiple channels of support and enabling the community to assist you in providing that support.

Resources

Get products and technologies

  • Many linters are available to use. Popular choices include pyflakes, pylint, and the Python style guide checker pep8.
  • When you're ready to add the documentation for your project, several online tools can make the job easier. Two such options are Sphinx and Read the Docs.
  • git and Mercurial both make excellent DVCSs.
