Python testing frameworks: Finding modules to test

The recent emergence of industrial-strength Python testing frameworks means that Python tests are being written more succinctly, more uniformly, and with better reporting of results than ever before. Here we look at how the leading testing frameworks provide robust auto-discovery of your application tests, and how this replaces the fragile central lists of tests that you used to maintain.


Brandon Craig Rhodes (brandon@rhodesmill.org), Independent Consultant, Rhodes Mill Studios, Inc.

Brandon Craig Rhodes is the Editor-in-Chief of Python Magazine, and an independent web application consultant with more than a decade of experience with the Python language. He has maintained his PyEphem extension module, which provides an object-oriented interface to industrial-grade computational astronomy routines, for more than nine years, and it is used by astronomers on several continents. Brandon also coordinates the Python Atlanta user's group.



02 June 2009


The Python programming community is well known for its advocacy of unit testing and functional testing. These practices not only help ensure that components and applications are written correctly the first time, but also that they stay working through months and years of further tweaks and improvements.

This article is the second in a three-part series on modern Python testing frameworks. The first article in this series introduced zope.testing, py.test, and nose, and began to describe how they can change the way that Python projects write and maintain their tests. This second article details the differences in how the three frameworks are invoked, in how they examine a project to discover tests, and how they select which of those tests then get run. Finally, the third article will look at all of the reporting features that have been developed to let testing support more and more powerful techniques.

The dark ages of Python testing

Python project testing was once a very ad-hoc and personal affair. A developer might start by writing each batch of tests as a separate Python script. Later, he might write a script with a name like test_all.py or tests.py that imported and ran all of his tests together. But, however well he automated the process, his approach was unavoidably idiosyncratic: every developer who joined the project had to discover where the test scripts lived and how to invoke them. If a particular Python developer found himself working on, say, a dozen different projects, then he might have a dozen different testing commands to remember.

A further danger was that the test_all.py script, or whatever a particular project called it, typically imported every other test module by hand. If this central list of tests became out of date, usually because a developer added a new test suite that he ran by hand and forgot to register in the central script, then entire files full of tests could be omitted from the final test runs performed before a Python package was put into production.

A final drawback to this testing anarchy is that it required every test file to contain the boilerplate code necessary to run as a separate command. If you look through much Python documentation, or even inside of some Python projects today, you will see dozens of testing examples like this:

# test_old.py - The old way of doing things

import unittest

class TruthTest(unittest.TestCase):
    def testTrue(self):
        assert True == 1

    def testFalse(self):
        assert False == 0

if __name__ == '__main__':
    unittest.main()

The first article in this series has already addressed why TestCase class-based testing is often not necessary in the modern world. But now, turn your attention to those last two lines of code: what are they there to accomplish? The answer is that they detect when this test_old.py script has been run stand-alone from the command line, in which case they run a unittest convenience function that will search through the module for tests and run them. This is what lets this file of tests be run separately from the project-wide test script.

Many of the drawbacks to copying identical code like this into dozens, or hundreds, of different test modules should be quite obvious. One drawback that might be less obvious is the lack of standardization that this encourages. If the stock unittest.main() boilerplate is not quite clever enough to detect the tests of some particular module, then that module will probably be extended with additional behaviors that do not match how the other test suites operate. Each module can therefore wind up with subtly different conventions for how test classes are named, how they operate, and how they are run.
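With a modern framework doing the discovery, the boilerplate disappears entirely. A minimal sketch of what the same tests look like when written for py.test or nose (the file and function names here are illustrative, chosen to match the naming conventions discussed later in this article):

```python
# test_new.py - a hypothetical boilerplate-free test module.
# py.test and nose discover this file by its name, then find
# and run the test functions themselves, so no
# "if __name__ == '__main__'" stanza is needed.

def test_true():
    assert True == 1

def test_false():
    assert False == 0
```

Because the framework, not the module, is responsible for collecting and running the tests, every test file in the project automatically follows the same conventions.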


The space age of Python testing

Thanks to the arrival of major Python testing frameworks, all of the problems outlined above have been solved and, as we will see, they have been solved in roughly the same way by each of the frameworks.

To begin with, all three test frameworks provide some standard way of running tests from your operating system command line. This eliminates the need for every Python project to keep a global testing script somewhere in their code base.

The zope.testing package, as one would expect, has the most idiosyncratic mechanism for running tests: since Zope developers tend to set up their projects using buildout, they tend to install their testing script through a zc.recipe.testrunner recipe in the buildout.cfg file. But the result is quite consistent across different projects: in every Zope project that I have ever come across, the development buildout creates a ./bin/test script which can be counted on to invoke the project's tests.

The py.test and nose projects made a more interesting decision. They each offer a command-line tool that completely removes the need for each project to have its own testing command:

# Run "py.test" on the project
# in the current directory...

$ py.test

# Run "nose" on the project
# in the current directory...

$ nosetests

The py.test and nosetests tools even have a few command-line options that they share in common, like the -v option that makes them print the name of each test as it is executed. The day might soon arrive when a Python programmer, simply by being familiar with these two tools, will be able to run the tests of most publicly available Python packages.

But, there is one last level of standardization possible! Most Python projects these days include a top-level setup.py file with their source code that supports commands like:

# Common commands supported by setup.py files

$ python setup.py build
$ python setup.py install

Many Python projects these days use the setuptools package to support extra setup.py commands beyond those available through standard Python, including a test command that runs all of a project's tests:

# If a project's setup.py uses "setuptools"
# then it will provide a "test" command too

$ python setup.py test


This would be the pinnacle of standardization: if projects all supported setup.py test consistently, then developers would be presented with a uniform interface for running the testing suites of all Python packages. Happily, nose supports setup.py by providing an entry point that invokes the same test-running routines as are used by the nosetests command:

# A setup.py file that uses "nose" for testing

from setuptools import setup

setup(
    # ...
    # package metadata
    # ...
    setup_requires = ['nose'],
    test_suite = 'nose.collector',
    )

Of course, most developers will probably keep using nosetests even when their project provides a setup.py entry point, because nosetests provides more powerful command-line options. But for a new developer who just wants to check whether a package is even working on his platform before he tries to track down a bug or add a new feature, a test_suite entry point is a wonderful convenience.


Automatic Python module discovery

A key feature of zope.testing, py.test, and nose is that they all search a project's source code tree in an attempt to find all of its tests without having to have them centrally listed. But the rules by which they discover tests are somewhat different, and might be worth reviewing before making a choice between the frameworks.

The first step that a testing framework takes is to select which directories it will search for files that might contain tests. Note that all three frameworks start in the base directory of your entire project; if you are testing a package called example, then they all start looking for tests in the parent directory that contains example. The three frameworks make somewhat different choices, however, about which directories they search:

  • The zope.testing tool descends recursively into all directories that are Python packages, meaning that they contain an __init__.py file (which is what signals to Python that they can be imported with the import statement). This means that data and code in non-package directories is safe from being inspected, but, on the other hand, this means that every test you write will be one that a programmer could theoretically reach with the import statement if he wanted to. Some programmers will find this uncomfortable, and wish that they could place tests somewhere that made them invisible to normal users of their package.
  • The py.test command descends into every directory and sub-directory in your project, whether a directory looks like a Python package or not. Beware that it appears to have a bug when two adjacent directories share tests of the same name. If, for example, an adjacent dir1/test.py and dir2/test.py file each have a test called test_example, then py.test will run the first test twice and ignore the second test entirely! If you are writing tests for py.test and hiding them beneath non-package directories, be careful that you keep their names unique.
  • The nose test runner implements a middle way between the other two tools: it descends into every Python package, but is only willing to examine directories that have the word test in their name. This means that if you want to write “Keep Out” across a directory, so that nose will not try delving into it to find tests, you can just be careful not to include the word test in its name. Unlike py.test, nose operates correctly even if adjacent directories contain tests with the same name (but it is still helpful to keep test names distinct, so that you can tell which is which when looking at test results with the -v option).
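The contrast between these three directory-selection rules can be sketched in a few lines of Python. This is purely an illustration of the behaviors just described, not the frameworks' actual code (nose's real rule, for example, is a configurable regular expression):

```python
import os

def zope_testing_descends(dirpath):
    # zope.testing: only descend into directories that are
    # Python packages, meaning they contain an __init__.py file.
    return os.path.isfile(os.path.join(dirpath, '__init__.py'))

def py_test_descends(dirpath):
    # py.test: descend into every directory, package or not.
    return True

def nose_descends(dirpath):
    # nose, roughly: only examine directories whose own name
    # contains the word "test".
    return 'test' in os.path.basename(dirpath).lower()

# nose will look inside "functional_tests" but skip "docs":
assert nose_descends('project/functional_tests')
assert not nose_descends('project/docs')
```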

Once they have selected which directories to search, the three testing tools have remarkably similar behavior: they all look for Python modules (that is, files ending with .py) matching some specific pattern. The zope.testing tool by default uses the regular expression "tests", which will only find files named tests.py and ignore all others. You can either use a command-line option or your buildout.cfg to specify an alternative regular expression:

# Snippet of a buildout.cfg file that searches for tests
# in any Python module starting with "test" or "ftest".

[test]
recipe = zc.recipe.testrunner
eggs = my_package
defaults = ['--tests-pattern', 'f?test']

The py.test tool is more rigid, and always looks for Python modules whose names either start with test_ or end with _test. The nosetests command is more flexible, and uses a regular expression (which, for the curious, is “((?:^|[\b_\.-])[Tt]est)”) that selects any module that either starts with the word test or Test, or has that word following a word boundary. You can specify a different regular expression either at the command line with the -m option, or by setting the option in your project's .noserc file.
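The effect of nose's default pattern can be checked directly at the Python prompt by applying the regular expression quoted above to a few sample module names:

```python
import re

# nose's default test-matching expression, as quoted above.
NOSE_PATTERN = re.compile(r'((?:^|[\b_\.-])[Tt]est)')

# Names starting with "test" or "Test", or containing the word
# after a boundary character like "_", are selected...
assert NOSE_PATTERN.search('test_widgets')
assert NOSE_PATTERN.search('widgets_test')
assert NOSE_PATTERN.search('TestTruth')

# ...but "test" buried inside another word is not enough.
assert not NOSE_PATTERN.search('protest')
assert not NOSE_PATTERN.search('smoketest')
```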

Which approach is best? While some developers always prefer flexibility and, really, it would be a terrible hassle if the zope.testing tool's interests could not be broadened out to more modules than just ones with a tests.py filename, I actually prefer py.test in this case. All projects that use py.test will necessarily share a common convention for how their tests are named, making them easier to read and maintain for other programmers. When either of the other frameworks is used instead, then reading or creating test files will take two steps: first, you have to go look at what this particular project uses as the regular expression, and only then can you start usefully inspecting its code. And if you are working on several projects at once, then you might find yourself having to keep up with several different test-file naming conventions simultaneously.


Including docfiles and doctests in a test suite

The obtrusive three-chevron Python prompt, >>>, has wound up being a wonderfully obvious signal, in a longer document, that the writing is illustrating what should happen at the Python prompt. This can occur in stand-alone text files that are acting as documentation, as we saw in the first article in this series:

Doctest for truth and falsehood
-------------------------------

The truth values in Python, named "True" and "False",
are equivalent to the Boolean numbers one and zero.

>>> True == 1
True
>>> False == 0
True

Such illustrations can also occur right inside of source code, in the docstring of a module, class, or function:

def count_vowels(s):
    """Count the number of vowels in a string.

    >>> count_vowels('aardvark')
    3
    >>> count_vowels('THX')
    0

    """
    return sum(1 for c in s if c in 'aeiou')

When these tests occur in a text file, as in the first example, then the file is called a docfile. When they occur inside of docstrings inside of Python source code, like in the second example, they are called doctests.

Since docfiles and doctests are very common ways to write documentation that itself serves as a test (and that will also signal when it has gone out of date and become incorrect), they are directly supported by both py.test and nose. (Users of zope.testing will have to manually create Python test cases for each file by using the DocTestSuite class from the standard doctest module.)
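For zope.testing users, wiring a module's doctests into a standard unittest suite looks roughly like this. The module below is built by hand purely so that the example is self-contained; in real use you would pass your own imported module to DocTestSuite:

```python
import doctest
import types
import unittest

# A stand-in module whose docstring contains a doctest;
# ordinarily this would be one of your project's real modules.
example = types.ModuleType('example')
example.__doc__ = """
>>> True == 1
True
"""

# DocTestSuite wraps each doctest in the module as a unittest
# test case, so any unittest runner can execute it.
suite = doctest.DocTestSuite(example)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert result.wasSuccessful()
```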

As with its rules for finding test modules, the py.test framework has fixed procedures for supporting doctests that do not seem to be configurable, choosing standardization across projects rather than flexibility within particular ones.

  • If you activate its -p doctest plugin, then it will look for doctests both in the documentation strings of all of your Python modules (even the ones without the word test in their names), and also in any text files whose names both start with test_ and end with the .txt extension.
  • If you activate its -p restdoc plugin, then not only do any doctests in your .txt files get tested, but py.test will insist that every .txt file in your project be valid reStructuredText, and will complain if any of them cause parsing errors. The plugin can also, through further command-line options, be asked to check any URLs that you specify in your documentation, and to go ahead and generate HTML versions of each of your .txt documentation files.

A very similar set of features is supported by nose but, as you are probably guessing, it provides a bit more flexibility.

  • Choosing --doctest-tests is the least intrusive option, and simply asks nose to watch out for doctests in the docstrings of the test modules that it is already examining.
  • The --with-doctest option is more aggressive, and asks nose to look through all of your normal modules, the ones that it does not think are tests but that contain ordinary code, and to find and run any doctests that occur inside of their docstrings.
  • Finally, --doctest-extension lets you specify a filename extension of your own choice (most developers I know would choose .txt or .rst or perhaps .doctest). This asks nose to read through all text files in your project with the given extension, running and verifying any doctests that it finds.

Although py.test and nose have a quite similar feature set here, I actually prefer the approach of nose. I like using the non-standard .rst extension for all of my reStructuredText files, so that I can teach my text editor to recognize them and give them special syntax highlighting.


The nose framework and executable modules

There is one caveat that should be made about the nose framework: it will, by default, avoid Python modules that are marked as executable. (You can mark a file as an executable command on Linux® with something like a chmod +x command.) The nose framework ignores such files because modules that are designed to be run straight from the command line might perform actions that make them unsafe to import.

You can make commands safe to import, however, by protecting the actual actions they perform with an if statement that checks whether the module is being run or simply imported:

#!/usr/bin/env python
# Sample Python command

if __name__ == '__main__':
    print "This has been run from the command line!"

If you take this precaution in every one of your commands, and therefore know them to be safe to import, then you can give nose the --exe command-line option and it will examine such modules anyway.

I actually prefer the behavior of py.test in this case: by ignoring whether Python modules are executable, it winds up with rules that are simpler than those of nose, and that force good practices (like protecting command logic with an if statement). But if you want to experiment with using a test framework against a legacy application perhaps containing dozens of executable modules of whose code quality you are unsure, then nose does seem like the safer tool.


Conclusion

This article has now covered all of the details of how these three Python testing frameworks examine your code base and select which modules they think contain tests. By providing automated discovery based on uniform conventions, any of the three major testing frameworks will help you write more consistent test suites whose tests can be detected and examined by machine. But what do the testing frameworks do next? What do they look for inside of those modules? That is the topic of the third article in this series!

