A number of popular toys are based on the concept of simple building units that can be combined in many ways to construct new -- and sometimes unexpected -- creations. This same concept applies to real-world construction, where basic materials are combined to build useful objects. Common materials, techniques, and tools simplify the development of new buildings and also simplify the training of those entering the field.
The same underlying concept applies to the art of developing computer programs, including programs written in the Python programming language. This article discusses how to use Python to create basic building blocks that can be used to solve more complex problems. These building blocks can be small and simple or large and complex. Either way, the name of the game is to define the building blocks and then use them to create your own masterpiece.
Throughout the previous articles in this series, you generally had to re-enter any code, even if it was identical to a previous line of code. The only exception to this requirement has been the use of variables: You initialize the contents of a variable once, then you can reuse it as often as needed. Clearly, it would be useful to do this more often.
One of the most popular aphorisms used to describe good programmers is that they're lazy. This doesn't mean that good programmers don't work hard -- but they like to work smart and never redo something unless absolutely necessary. This means you need to start looking at ways to reuse previously written code. In Python, there are many ways to do this, but the simplest technique is to use functions, which also are known as methods or subroutines.
Like most modern programming languages, Python supports the use of methods to encapsulate a set of statements that can be used over and over as necessary. Listing 1 presents a simple, pseudocode outline of how you can write a method in Python.
Listing 1. Pseudocode for defining a function
    def myFunction(optional input data):
        initialize any local data
        actual statements that do the work
        optionally return any results
As you can see, a function in Python basically consists of wrapper code indicating that a series of Python statements will be reused. The function can take input parameters, which are supplied in parentheses that follow the function name (in this case, myFunction). The function can also return a value (or, more formally, an object), including a Python container such as a tuple.
Before building a real function, let's review several points about the pseudocode that are important, but easy to overlook:
- Notice the character case used in the function name: most characters are lowercase, but when multiple words are joined in the name, the first letter of each new word is capitalized (for example, the F in myFunction). This approach is known as camel casing and is a popular technique in many languages for making function names easier to read. (Python's own style guide, PEP 8, recommends lowercase words separated by underscores for function names, such as my_function; this series uses camel casing throughout.)
- The program statements inside the function definition are indented. The body of a function consists of a block of Python statements; they must be indented just like the body of a loop or conditional statement.
- The first line of a function definition, also known as the method signature, begins with def (shorthand for define).
- The method signature ends with a colon, which indicates that the body of the function follows on subsequent lines.
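To make the outline concrete, here is a minimal sketch that fills in each part of the pseudocode. The function name describeNumbers and its behavior are hypothetical, chosen only for illustration. The example uses print(...) with a single argument so that it should run unchanged on the Python 2 interpreter used in this series and on Python 3.

```python
def describeNumbers(numbers):
    # Initialize any local data.
    count = 0
    largest = None

    # The actual statements that do the work.
    for number in numbers:
        count += 1
        if largest is None or number > largest:
            largest = number

    # Optionally return any results.
    return "%d values, largest is %s" % (count, largest)


print(describeNumbers([3, 8, 5]))
```

Note how the signature line starts with def and ends with a colon, and how every statement in the body is indented by the same amount.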
At this point, you're probably convinced of the benefits of using methods. So, let's jump in and write a function. "Discover Python, Part 6: Programming in Python, For the fun of it" used a for loop to create a times table. Listing 2 shows the same concept, but in this case, you create a function that encapsulates the logic behind this calculation.
Listing 2. A first function
    >>> def timesTable():
    ...     for row in range(1, 6):
    ...         for col in range(1, 6):
    ...             print "%3d " % (row * col),
    ...         print
    ...
    >>> timesTable()
      1    2    3    4    5
      2    4    6    8   10
      3    6    9   12   15
      4    8   12   16   20
      5   10   15   20   25
    >>> t = timesTable
    >>> type(t)
    <type 'function'>
    >>> t
    <function timesTable at 0x64c30>
    >>> t()
      1    2    3    4    5
      2    4    6    8   10
      3    6    9   12   15
      4    8   12   16   20
      5   10   15   20   25
The timesTable function is easily defined, taking no input parameters and returning no results. The function body is almost identical to the statements shown in "Discover Python, Part 6" (the previous times table went from 1 to 10). To invoke, or call, the function and have it do its work, you enter the name of the function followed by parentheses. In this case, the times table is printed.
In Python, a function is a first-class object, like an integer variable or a container object. As a result, you can assign a function to a variable (remember that in Python, variables are dynamically typed). In Listing 2, you assign the timesTable function to the variable t. The next two statements demonstrate that the variable t really does point to a function. Finally, you invoke the timesTable function using the variable t.
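Because functions are ordinary objects, they can also be passed to other functions and stored in containers. The following is a small sketch of that idea; the names double, square, applyToAll, and operations are hypothetical and do not appear in the article's sample files.

```python
def double(value):
    return value * 2


def square(value):
    return value * value


def applyToAll(func, data):
    # func is an ordinary parameter that happens to hold a function.
    results = []
    for item in data:
        results.append(func(item))
    return results


# Functions can be stored in a dictionary like any other object.
operations = {"double": double, "square": square}

print(applyToAll(double, [1, 2, 3]))
print(applyToAll(operations["square"], [1, 2, 3]))
```

The first call should print [2, 4, 6] and the second [1, 4, 9]. In both cases the function is handed over by name, without parentheses, and only called inside applyToAll.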
The timesTable function shown in Listing 2 is neither complex nor particularly useful. A more useful example will let you specify the number of rows and the number of columns that should be included in the generated times table -- in other words, dynamically change the way the function operates when it's called. You can do this by using two input parameters in the function definition, as shown in Listing 3.
Listing 3. A better times-table function
    >>> def timesTable2(nrows=5, ncols=5):
    ...     for row in range(1, nrows + 1):
    ...         for cols in range(1, ncols + 1):
    ...             print "%3d " % (row * cols),
    ...         print
    ...
    >>> timesTable2(4, 6)
      1    2    3    4    5    6
      2    4    6    8   10   12
      3    6    9   12   15   18
      4    8   12   16   20   24
    >>> timesTable2()
      1    2    3    4    5
      2    4    6    8   10
      3    6    9   12   15
      4    8   12   16   20
      5   10   15   20   25
    >>> timesTable2(ncols=3)
      1    2    3
      2    4    6
      3    6    9
      4    8   12
      5   10   15
The definitions of the two times-table functions are remarkably close, but the new version is considerably more useful (as you can see from the three invocations in Listing 3). This additional power was simple to include in the function: You provide input parameters named nrows and ncols that allow the size of the times table to be changed when the function is invoked. These two parameters are then supplied to the two for loops that generate the times table.
One other important point about the timesTable2 function is the presence of default values for the two input parameters. You provide a default value for a parameter in the function signature by appending an equal sign and the value to the parameter name, like this: nrows=5. A default parameter provides greater flexibility to a program because you can include neither, one, or both of the input parameters when you call the function. However, this approach can cause problems. Arguments supplied without names are matched to parameters from left to right, so if you want to skip an earlier parameter and supply only a later one, you must explicitly name the parameter you're specifying so the Python interpreter can properly call the function. This is demonstrated in the last function invocation, which explicitly calls the timesTable2 function with ncols=3; the function creates a times table with five rows (the default value) and three columns (the supplied value).
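The different ways of calling a function with default parameters can be compared side by side. The sketch below uses a hypothetical variant, buildTable, that returns the table as a list of rows instead of printing it, so the results are easy to inspect. It is an illustration only, not part of the article's sample files.

```python
def buildTable(nrows=5, ncols=5):
    # Return the times table as a list of rows rather than printing it.
    table = []
    for row in range(1, nrows + 1):
        line = []
        for col in range(1, ncols + 1):
            line.append(row * col)
        table.append(line)
    return table


# Both arguments given by position: 2 rows, 3 columns.
print(buildTable(2, 3))

# One positional argument fills nrows; ncols falls back to its default of 5.
print(buildTable(2))

# Naming ncols skips nrows, which falls back to its default of 5.
print(buildTable(ncols=2))
```

The first call should print [[1, 2, 3], [2, 4, 6]]. The second produces two rows of five columns, and the third produces five rows of two columns.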
A times table isn't the most common need for a method. You'll probably want to perform a calculation and return the resulting value to the calling code. Some languages distinguish between these two cases, calling methods that don't return any data subroutines and methods that do return a value functions. However, in Python, you don't need to worry about these semantics because both are defined in an almost identical manner; a function that returns a value simply uses the return statement (see Listing 4).
Listing 4. Returning a value in a function
    >>> def stats(data):
    ...     sum = 0.0
    ...     for value in data:
    ...         sum += value
    ...     return (sum/len(data))
    ...
    >>> stats([1, 2, 3, 4, 5]) # Find the mean value from a list
    3.0
    >>> stats((1, 2, 3, 4, 5)) # Find the mean value from a tuple
    3.0
    >>> stats()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: stats() takes exactly 1 argument (0 given)
    >>> stats("12345")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 4, in stats
    TypeError: unsupported operand type(s) for +=: 'float' and 'str'
This simple function steps through the data (which is assumed to be a Python container holding numerical data), calculates the mean value of the set of data, then returns the value. The function definition takes one input parameter. The mean value is passed back via the return statement. The returned value is displayed to the screen when you call the function with a list or a tuple holding the numbers 1-5 inclusive. Calling the function with no parameters, with a noncontainer data type, or with a container that holds non-numeric data results in an error. (Raising an error in these situations makes sense. A more advanced treatment would include proper error checking and handling for these conditions, but that is beyond the scope of this article.)
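Without going into full error handling, one small piece of such checking might look like the following sketch. The function safeMean is hypothetical and is not part of the article's sample files. It guards only against an empty container, which would otherwise cause a division by zero. The except ... as form used here assumes Python 2.6 or later.

```python
def safeMean(data):
    # Hypothetical helper: reject empty input before dividing by its length.
    if len(data) == 0:
        raise ValueError("safeMean() needs at least one value")
    total = 0.0
    for value in data:
        total += value  # non-numeric items still raise TypeError here
    return total / len(data)


try:
    print(safeMean([]))
except ValueError as error:
    print("Could not compute mean: %s" % error)
```

The calling code decides what to do with the error; here it simply prints a message instead of letting the traceback reach the user.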
Although useful, this example can be more powerful, as shown in Listing 5. In Python, functions can return any valid object type, including a container type. Therefore, you can compute multiple quantities and easily return them to the calling statement.
Listing 5. Returning compound values
    >>> def stats(data):
    ...     sum = 0.0
    ...     for value in data:
    ...         sum += value
    ...     mean = sum/len(data)
    ...     sum = 0.0
    ...     for value in data:
    ...         sum += (value - mean)**2
    ...     variance = sum/(len(data) - 1)
    ...     return (mean, variance)
    ...
    >>> stats([1, 2, 3, 4, 5])
    (3.0, 2.5)
    >>> (m, v) = stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> print m, v
    5.0 7.5
To return multiple values from a function, enclose them in parentheses and separate them with commas -- in other words, create and return a tuple. The body of the new stats function is modified slightly to calculate the variance of the numerical sequence. Finally, as shown in the two invocations of the stats function, the tuple values can be accessed as a tuple or unpacked into their constituent parts.
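The two ways of consuming a returned tuple can be seen in a short sketch. The function minMax below is a hypothetical example, not taken from the article; it returns the smallest and largest items of a non-empty sequence together.

```python
def minMax(data):
    # Hypothetical example: return two results at once as a tuple.
    smallest = largest = data[0]
    for value in data[1:]:
        if value < smallest:
            smallest = value
        if value > largest:
            largest = value
    return (smallest, largest)


# Keep the result as a tuple and index into it.
result = minMax([4, 9, 2, 7])
print(result[0])

# Or unpack the tuple into separate variables in one step.
low, high = minMax([4, 9, 2, 7])
print(high - low)
```

Indexing is convenient when you need only one of the values; unpacking reads better when you need all of them.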
By now, you're probably sold on reusing code. But even with the use of functions, you still need to re-enter the body of the function whenever you want to use it. For example, when you open a new Python interpreter, you must type in all the functions previously created. Fortunately, you can use modules to encapsulate related functions (and other Python objects) together, save them to a file, and import these already-defined functions into your new Python code, including into a Python interpreter.
To demonstrate using modules in Python, you'll reuse the stats method, shown in Listing 5. You have two options: You can extract the file called test.py from the zip file associated with this article or you can type the function into an editor and save the file as test.py. Once you've done so, start a new Python interpreter in the directory where you saved test.py, and enter the Python statements shown in Listing 6.
Listing 6. Working with modules
    >>> import test
    >>> test.stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
    (5.0, 7.5)
    >>> from test import stats
    >>> (m, v) = stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> print m, v
    5.0 7.5
The first line, import test, opens the file test.py and processes each statement in that file. Here, this merely defines the stats function, but you could do more if you desired. When you call the stats function, you prefix it with the module name, which is test. The qualified name is necessary due to scope, which refers to the visibility of names in a program. Names defined in a module belong to that module's scope. To tell Python which stats method you want to call, you must provide the full name. This can be important because you may have multiple objects with the same name. Scope rules help Python figure out which object you want to use.
The third line, from test import stats, also processes the file test.py (or reuses the module if it has already been imported, as it has here), but it brings the stats method directly into the current file's scope, allowing you to call the stats function without using the module's name. Judicious use of the from ... import ... syntax can make your program more succinct, but overuse can cause confusion -- or, even worse, name conflicts in which an imported name silently replaces another object of the same name. Don't abuse your newfound power!
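A name conflict of this kind can be demonstrated with the standard library alone. In the sketch below, the built-in pow function and math.pow share a name but behave slightly differently: the built-in keeps integer results, while math.pow always returns a float. The variable name shadowed is only illustrative.

```python
import math

# Qualified access: the module prefix makes the origin of the name explicit.
print(math.pow(2, 3))

# The bare name still refers to the built-in pow at this point.
print(pow(2, 3))

# After this import, the bare name pow refers to math.pow instead.
from math import pow

shadowed = pow(2, 3)
print(shadowed)
```

The first and last calls return the float 8.0, while the middle call returns the integer 8. Nothing warns you that the meaning of pow changed partway through the file, which is why the qualified form is often the safer choice.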
A principal benefit of using the Python programming language is the large built-in standard library, accessible as Python modules. Examples of commonly used modules include:
- math contains useful mathematical functions.
- sys contains data and methods for interacting with the Python interpreter.
- array contains array datatypes and related functions.
- datetime contains useful date and time manipulation functions.
Because these are built-in modules, you can use the interpreter's built-in help function to learn more about them, as demonstrated in Listing 7.
Listing 7. Getting help for the math module
    >>> help(math)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'math' is not defined
    >>> import math # Need to import math module in order to use it
    >>> help(math)
    Help on module math:

    NAME
        math

    FILE
        /System/Library/Frameworks/Python.framework/Versions/2.4/lib/
        python2.4/lib-dynload/math.so

    DESCRIPTION
        This module is always available. It provides access to the
        mathematical functions defined by the C standard.

    FUNCTIONS
        acos(...)
            acos(x)

            Return the arc cosine (measured in radians) of x.

        asin(...)
            asin(x)

            Return the arc sine (measured in radians) of x.
    ...
The help output for the math module shows the wide range of mathematical functions supported, including the sqrt function. You can use it to turn your variance measurement into a standard deviation measurement, as shown in Listing 8.
Listing 8. Working with multiple modules
    >>> from math import sqrt
    >>> from test import stats
    >>> (m, v) = stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> print m, sqrt(v)
    5.0 2.73861278753
As you can see, you can import multiple modules into a Python program. Between the large, built-in module library and the even larger number of community libraries (many of which are open-source), you can quickly become a lazy -- that is, good -- programmer.
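As a further illustration of importing several modules at once, the following sketch touches each of the four standard library modules listed earlier. The variable names hypotenuse, samples, and elapsed are arbitrary, and the calls shown are only a tiny sample of what each module offers.

```python
import array
import datetime
import math
import sys

# math: mathematical functions.
hypotenuse = math.sqrt(3 ** 2 + 4 ** 2)
print(hypotenuse)

# array: compact, typed sequences ("d" means double-precision floats).
samples = array.array("d", [1.0, 2.5, 4.0])
print(len(samples))

# datetime: date arithmetic.
elapsed = datetime.date(2006, 1, 31) - datetime.date(2006, 1, 1)
print(elapsed.days)

# sys: information about the running interpreter (major version number).
print(sys.version_info[0])
```

The first three lines of output should be 5.0, 3, and 30; the last depends on which Python version you run.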
When a module is imported, the Python interpreter processes each line in the module file. In fact, you can invoke the Python interpreter to only process a Python program contained in a file. And on UNIX®-based operating systems, you can easily set up the file to be executable, as shown in Listing 9.
Listing 9. A complete Python program
    #!/usr/bin/env python

    def stats(data):
        sum = 0.0
        for value in data:
            sum += value
        mean = sum/len(data)

        sum = 0
        for value in data:
            sum += (value - mean)**2
        variance = sum/(len(data) - 1)
        return(mean, variance)

    (m, v) = stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
    print "The mean and variance of the values " \
    "from 1 to 9 inclusive are ",m, v
Looking at this example, you should get a good feeling for how simple it is to put a Python program in a file and have it run. The only difference between this example and the code in the test.py file is the inclusion of the first line. On UNIX-based operating systems, this line causes the Python interpreter to start up automatically and process the statements in the file before terminating. The rest of the lines in the example define the stats function, call the function, and print out the results.
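Because importing a module processes every statement in the file, importing a file written this way would also run the final call and print its message. A common Python idiom for files that serve as both a module and a program is to place the top-level statements under a check of the special __name__ variable. The sketch below applies that idiom to the same stats function; it is a variation for illustration, not the file shipped with the article, and it uses print(...) with a single argument so that it should run on both Python 2 and Python 3.

```python
#!/usr/bin/env python

def stats(data):
    # Return the mean and sample variance of a sequence of numbers.
    total = 0.0
    for value in data:
        total += value
    mean = total / len(data)

    total = 0.0
    for value in data:
        total += (value - mean) ** 2
    variance = total / (len(data) - 1)
    return (mean, variance)


# This block runs only when the file is executed as a program,
# not when it is imported from another module.
if __name__ == "__main__":
    (m, v) = stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
    print("The mean and variance of the values "
          "from 1 to 9 inclusive are %s %s" % (m, v))
```

With this arrangement, running the file directly prints the result, while import statements elsewhere get the stats function without any output.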
To run the statements in this file, you need to start a Python interpreter and tell it to read and process the contents of the file. To do so, you must either first enter the example in Listing 9 into a file named mystats.py or extract the file from the zip file associated with this article. Change to the directory containing this file, and then follow the commands shown in Listing 10. Note that for a Microsoft® Windows® operating system, you should only follow the first command; the others are for a UNIX operating system like Linux® or Mac OS X.
Listing 10. Executing a Python program
    rb% python mystats.py
    The mean and variance of the values from 1 to 9 inclusive are 5.0 7.5
    rb% chmod +x mystats.py
    rb% ./mystats.py
    The mean and variance of the values from 1 to 9 inclusive are 5.0 7.5
The commands in Listing 10 show how to run a Python program contained in a file. The first command, calling the Python interpreter with the name of the file, works on any operating system on which Python is installed and where the Python interpreter is in the PATH. The second command, chmod, makes the file containing the Python program executable. The third command tells the operating system to run the program. This works because the first line of the file names the env program, which searches the PATH for the Python interpreter and runs it, so the script does not depend on the interpreter being installed in one fixed location.
This article explained how to write reusable code in Python. It discussed how to use methods, or reusable blocks of code, in a Python program. Methods can take input parameters and also return data, including container datatypes. Together, this functionality makes using methods a powerful way to tackle a range of problems. The article also discussed modules, which let you group related methods and data together into an organized hierarchy that can be reused easily in other Python programs. Finally, you saw how to put it all together to create a fully functioning, stand-alone Python program. You have seen that reusing code means reducing your workload. And when it comes to programmers, being lazy can be a virtue, not a vice.
Sample Python files: os-python9-pyrjb929730.zip (663KB, HTTP download)
Find all of the articles in the developerWorks "Discover Python" series.
When you have a working Python interpreter, the Python tutorial is a great place to start learning the language.
Robert J. Brunner is a Research Scientist at the National Center for Supercomputing Applications and an Assistant Professor of Astronomy at the University of Illinois, Urbana-Champaign. He has published several books and a number of articles and tutorials on a range of topics. You can reach him at firstname.lastname@example.org.