Charming Python: Getting version 2.0

The new features of the latest Python version

David Mertz (mertz@gnosis.cx), Applied Metaphysician, Gnosis Software, Inc.
Since conceptions without intuitions are empty, and intuitions without conceptions, blind, David Mertz wants a cast sculpture of Milton for his office. Start planning for his birthday. David may be reached at mertz@gnosis.cx; his life pored over at http://gnosis.cx/dW/. Suggestions and recommendations on this, past, or future, columns are welcomed.

Summary:  Python programmers have recently acquired a shiny new toy with the release of version 2.0. Python 2.0 builds on the strengths of previous Python versions, while adding a number of new conveniences and capabilities. This article contains the author's impressions of Python's newest version, and some tips on using it effectively.

Date:  01 Feb 2001
Level:  Introductory


Released in October 2000, Python 2.0 introduces a number of new language features and includes some new standard modules. One of Guido van Rossum's virtues -- probably the one that best earns him the affectionate title "benevolent dictator for life (BDFL)" in the Python community -- is his conservatism in changing Python. He supports very few changes between Python versions, and what does change tends to be considered and discussed for months or years in advance. This makes for great backward and forward compatibility in Python, and for consistency in running Python programs across platforms and versions. That said, Python 2.0 represents a pretty large jump from the language definition of Python 1.5x. Fortunately, Python 2.0 still maintains great backward compatibility, and the changes that have been made are generally very "pythonic" in character.

By the way, it is worth noting that a short-lived Python 1.6 was released in September 2000. This release is a bit of a curiosity -- its existence derives from contractual obligations by the Python core development team, who were finding a new organizational home during the same period as the 1.6/2.0 development. For the most part Python 1.6 resembles Python 2.0, but if you are installing a new version, it is better to install Python 2.0.

Check the Resources for an exhaustive summary of changes. This article contains a subjective evaluation of what I find most important and interesting; some of the changes that interest you might not be addressed here.

List comprehensions and zip()

For me, probably the most exciting new feature of Python 2.0 is the addition of list comprehensions. (Any oddball readers with a math background may note that this capability is sometimes called "ZF-comprehension" in other functional languages, after the Axiom of Comprehension in Zermelo-Fraenkel set theory.)

Most readers will note an odd expression in the previous paragraph: "other functional languages." As a Python programmer you have been programming in a (mixed) functional language since Python 1.0. Of course, if you are not in the habit of using lambda expressions and the built-in functions map(), reduce(), and filter(), you have not been using these functional features. But even if you do use these, Python has always made it easy to avoid thinking about functional paradigms.

In any case, list comprehension is a way of doing much of what Python's functional built-ins do, but in a much more compact way that is simultaneously easier to read and understand. Let's start out with a simple example of list comprehensions in action:

Example of list comprehensions

>>> xs = (1,2,3,4,5)
>>> ys = (9,8,7,6,5)
>>> bigmuls = [(x,y) for x in xs for y in ys if x*y > 25]
>>> print bigmuls
[(3, 9), (4, 9), (4, 8), (4, 7), (5, 9), (5, 8), (5, 7), (5, 6)]

In the example, we created a list of tuples in which each tuple element is drawn from one of the source sequences and each tuple satisfies some property. Without the if clause, we would simply create the Cartesian product of the sequences, that is, every possible pairing (which is often useful in itself); with the if clause, we get a filter()-style pruning of the list. Multiple if clauses are allowed in one list comprehension, by the way.
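As a minimal sketch of the two points above (no if clause versus several), the following reuses xs and ys from the example. The name odd_bigmuls and the extra "x is odd" condition are made up purely for illustration. The code is written so that it also runs under a current Python 3 interpreter.

```python
xs = (1, 2, 3, 4, 5)
ys = (9, 8, 7, 6, 5)

# No if clause: every pairing of an x with a y (5 * 5 = 25 tuples).
all_pairs = [(x, y) for x in xs for y in ys]

# Two if clauses: a tuple is kept only when both conditions hold.
# The second condition (x is odd) is a hypothetical extra filter.
odd_bigmuls = [(x, y) for x in xs for y in ys if x*y > 25 if x % 2 == 1]

print(len(all_pairs))
print(odd_bigmuls)
```

Chaining if clauses behaves like joining the conditions with "and"; which spelling reads better is a matter of taste.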

There is nothing fundamentally new in list comprehension capability; certainly the same effect could be achieved in Python 1.5x, but less clearly. For example, the following more verbose (and less clear) techniques can do the same thing:

Comparison of techniques in version 1.5x

>>> # Functional-style spaghetti for list comprehension
>>> filter(lambda (x,y): x*y > 25,
.....        map(None, xs*len(ys),
.....                  reduce(lambda s,t: s+t,
.....                         map(lambda y: [y]*len(xs), ys))))
[(3, 9), (4, 9), (5, 9), (4, 8), (5, 8), (4, 7), (5, 7), (5, 6)]

>>> # Nested loop procedural style for list comprehension
>>> bigmuls = []
>>> for x in xs:
.....     for y in ys:
.....         if x*y > 25:
.....             bigmuls.append((x,y))
>>> print bigmuls
[(3, 9), (4, 9), (4, 8), (4, 7), (5, 9), (5, 8), (5, 7), (5, 6)]

In the example I have given, the nested procedural loops are clearer than the functional-style calls (perhaps readers will notice a better functional approach). But both are far less clear than the list comprehension style.

With some programmer practice, list comprehensions can substitute for most uses of functional-style built-ins and also for many nested loops.

One new built-in function in Python 2.0 is particularly useful in conjunction with list comprehensions. You can think of what zip() does by imagining the teeth of a zipper: two or more sequences are combined into a list of tuples (with each tuple having one element from each argument sequence). This is often useful if you do not want a list comprehension that uses every combination of elements from the lists, but merely one that uses corresponding elements of multiple lists. For example:

The zip() function

>>> zip(xs,ys)
[(1, 9), (2, 8), (3, 7), (4, 6), (5, 5)]
>>> [(x,y) for (x,y) in zip(xs, ys) if x*y > 20]
[(3, 7), (4, 6), (5, 5)]
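One detail the listing does not show is what happens when the sequences differ in length: zip() stops at the shortest one. The sketch below illustrates this with a third, shorter sequence (labels is a made-up name). It is written for a current Python 3 interpreter, where zip() returns an iterator, so list() is wrapped around it; in Python 2.0 zip() returns the list directly.

```python
xs = (1, 2, 3, 4, 5)
ys = (9, 8, 7, 6, 5)
labels = ("a", "b", "c")   # hypothetical third sequence, deliberately shorter

# Three sequences zipped together; the result is as long as the shortest input.
triples = list(zip(xs, ys, labels))
print(triples)

# Corresponding elements only, combined with a condition.
products = [x*y for (x, y) in zip(xs, ys) if x*y > 20]
print(products)
```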


Unicode support

Another big addition for Python 2.0 is Unicode support. If you need to use multinational character sets in your programs, this capability is absolutely essential. Of course, if, like me, you have not had any specific requirement for characters outside ASCII, the Unicode support does not really matter. Fortunately, the implementation of Unicode in Python 2.0 is extremely well designed, and does not get in the way of anything else.

Unicode strings may be represented in several ways. For escaped characters, the sequence "\uHHHH" can be used, where HHHH is a four-digit hexadecimal number. Longer strings can be entered using the new Unicode quoting syntax: u"string". This is very similar in style to the r"string" quoting style used for composing regular expressions without resolving escape codes at the Python level (because regular expressions use some of the same escape codes). Of course, to use the Unicode quoting syntax, you need to have a text editor capable of entering Unicode characters between the quotes.

Conversion between 8-bit strings and Unicode strings -- and also between different Unicode encodings -- is performed using the new codecs module.
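To make the conversion concrete, here is a small sketch using the codecs module. It is written for a current Python 3 interpreter, where every string literal is already Unicode (so the u prefix is optional) and 8-bit data is a separate bytes type. The sample word and variable names are illustrative only.

```python
import codecs

# "\u00e9" is the escape for e-acute (LATIN SMALL LETTER E WITH ACUTE).
text = "caf\u00e9"

# The same four characters yield different byte strings in different encodings.
utf8_bytes = codecs.encode(text, "utf-8")      # e-acute takes two bytes
latin1_bytes = codecs.encode(text, "latin-1")  # e-acute takes one byte

# Decoding with the matching encoding recovers the original string.
round_trip = codecs.decode(utf8_bytes, "utf-8")

print(len(text), len(utf8_bytes), len(latin1_bytes))
print(round_trip == text)
```

The lengths differ because an encoding decides how many bytes each character occupies; the Unicode string itself is counted in characters.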


Function/method call syntax

Another nice syntax enhancement was made to function calls. It is now possible to call functions directly with a tuple of arguments and/or a dictionary of keyword arguments. As with list comprehensions, no fundamentally new capability is added, but the expression of some common chores is clearer and more concise. Methods in Python, of course, are just functions that are bound to class instances, so everything works the same for functions and methods.

Python programmers will be familiar with the previous syntax for defining extra positional and keyword arguments within a function definition. For example, we might have:

Defining extra positional and keyword arguments

>>> def myfunc(this, that, *extras, **keywords):
....     print "Required arguments:", this, that
....     print "Extra arguments:",
....     for arg in extras: print arg,
....     print "\nDictionary arguments:"
....     for key,val in keywords.items(): print "**", key, "=", val
....
>>> myfunc(1)
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
TypeError: not enough arguments; expected 2, got 1
>>> myfunc(1,2)
Required arguments: 1 2
Extra arguments:
Dictionary arguments:
>>> myfunc(1,2,3,4,5)
Required arguments: 1 2
Extra arguments: 3 4 5
Dictionary arguments:
>>> myfunc(1,2,3, spam=17, eggs=32)
Required arguments: 1 2
Extra arguments: 3
Dictionary arguments:
** spam = 17
** eggs = 32

Python 2.0 adds the same convention for function calls as is used for function definitions. For example:

Convention for function calls

>>> argdict = {'spam':'tasty', 'eggs':'over easy'}
>>> arglist = [1,2,3,4,5]
>>> myfunc(*arglist, **argdict)
Required arguments: 1 2
Extra arguments: 3 4 5
Dictionary arguments:
** spam = tasty
** eggs = over easy

Achieving the same effect (passing arguments via named lists, perhaps ones created dynamically at runtime) was always possible in Python. But the new calling syntax is more convenient than the old use of the apply() function.
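For comparison, the sketch below puts the two spellings side by side. The function is a variant of myfunc that returns its arguments instead of printing them, so the result can be inspected. The old apply() form is shown only in a comment, because apply() no longer exists in Python 3, which this sketch targets.

```python
def myfunc(this, that, *extras, **keywords):
    # Return the arguments instead of printing them, for easy inspection.
    return (this, that), extras, keywords

argdict = {'spam': 'tasty', 'eggs': 'over easy'}
arglist = [1, 2, 3, 4, 5]

# Python 1.5x spelling:  apply(myfunc, arglist, argdict)
# Python 2.0 spelling:
result = myfunc(*arglist, **argdict)
print(result)

# Explicit arguments can be mixed with unpacked ones.
mixed = myfunc(0, *arglist)
print(mixed)
```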


Augmented assignment

Python now has a shortcut in assignments that will be familiar to programmers of C, Perl, Awk, Java, and a variety of other languages. It is now possible to put an operator immediately before the equal sign to change the assigned value of a variable based on its old value. For example:

New shortcut in assignments

>>> i = 1
>>> i += 1 ; i
2
>>> i *= 3 ; i
6
>>> i /= 2 ; i
3
>>> str = "Spam and eggs"
>>> str += "...and sausage and spam and bacon" ; str
'Spam and eggs...and sausage and spam and bacon'

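Semantically, for immutable values such as numbers and strings, an augmented assignment does the same thing as a plain assignment that repeats the left-side variable on the right side, followed by the corresponding operator and the second operand; i += 1 behaves like i = i + 1. So in that sense, this is just syntactic sugar. (Mutable objects such as lists may instead be updated in place.)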

Notice, however, that the augmented assignments are actually an improvement in terms of performance. I have not benchmarked it myself, but discussion suggests that using an augmented assignment saves a lookup and some object allocation. For numbers, this is insignificant; but if you happen to be working with multi-megabyte strings, use of augmented assignment can speed things up and reduce memory usage.
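One caveat is worth checking for yourself: with mutable objects such as lists, an augmented assignment can update the existing object in place rather than build a new one, which matters when another name refers to the same object. The variable names in this sketch are illustrative, and the code also runs under a current Python 3 interpreter.

```python
# Augmented assignment on a list modifies the existing object.
a = [1, 2]
alias_a = a          # second name for the same list
a += [3]
print(alias_a)       # the alias sees the change

# Plain assignment with + builds a new list and rebinds the name.
b = [1, 2]
alias_b = b
b = b + [3]
print(alias_b)       # the alias still refers to the old list
```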


Garbage collection

Python's memory management is probably a pretty arcane issue for most day-to-day Python programmers. Traditionally, Python has used a reference-counting scheme to delete objects when they are no longer accessible from any name. However, a reference-counting methodology is theoretically prone to leaking memory if cyclic references are used in a program. For example, this code will break the reference counting:

Cyclical references in Python

>>> class MyClass: pass
....
>>> myobject = MyClass()
>>> myobject.me = myobject
>>> del myobject

At this point, it is impossible to access myobject, but it will not have been deleted, since the reference count was incremented twice, but only decremented once.

As bad as this might sound, most programmers will never experience any actual problems due to code like the above. In most cases, cyclical references will not be used in the first place, and even if they are, most times the memory leakage will be small (you can easily construct artificial cases of dangerous behavior; for example, add a myobject.big='#'*10**6 to the above example).

In any case, Python 2.0 adds a compile-time option for cycle-detecting garbage collection (GC). Most distributions of Python 2.0 seem to be compiled using this option; but if you need to, you can compile your own Python version that turns off the garbage collection option. In either case, reference counting is still used; it is just a question of whether leaks like the above are cleaned up.
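The effect can be observed from inside a program. The following is a sketch for a current Python 3 interpreter, under two assumptions that go beyond the article: it uses the gc module to trigger a collection by hand, and a weak reference (the weakref module arrived after Python 2.0) as a probe that does not itself keep the object alive.

```python
import gc
import weakref

class MyClass:
    pass

gc.disable()                   # no automatic collections while we look

myobject = MyClass()
myobject.me = myobject         # the object now refers to itself
probe = weakref.ref(myobject)  # weak reference: does not keep the object alive

del myobject                   # the name is gone, but the cycle remains
alive_before = probe() is not None

gc.collect()                   # ask the cycle collector to run now
alive_after = probe() is not None

gc.enable()
print(alive_before, alive_after)
```

With reference counting alone, the object would stay in memory after the del; the explicit collection is what reclaims it.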

On some platforms, like embedded systems, GC may be undesirable. Garbage collection takes some CPU cycles (not a lot, but some). Perhaps more importantly, reference counting is deterministic in program behavior, while garbage collection is not. That is, you never know for sure when a garbage collection will eat a few CPU cycles; therefore, using the GC version of Python will cause the identical program to behave differently (in terms of timings) from run to run.

These issues are fascinating theoretically, but most programmers should just ignore them from here on out. Whatever Python distribution you pick up will almost certainly do the right thing for the platform you are using; unless you know enough to know exactly why you want to enable or disable GC, I recommend not worrying about it.


Print direction

As good a job as van Rossum and the rest of the team have done with Python 2.0, they also introduced one wart in Python. It does something moderately useful, but in my opinion (and also in the opinion of many other Python programmers), it introduces a brand new (and ugly) syntactic feature where none is needed. Most programmers suspect this imperfection is merely a ruse, however, to make the simplicity and beauty of the rest of Python shine even more brightly.

The print statement performs a bit of magic that the .write() method of file objects does not (and sys.stdout is just another file object that print writes to). The print statement allows multiple arguments, each of any Python type. A trailing comma suppresses the newline, so that the next print statement continues on the same line, while by default each print statement writes its output on its own line. Overall, print is just a handy way to get some information from a program to the console.

A lot of Python programmers have wanted that same print mojo to be available for writing to other file objects (such as sys.stderr, regular files, or "file-like" objects that various modules provide). I think the right way to do this would be to add a .print() method to file objects and do the magic there. But Python 2.0 adds this capability by adding the "redirection operator" >> to the print statement. For example:

The redirection operator in the print statement

>>> import sys
>>> print >> sys.stderr, "spam", [1,2,3], 45.2
spam [1, 2, 3] 45.2

This works -- and it adds a nice capability -- but it nudges Python just a hair closer to the "executable line-noise" feel of certain other programming languages.
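To spell out the "magic" that print adds over .write(), the sketch below sends the same output to an in-memory file-like object both ways. It targets a current Python 3 interpreter, where print is a function and the redirection is written with a file= keyword argument rather than >>; the helper name report is made up for illustration.

```python
import io
import sys

def report(stream):
    # Python 3 spelling of:  print >> stream, "spam", [1,2,3], 45.2
    print("spam", [1, 2, 3], 45.2, file=stream)

# Any object with a .write() method will do; here, an in-memory buffer.
buffer = io.StringIO()
report(buffer)
captured = buffer.getvalue()

# The same output via .write(): convert each item, join with spaces, add a newline.
manual = io.StringIO()
manual.write(" ".join([str(item) for item in ("spam", [1, 2, 3], 45.2)]) + "\n")

print(captured == manual.getvalue())
report(sys.stderr)
```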


Resources

