The first installment in this series covers sequences and comparisons. This installment builds on those topics.
In most object-oriented languages, methods and attributes are almost, but not quite, the same thing. Both can be attached to a class and/or to an instance. Aside from the details of implementation, there is a key difference: attached to an object, methods are things you can call to initiate actions and calculations; attributes simply have values that can be retrieved (and perhaps modified).
For some languages (the Java™ language, for example), this distinction
is pretty much the end of the story. Attributes are attributes, methods
are methods. The Java language, by convention, puts a heavy emphasis on encapsulation and data
hiding; thus encouraging the use of "setters" and "getters" to
access otherwise private attribute data. To the Java way of thinking, using
explicit method calls covers in advance the case where you might want to add
computation or side-effects to data access or modification. Of course, the result
of the Java attitude is verbosity, and the imposition of a sometimes
artificial-seeming discipline: you wind up writing
foo.getBar() instead of
foo.bar, and
foo.setBar(value) instead of
foo.bar=value.
Ruby is worth mentioning as something of an odd creature in this regard. Ruby
actually insists on hiding data to an even greater degree than Java does:
all attributes are always "private"; you can never directly
access instance data. At the same time, Ruby uses some syntax conventions that
make method calls look like attribute access looks in other languages. The first
element of this is Ruby's optional parentheses in method calls; the second part is
its use of semi-special naming of methods with symbols that are operators in most
languages. So in Ruby, foo.bar is just a shorter option
for calling foo.bar(); and "setting"
foo.bar=value is shorthand for
foo.bar=(value). Behind the scenes, everything
goes through a method call.
Python is much more flexible than either Java or Ruby, but that fact is as much
a criticism as it is praise. If you access foo.bar, or
set foo.bar=value in Python, you might be using a
simple data value, or you might be calling some semi-hidden code. Moreover, in the
latter case, there are at least a half-dozen different ways you might reach that
block of code, each one having slightly different behavior from the others, with
dizzying subtleties and nuances. This deluge of options harms the regularity of
Python and makes it harder to understand for non-experts (and even for experts). I
know why things have gotten where they are: new capabilities have been added to
Python's object-oriented underpinnings in several steps. But I am not terribly
pleased by the chaos.
Since the old days (before Python 2.1), Python has had a magic method called
.__getattr__() that a class could define to return
computed values rather than simple data accesses. Correspondingly, the magic
methods .__setattr__() and
.__delattr__() could cause code to run when
"attributes" were set or deleted. The problem with this old system was that you
never really knew whether or not the code would actually be called, since it
depended on whether an attribute with the same name as the one accessed existed in
obj.__dict__. You could try to create
.__setattr__() and
.__delattr__() methods that controlled what wound up
there, but even that did not prevent direct manipulation of
obj.__dict__ by other code. Both changed inheritance
trees, and passing objects to external functions could often make it non-obvious
whether a method would or would not actually run when working with an object. For
example:
Listing 1. Will the method run?
>>> class Foo(object):
... def __getattr__(self, name):
... return "Value of %s" % name
>>> foo = Foo()
>>> foo.just_this = "Some value"
>>> foo.just_this
'Some value'
>>> foo.something_else
'Value of something_else'
|
Accessing foo.just_this skips the method code, while
accessing foo.something_else runs it; other than the
fact that this shell session is short, nothing makes the difference obvious. In
fact, asking the obvious hasattr() question gives you a
misleading answer:
Listing 2. Ambiguity using hasattr()
>>> hasattr(foo,'never_mentioned')
True
>>> foo2.__dict__.has_key('never_mentioned') # this works
False
>>> foo2.__dict__.has_key('just_this')
True
|
With Python 2.2, we gained a new mechanism for creating "restricted" classes.
Exactly what the new-style class _slots_ attribute is
supposed to be used for is nowhere made clear. For the most part, the Python
documentation advises to use .__slots__ only for
performance optimization in classes that might have a very large number of
instances -- but specifically not as a way to declare attributes.
Nonetheless, the latter is what slots do: they create a class without a
.__dict__ attribute that only has the attributes
explicitly named (methods, however, are still declared as normal within the class
body). It is peculiar, but this gives you a way to ensure that method code is
called on attribute access:
Listing 3. Ensuring method execution using .__slots__
>>> class Foo2(object):
... __slots__ = ('just_this')
... def __getattr__(self, name):
... return "Value of %s" % name
>>> foo2 = Foo2()
>>> foo2.just_this = "I'm slotted"
>>> foo2.just_this
"I'm slotted"
>>> foo2.something_else = "I'm not slotted"
AttributeError: 'Foo' object has no attribute 'something_else'
>>> foo2.something_else
'Value of something_else'
|
The declaration of .__slots__ guarantees that only
those attributes you specify can be accessed directly; everything else will go
through the .__getattr__() call. If you have also
created a .__setattr__() method, you can make an
assignment do something other than raise an
AttributeError (but be sure to let the "slotted" value
pass through on assignment). For example:
Listing 4. Using .__setattr__ along with .__slots__
>>> class Foo3(object):
... __slots__ = ('x')
... def __setattr__(self, name, val):
... if name in Foo.__slots__:
... object.__setattr__(self, name, val)
... def __getattr__(self, name):
... return "Value of %s" % name
...
>>> foo3 = Foo3()
>>> foo3.x
'Value of x'
>>> foo3.x = 'x'
>>> foo3.x
'x'
>>> foo3.y
'Value of y'
>>> foo3.y = 'y' # Doesn't do anything, but doesn't raise exception
>>> foo3.y
'Value of y'
|
The .__getattribute__()
method
In Python 2.2 and later, you have the option of using the method
.__getattribute__() instead of the similarly and
confusingly named, old-style .__getattr__(). Well, you
have that option if you use new-style classes (which you generally should anyway).
The .__getattribute__() method is more powerful than
its sibling in that it intercepts all attribute access, whether or not an
attribute is defined in obj.__dict__ or
obj.__slots__. A drawback of using
.__getattribute__() is that all access goes though the
method. If you use this method, a bit of special programming is needed if you want
to return (or manipulate) the "real" value of the attribute: usually you do this
by calling .__getattribute__() on the superclass
(usually object). For example:
Listing 5. Returning a "real" .__getattribute__ value
>>> class Foo4(object):
... def __getattribute__(self, name):
... try:
... return object.__getattribute__(self, name)
... except:
... return "Value of %s" % name
...
>>> foo4 = Foo4()
>>> foo4.x = 'x'
>>> foo4.x
'x'
>>> foo4.y
'Value of y'
|
In all versions of Python, .__setattr__() and
.__delattr__() also intercept all the write and delete
access to attributes, rather than merely those absent from
obj.__dict__.
We are moving along nicely in an enumeration of ways to make attributes act like methods. Within these magic methods, you can examine the specific attribute name being accessed, assigned, or deleted. In fact, if you like, you can check names by regular expression or by other computation. In principle, you can make all sorts of runtime decisions about how to handle the use of some given pseudo-attribute. For example, perhaps you do not simply want to compare the attribute name to a string pattern, but actually look up whether the name is an attribute that has been stored in a persistent database.
Much of the time, however, you would just like for a few "attributes" to act in a special manner but let other attributes operate as plain attributes. The plain attributes should neither trigger any special code nor suffer the time penalty of working through method code. In these cases, you can use descriptors for attributes. Or, you can define properties, closely related to descriptors. Behind the scenes, properties and descriptors amount to the same thing, but the syntax of defining them is rather different. And with the difference in definition styles, as you might have guessed, you get advantages and disadvantages.
Let's first look at a descriptor. The idea here is to assign an instance of a
special kind of class to an attribute within another class. This special
"descriptor" class is a new-style class that contains methods called
.__get__(), .__set__(), and
__delete__() (or at least a subset of those). If the
descriptor class implements at least the first two, it is called a "data
descriptor"; if it implements only the first, it is called a "non-data
descriptor."
The non-data descriptor is likely to be used to return a callable object. A non-data descriptor is, in a sense, often a fancy name for a method -- but the particular method returned by descriptor access could be determined at runtime. This starts to get into the scary realm of things similar to metaclasses and decorators, which I have written about before in this column (see Resources for links). Of course, a regular method can also decide what code to run based on runtime conditions, so there is nothing fundamentally new about this concept of runtime determination of what a "method" does.
In any case, a data descriptor is more general, so I'll show you one. Such a descriptor could return something callable -- any Python function or method can return anything, after all. But our example just deals with simple values (and side effects). We want to make any use of a few attributes log the action to STDERR:
Listing 6. An example data descriptor
>>> class ErrWriter(object):
... def __get__(self, obj, type=None):
... print >> sys.stderr, "get", self, obj, type
... return self.data
... def __set__(self, obj, value):
... print >> sys.stderr, "set", self, obj, value
... self.data = value
... def __delete__(self, obj):
... print >> sys.stderr, "delete", self, obj
... del self.data
>>> class Foo(object):
... this = ErrWriter()
... that = ErrWriter()
... other = 4
>>> foo = Foo()
>>> foo.this = 5
set <__main__.ErrWriter object at 0x5cec90>
<__main__.Foo object at 0x5cebf0> 5
>>> print foo.this
get <__main__.ErrWriter object at 0x5cec90>
<__main__.Foo object at 0x5cebf0> <class '__main__.Foo'>
5
>>> print foo.other
4
>>> foo.other = 6
>>> print foo.other
6
|
The class Foo defines this
and that as descriptors of the
ErrWriter class. The attribute
other is just a plain class attribute. Actually, there
is a caveat here. On first access to foo.other, we read
the class attribute; after we assign a value, we actually read the instance
attribute instead. The class attribute is still there, just hidden, i.e.:
Listing 7. The class attribute vs. the instance attribute
>>> foo.other
6
>>> foo.__class__.other
4
|
In contrast, the descriptor remains a class-level object, even though you can access it through the instance. This has the usually undesirable effect of making the descriptor something like a singleton. For example:
Listing 8. The descriptor as a singleton
>>> foo2 = Foo()
>>> foo2.this
get <__main__.ErrWriter object at 0x5cec90>
<__main__.Foo object at 0x5cebf0> <class '__main__.Foo'>
5
|
To simulate a usual "per instance" behavior, you would need to make use of the
obj passed into ErrWriter
magic methods. This obj is the instance that has the
descriptor. So you might define a non-singleton descriptor like:
Listing 9. Defining a non-singleton descriptor
class ErrWriter(object):
def __init__(self):
self.inst = {}
def __get__(self, obj, type=None):
return self.inst[obj]
def __set__(self, obj, value):
self.inst[obj] = value
def __delete__(self, obj):
del self.inst[obj]
|
Properties work like descriptors, but are generally defined inside a particular
class rather than being created as "utility descriptors" that various classes
might utilize. As with "regular" descriptors, the idea is to define "getters,"
"setters," and "deleters". After that, you use the special function
property() to turn those methods into a descriptor. For
those readers who are paying a bit closer attention:
property is not really a function but a type -- don't
worry about it.
Oddly, properties bring us full circle to the brief description I gave at top about how the Ruby programming languge works. A property is just a thing that looks like an attribute syntactically (as used), but is defined by defining all the getters, setters, and so on. If you wanted to, you could impose complete "Ruby-discipline" in Python, and never access "real" attributes. More likely, you will want to mix-and-match though. Here's how a property works:
Listing 10. How properties work
class FooP(object):
def getX(self): return self.__x
def setX(self, value): self.__x = value
def delX(self): del self.__x
x = property(getX, setX, delX, "I'm the 'x' property.")
|
The names of the getter, setter, and deleter are nothing reserved. Usually you will want to use sensible names like the above. What they actually do can be anything, but often it is reasonable to use double-underscore versions of names for the attributes. These attributes get attached to the instance, just with the usual Python name mangling for "semi-hidden" attributes. Moreover, the methods remain perfectly usable too:
Listing 11. Using methods
>>> foop = FooP()
>>> foop.x = 'FooP x'
>>> foop.getX()
'FooP x'
>>> foop._FooP__x
'FooP x'
>>> foop.x
'FooP x'
|
In this installment I have shown far too many ways to make Python instance attributes act like (or be) method calls, but I really do not have any clear advice on how to cut through the complexity. I'd like to be able to tell you simply to choose one of the described techniques and ignore the others as inferior or less general. Unfortunately, each technique has both strengths and weaknesses. Each is pretty reasonable for certain programming contexts, even though the syntax and semantics of each is so radically different from the others.
Moreover, while I have not described it in this article, I have thought (vaguely) of a number of other even more obscure ways that a programmer might use metaclasses, class factories, and decorators to obtain similar effects to the "standard" half-dozen techniques I have outlined. Those ideas would truly probe into some dark corners of Python metaprogramming.
It would be nice if all the things I described were possible, but the variations among them were simply parameterized in some straightforward way rather than using wholly different syntax and organization. The grand goal of Python 3000 is a simplification along these lines; but I have not seen any concrete proposals on how such unification and simplification of attributes-as-methods might work. One idea that occurs to me is that Python might enable decorators for classes (along with the current use in methods and functions), and also provide some standard module of decorators for the most common "magic attribute" behaviors. This is speculation, and I do not know exactly how it might work, but I can just imagine such a thing could hide the complexity from the 95% of Python programmers who really do not wish to worry too much about Python internals and cryptic mojo.
Learn
- Read
first installment
in this "Python elegance and warts" series, which covers sequences and comparisons.
- Raymond Hettinger's
"How-To Guide for Descriptors"
is an excellent, albeit fairly dense, description of the low-down on exactly how
descriptors work.
- The
Python PEP 3000describes
the goals for Python 3.0 (also called "Python 3000"). The main purpose of this
future version is removal of redundancy in programming techniques. Nothing is
currently detailed about reducing the chaos about attributes-as-methods, but a
programmer can dream.
- Read
David's
Metaclass
programming in Python series for a primer on metaclass programming.
- "Charming Python:
Decorators make magic easy" (developerWorks, Dec 2006)
introduces decorators, which offer an easier way to perform most metaprogramming
in Python.
- Read more of David's
Charming
Python articles
on developerWorks.
- In the
developerWorks Linux zone,
find more resources for Linux developers, including
Linux tutorials,
as well as
our readers' favorite Linux articles and tutorials
over the last month.
- Stay current with
developerWorks technical events and Webcasts.
Get products and technologies
-
Order the SEK for Linux,
a two-DVD set containing the latest IBM trial software for Linux from DB2®,
Lotus®, Rational®, Tivoli®, and WebSphere®.
- With
IBM trial software,
available for download directly from developerWorks, build your next development
project on Linux.
Discuss
- Get involved in the
developerWorks community
through our developer blogs, forums, podcasts, and community topics in our new
developerWorks spaces.

David Mertz has been writing the developerWorks columns Charming Python and XML Matters since 2000. Check out his book Text Processing in Python . For more on David, see his personal Web page.
Comments (Undergoing maintenance)





