Charming Python: Multiple dispatch

Generalizing polymorphism with multimethods

Object-oriented programming gains much of its versatility through polymorphism: objects of different kinds can behave in similar ways, given the right contexts. But most OO programming is single dispatch; that is, just one designated object determines which code path is taken. Conceptually, a more general technique is to allow all the arguments to a function/method to determine its specialization. This article presents an implementation of multiple dispatch in Python, and shows examples where this makes for better programs.


David Mertz, Developer, Gnosis Software, Inc.

David Mertz feels that programmers with multiple personality syndrome will want all of their functions to be generic. You can reach David at mertz@gnosis.cx; you can investigate all aspects of his life at his personal Web page. Suggestions and recommendations on this, past, or future columns are welcome.



20 March 2003


What is polymorphism?

Most programmers -- in Python or other object-oriented programming languages -- who use polymorphism do so in a rather practical and concrete way. Perhaps the most common application of polymorphism is in creating a family of objects that follow a common protocol. In Python, this usually is simply a matter of ad-hoc polymorphism; in other languages, formal interfaces are more often declared and/or these families share a common ancestor.

For example, there are many functions that operate on "file-like" objects, where file-like is defined simply by supporting a few methods like .read(), .readlines(), and maybe .seek(). A function like read_app_data() might take an argument src -- when we call the function, we might decide to pass it a local file, a urllib object, a cStringIO object, or some custom object that lets the function call src.read(). Each object type is interchangeable from the point of view of how it functions within read_app_data().
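To make that concrete, here is a minimal sketch; read_app_data() and CustomSource are hypothetical names used purely for illustration:

import cStringIO

# A consumer that relies only on the .read() protocol
def read_app_data(src):
    return src.read().strip()

class CustomSource:
    # Any object with a .read() method is "file-like" enough
    def read(self):
        return "data from a custom object\n"

print read_app_data(open('app.conf'))                        # a local file (if it exists)
print read_app_data(cStringIO.StringIO("in-memory data\n"))  # an in-memory buffer
print read_app_data(CustomSource())                          # any custom object with .read()
# a urllib.urlopen(...) object would work in exactly the same way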

Let's step back a bit to think about what is really happening here. At heart, we are concerned with choosing the right code path to execute within a given context; old-fashioned procedural code can make equivalent decisions, and OOP merely adds some elegance. For example, a fragment of procedural (pseudo-)code might look like this:

Listing 1. Procedural choice of code paths on object type
...bind 'src' in some manner...
if <<src is a file object>>:
    read_from_file(src)
elif <<src is a urllib object>>:
    read_from_url(src)
elif <<src is a stringio object>>:
    read_from_stringio(src)
...etc...

By arranging for objects of different types to implement common methods, we move the dispatch decision into the objects and out of an explicit conditional block. A given src object knows what block of code it needs to call by looking through its inheritance tree. There is still an implicit switch going on, but it is on the type of the object src.

The object src is privileged over any arguments passed to its methods. OOP syntax makes this privileging seem inevitable, but it is not really. Procedural switching is simply pushed into the method bodies of classes in many cases. For example, we might implement protocol-compatible classes Foo and Bar as follows (in pseudo-Python):

Listing 2. Foo and Bar implement the .meth() method
class Foo:
    def meth(self, arg):
        if <<arg is a Foo>>:
            ...FooFoo code block...
        elif <<arg is a Bar>>:
            ...FooBar code block...
class Bar:
    def meth(self, arg):
        if <<arg is a Foo>>:
            ...BarFoo code block...
        elif <<arg is a Bar>>:
            ...BarBar code block...
# Function to utilize Foo/Bar single-dispatch polymorphism
def x_with_y(x, y):
    if <<x is Foo or Bar>> and <<y is Foo or Bar>>:
        x.meth(y)
    else:
        raise TypeError,"x, y must be either Foo's or Bar's"

There are five distinct code paths/blocks that might get executed when x_with_y() is called. If the types of x and y are not suitable, an exception is raised (you could also do something different, of course). But assuming the types are okay, the code path is chosen first by a polymorphic dispatch, and second by procedural switch. Moreover, the switches within the definitions of Foo.meth() and Bar.meth() are largely equivalent. Polymorphism -- of the single-dispatch variety -- only goes half way.


Completing polymorphism

In single-dispatch polymorphism, the object that "owns" a method is singled out. Syntactically, it is singled out in Python by being named before the dot -- everything following the dot, method name, and left parenthesis is just an argument. But semantically, the object is also special in utilizing an inheritance tree for method resolution.

What if we did not treat just one object in a special fashion, but allowed every object involved in a code block to help choose the correct code path? For example, we might express our five-way switch in a more symmetric fashion:

Listing 3. Multiple dispatch on Foo and Bar
x_with_y = Dispatch([((object, object), <<exception block>>)])
x_with_y.add_rule((Foo,Foo), <<FooFoo block>>)
x_with_y.add_rule((Foo,Bar), <<FooBar block>>)
x_with_y.add_rule((Bar,Foo), <<BarFoo block>>)
x_with_y.add_rule((Bar,Bar), <<BarBar block>>)
#...call the function x_with_y() using some arguments...
x_with_y(something, otherthing)

I think this symmetry in polymorphic dispatch on multiple arguments is much more elegant than is the prior style. Also, the style helps document the equal role of the two objects involved in determining the appropriate code path to take.

Standard Python does not let you configure this type of multiple dispatch; but fortunately, you can do so using the module multimethods that I have written. See Resources to download the module by itself or as part of Gnosis Utilities. Once you have installed multimethods, just include the following line at the top of your application:

from multimethods import Dispatch

"Multimethods" is generally a synonym for multiple dispatch; but the name multimethod suggests the concrete function/object handling the more abstract concept multiple dispatch.

An instance of Dispatch is a callable object and can be configured with as many rules as you wish. The method Dispatch.remove_rule() can be used to delete rules as well, which makes multiple dispatch using multimethods a bit more dynamic than is a static class hierarchy (but you can also do some arcane things with Python classes at runtime). Note also that a Dispatch instance can accept a variable number of arguments; matching is done first on number of arguments, then on their types. If a Dispatch instance is called with any pattern that is not defined in a rule, a TypeError is raised. The initialization of x_with_y() with a fallback (object,object) pattern is not necessary if you simply want undefined cases to raise an exception.
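As a quick sketch of that behavior (the Thing class and the lambda rule bodies are purely illustrative):

from multimethods import Dispatch

class Thing(object): pass

dispatch = Dispatch()
dispatch.add_rule((Thing,), lambda t: "one Thing")              # one-argument rule
dispatch.add_rule((Thing, Thing), lambda t1, t2: "two Things")  # two-argument rule

dispatch(Thing())              # matched first on argument count, then on type
dispatch(Thing(), Thing())     # matches the two-argument rule
try:
    dispatch("not a Thing")    # no rule covers this pattern
except TypeError:
    print "unmatched patterns raise TypeError"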

Each (pattern,function) tuple that is listed in the initialization call to Dispatch is simply passed on to the .add_rule() method; it is solely a matter of programmer convenience whether to establish rules on initialization or at a later point (you can mix-and-match, as in the prior example). When a function is called from the dispatcher, it is passed the arguments used in the call to the dispatcher; you need to make sure the function you use can accept the number of arguments it is matched against. For example, the following are equivalent:

Listing 4. Explicit and dispatched function call
# Define function, classes, objects
def func(a,b): print "The X is", a, "the Y is", b
class X(object): pass
class Y(object): pass
x, y = X(), Y()

# Explicit call to func with args
func(x,y)

# Dispatched call to func on args
from multimethods import Dispatch
dispatch = Dispatch()
dispatch.add_rule((X,Y), func)
dispatch(x,y)         # resolves to 'func(x,y)'

Obviously, if you already know the types of x and y at design time, the machinery of setting up a dispatcher is just overhead. But then, the same limitation is true of polymorphism -- it is only helpful when you cannot constrain an object to a single type for every execution path.


Improving inheritance

Multiple dispatch does not merely generalize polymorphism, it also provides a more flexible alternative to inheritance in many contexts. An example is illustrative here. Suppose you are programming a drawing or CAD program that deals with a variety of shapes; in particular, you want to be able to combine two shapes in a way that depends on both of the shapes involved. Moreover, the collection of shapes to consider will be extended by derived applications or plug-ins. Extending a collection of shape classes provides a clumsy technique for enhancement; for example:

Listing 5. Inheritance for capability extension
# Base classes
class Circle(Shape):
    def combine_with_circle(self, circle): ...
    def combine_with_square(self, square): ...
class Square(Shape):
    def combine_with_circle(self, circle): ...
    def combine_with_square(self, square): ...
# Enhancing base with triangle shape
class Triangle(Shape):
    def combine_with_circle(self, circle): ...
    def combine_with_square(self, square): ...
    def combine_with_triangle(self, triangle): ...
class NewCircle(Circle):
    def combine_with_triangle(self, triangle): ...
class NewSquare(Square):
    def combine_with_triangle(self, triangle): ...
# Can optionally use original class names in new context
Circle, Square = NewCircle, NewSquare
# Use the classes in application
c, t, s = Circle(...), Triangle(...), Square(...)
newshape1 = c.combine_with_triangle(t)
newshape2 = s.combine_with_circle(c)
# discover 'x' of unknown type, then combine with 't'
if isinstance(x, Triangle): new3 = t.combine_with_triangle(x)
elif isinstance(x, Square): new3 = t.combine_with_square(x)
elif isinstance(x, Circle): new3 = t.combine_with_circle(x)

In particular, each existing shape class has to add capabilities in a descendent, which runs into combinatorial complexities and difficulties in maintenance.

In contrast, a multiple dispatch technique is more straightforward:

Listing 6. Multimethods for capability extension
# Base rules (stipulate combination is order independent)
class Circle(Shape): pass
class Square(Shape): pass
def circle_with_square(circle, square): ...
def circle_with_circle(circle1, circle2): ...
def square_with_square(square1, square2): ...
combine = Dispatch()
combine.add_rule((Circle, Square), circle_with_square)
combine.add_rule((Circle, Circle), circle_with_circle)
combine.add_rule((Square, Square), square_with_square)
combine.add_rule((Square, Circle),
                 lambda s,c: circle_with_square(c,s))
# Enhancing base with triangle shape
class Triangle(Shape): pass
def triangle_with_circle(triangle, circle): ...
def triangle_with_square(triangle, square): ...
combine.add_rule((Triangle,Circle), triangle_with_circle)
combine.add_rule((Triangle,Square), triangle_with_square)
combine.add_rule((Circle,Triangle),
                 lambda c,t: triangle_with_circle(t,c))
combine.add_rule((Square,Triangle),
                 lambda s,t: triangle_with_square(t,s))
# Use the rules in application
c, t, s = Circle(...), Triangle(...), Square(...)
newshape1 = combine(c, t)[0]
newshape2 = combine(s, c)[0]
# discover 'x' of unknown type, then combine with 't'
newshape3 = combine(t, x)[0]

The definition of new rules (and support functions/methods) is largely equivalent. But the huge advantage of the multiple dispatch style is in the seamlessness with which you can combine shapes of unknown types. Rather than revert to explicit (and lengthy) conditional blocks, the rule definitions take care of matters automatically. Even better, all combining is done with a single combine() callable rather than with a menagerie of distinct combination methods.


Dispatch propagation

Without needing to think about dispatch further, the multimethods.Dispatch class will select the "best fit" for a given call to a dispatcher. However, it is sometimes worth noticing that "best" is not "only." That is, a call to dispatch(foo,bar) might be a close fit with a defined rule (Foo,Bar) -- but it might also be a loose (rather than non-) fit with (FooParent,BarParent). Just as you sometimes want to call on superclass methods in an inherited method, you also sometimes want to call on less specific rules within a dispatcher.

The multimethods module gives you both a quick way of calling less specific rules, and a more fine-tuned way. At a rough level, you usually just want to automatically call a less specific rule at either the start or the end of execution of a code block. Likewise, you almost always call a superclass method at either the start or end of a descendent method body. For a generic start/end call to less specific methods, you can just specify that as part of the rule. For example:

Listing 7. Automatic dispatch propagation
from multimethods import Dispatch, AT_END
class General(object): pass
class Between(General): pass
class Specific(Between): pass
dispatch = Dispatch()
dispatch.add_rule((General,), lambda _:"Gen", AT_END)
dispatch.add_rule((Between,), lambda _:"Betw", AT_END)
dispatch.add_rule((Specific,), lambda _:"Specif", AT_END)
dispatch(General())  # Result: ['Gen']
dispatch(Specific()) # Result: ['Specif', 'Betw', 'Gen']

Of course, in some cases (like the (General,) rule), there is nothing less specific available in the defined rules. For uniformity, however, every call to a dispatcher returns a list of the return values from every function to which control was propagated. If neither AT_END nor AT_START is specified in the rules, no propagation occurs (and the returned list has length one). This explains the [0] indexing that probably looked mysterious in the shape example.
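A minimal sketch of that length-one case (the Base class and lambda are purely illustrative):

from multimethods import Dispatch

class Base(object): pass

solo = Dispatch()
solo.add_rule((Base,), lambda _: "Base")  # no AT_START/AT_END flag, so no propagation
print solo(Base())       # ['Base'] -- a one-element list of return values
print solo(Base())[0]    # 'Base'  -- hence the [0] indexing in the shape example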

The fine-tuned way of propagating control is with the .next_method() method of a dispatcher. In order to utilize manual propagation, you should define rules using the .add_dispatchable() method rather than the .add_rule() method. As well, the dispatched functions themselves should accept a dispatch argument. The call to the dispatcher either needs a dispatch argument, or you can use the .with_dispatch() convenience method. For example:

Listing 8. Programming with manual propagation
def do_between(x, dispatch):
    print "do some initial stuff"
    val = dispatch.next_method() # return simple value of up-call
    print "do some followup stuff"
    return "My return value"
foo = Foo()
import multimethods
multi = multimethods.Dispatch()
multi.add_dispatchable((Foo,), do_between)
multi.with_dispatch(foo)
# Or: multi(foo, multi)

Manual propagation to less specific multimethods can get tricky in many of the same ways that calls to superclass methods can get tricky. To make things tractable, calls to .next_method() always return the simple return value of the up-call -- if you want to assemble such return values into a list like the AT_END argument does, you will need to append and manipulate the values as you think appropriate. The most common "use case," however, is where a series of related initializations are performed; in this case, the return values are usually irrelevant.
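For instance, here is a hedged sketch of that initialization pattern; the Account/SavingsAccount classes and the init functions are hypothetical, and it assumes the usual best-fit selection so that the (SavingsAccount,) rule is chosen and its .next_method() up-call reaches the less specific (Account,) rule:

from multimethods import Dispatch

class Account(object): pass
class SavingsAccount(Account): pass

def init_account(acct, dispatch):
    print "general account setup"
    return "account ready"

def init_savings(acct, dispatch):
    # Up-call to the less specific (Account,) rule before doing our own work
    val = dispatch.next_method()
    print "savings-specific setup (general setup said: %s)" % val
    return "savings ready"

setup = Dispatch()
setup.add_dispatchable((Account,), init_account)
setup.add_dispatchable((SavingsAccount,), init_savings)
setup.with_dispatch(SavingsAccount())  # runs init_savings, which propagates to init_account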


Note on thread safety

A quick interjection is worthwhile lest readers run into a problem. Because of the stateful way propagation tracks which (successively less specific) rules have been called, a dispatcher is not thread safe. If you wish to use a dispatcher in multiple threads, you should "clone" it for each thread. Doing so is inexpensive in memory and CPU resources, so there is no significant penalty for cloning dispatchers. For example, suppose a function might be called from multiple threads; you can write:

Listing 9. Cloning for thread safety
def threadable_dispatch(dispatcher, other, arguments):
    dispatcher = dispatcher.clone()
    #...do setup activities...
    dispatcher(some, rule, pattern)
    #...do other stuff...

If no new threads are spawned within threadable_dispatch(), all is well.

The idea of multiple dispatch takes a little while to get your head around, even -- or especially -- if you are well-versed in object-oriented programming. But after you play with it for a little while, you are likely to find that multiple dispatch generalizes and enhances the benefits that OOP has over procedural programming in the first place.

Resources
