Discover Python, Part 2: Explore the Python type hierarchy

Understanding objects and containers

The Python programming language is a simple yet powerful language. This article explores the object nature of the language, initially for the built-in simple types. The Python tuple class is also introduced and used to demonstrate the concept of a container type.


Robert Brunner, NCSA Research Scientist, Assistant Professor of Astronomy, University of Illinois, Urbana-Champaign

Robert J. Brunner is a Research Scientist at the National Center for Supercomputing Applications and an Assistant Professor of Astronomy at the University of Illinois, Urbana-Champaign. He has published several books and a number of articles and tutorials on a range of topics. You can reach him at rb@ncsa.uiuc.edu.



31 May 2005

In the Python language, everything is an object a program can access, including simple types that hold integers, as well as the actual code you write and its representation in the Python interpreter. For someone familiar with another programming language, this behavior might seem to be a recipe for confusion. But in practice, that's not the case. Python has a well-defined type, or object, hierarchy. This hierarchy can be conceptually broken down into four categories: simple types, container types, code types, and internal types. These four categories, and the simple types themselves, were introduced in the first article in this series, "Discover Python, Part 1: Python's built-in numerical types." This article reviews the simple built-in data types you can use in Python, this time emphasizing the object nature of these types. Then we introduce the concept of a container type and focus on the Python tuple class as the first example of that type.

The simple types

The simple data types that are built in to the Python programming language include:

  • bool
  • int
  • float
  • complex

Supporting simple data types is not unique to Python, as most modern programming languages have full type complements. For example, the Java™ language has an even richer set of primitive data types:

  • byte
  • short
  • int
  • long
  • float
  • double
  • char
  • boolean

In Python, however, the simple data types are not primitives but full-fledged objects, with their own methods and classes. In addition, these simple built-in types are immutable, which means you can't change an object's value after the object has been created. If a new value is needed, you must create a new object. The immutable nature of Python simple data types is different from how most other popular languages (like the Java language) treat simple primitive types. But this difference is easy to understand when you have a better comprehension of the object nature of these simple data types.
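
As a minimal sketch of what "full-fledged object" and "immutable" mean in practice, the following snippet calls a method on an integer and shows that "changing" a number only rebinds a name. The variable names are illustrative, and the snippet assumes a current Python 3 interpreter rather than the Python 2.4 session used in this article's listings.

```python
# Simple values are objects: they belong to a class and carry methods.
n = 5
is_object = isinstance(n, object)   # int derives from object
total = n.__add__(3)                # the method behind the expression n + 3

# Immutability: "changing" a number rebinds the name to a different object.
# The original object, still referenced by alias, is left untouched.
alias = n
n += 1

print(is_object, total, n, alias)
```

After the augmented assignment, n refers to a new integer object, while alias still refers to the original one.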

So, how can an integer have methods? It's just a number, right? Well, the answer is no, at least in Python. Check it out for yourself: Just ask the Python interpreter for information about the int class by using the built-in help function (see Listing 1).

Listing 1. Python interpreter: Help for integers
rb% python
Python 2.4 (#1, Mar 29 2005, 12:05:39) 
[GCC 3.3 20030304ppp(Apple Computer, Inc. build 1495)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> help(int)

Help on class int in module __builtin__:

class int(object)
 |  int(x[, base]) -> integer
 |  
 |  Convert a string or number to an integer, if possible.  A floating point
 |  argument will be truncated towards zero (this does not include a string
 |  representation of a floating point number!)  When converting a string, use
 |  the optional base.  It is an error to supply a base when converting a
 |  non-string. If the argument is outside the integer range a long object
 |  will be returned instead.
 |  
 |  Methods defined here:
 |  
 |  __abs__(...)
 |      x.__abs__() <==> abs(x)
 |  
 |  __add__(...)
 |      x.__add__(y) <==> x+y
...

So, what exactly does this show? For one thing, that it's easy to get help from the Python interpreter, but more on that later. The first line tells you that you're seeing the help page for the int class, which is a built-in data type. If you aren't familiar with the concepts of object-oriented programming, a class is simply a blueprint for building and interacting with a particular thing. A good analogy is the blueprint for a house, which shows you not only how to build the house but also how to interact with the house after it's built. For example, a blueprint shows the locations of different rooms, how to move between them, and the doors into and out of the house.

Following this first line is a detailed description of the actual int class. At this point, you're probably not familiar with how to create a class in Python, so the syntax displayed is probably akin to a foreign language. That's OK: I'll cover it completely in another article. For now, you merely need to know that the int class inherits from the object class, which is a base class for a lot of things in Python.

The next few lines introduce the int class constructor. A constructor is just a special method that creates an instance (or object) of a particular class. A good analogy for a constructor method is a building contractor, who takes the blueprints for your house and builds it. In Python, you invoke a constructor by calling the class itself, so the call has the same name as the class. Unlike the Java language, Python does not let a class define several overloaded constructors; instead, a single constructor can accept different combinations of arguments within the parentheses following the class name. A good example is the int class, which you can invoke in several ways, depending on which arguments you place within the parentheses (see Listing 2).

Listing 2. Python Interpreter: The int class constructor
>>> int()
0
>>> int(100)          # Create an integer with the value of 100
100
>>> int("100", 10)    # Create an integer with the value of 100 in base 10
100
>>> int("100", 8)     # Create an integer with the value of 100 in base 8
64

These four constructor calls create four integer objects. The first one creates an integer object whose value is 0, which is the default value used when no value is supplied to the int class constructor. The next call creates an integer with the value of 100, as specified. The third call takes the string of characters "100" and creates an integer in base 10 (the familiar decimal system). And the final call takes the string of characters "100" -- but this time creates an integer using base 8, which is more commonly known as octal. When the value is printed, however, it is converted back to base 10, which is why the number is displayed as 64.
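
The following sketch gathers a few more argument combinations that the int constructor accepts, including the truncation and string restrictions mentioned in the help text of Listing 1. The variable names are illustrative only, and the snippet assumes a current Python 3 interpreter.

```python
# One constructor, several argument shapes (values chosen for illustration).
from_nothing = int()          # no argument: the default value
from_float = int(3.9)         # floats are truncated toward zero
from_negative = int(-3.9)     # truncation toward zero, not rounding down
from_decimal = int("100")     # the base defaults to 10
from_octal = int("100", 8)    # the same digits read in base 8
from_hex = int("ff", 16)      # and a base 16 string

# A string holding a floating-point literal is rejected, as the help text warns.
try:
    int("3.9")
    rejected = False
except ValueError:
    rejected = True

print(from_nothing, from_float, from_negative, from_decimal, from_octal, from_hex, rejected)
```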

You might be wondering what happens if you omit the parentheses from the constructor call. In this case, you assign the actual class name to the variable, effectively creating an alias to the original class (see Listing 3).

Listing 3. Python Interpreter: The int type
>>> it = int         # Create an alias to the integer class
>>> it(100)
100
>>> type(it)         # The alias refers to a class, and classes have type 'type'
<type 'type'>
>>> type(it(100))    # Calling the alias still makes ordinary integers
<type 'int'>

Now that was cool! You just created a second name for the built-in int class. No new data type is involved: the name it and the name int refer to the same class, which is why calling it produces ordinary integers. Beware the dark side, though: You shouldn't abuse this newfound power. Good programmers strive for code clarity in addition to code performance. This sort of coding trick does have its uses, but they aren't very common.
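
To check that the assignment only binds a second name to the same class, you can compare the two names directly. This is a small sketch, assuming a current Python 3 interpreter; the variable names other than it are illustrative.

```python
it = int                  # bind a second name to the existing class
same_class = it is int    # both names refer to one and the same object
value = it("42")          # calling the alias calls the int constructor
kind = type(value)        # the result is a plain int
name = it.__name__        # the class still reports its own name

print(same_class, value, kind, name)
```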

Using the Python interpreter simplifies the learning curve for new Python programmers. If you want to know more about the help facilities within Python, just enter help() at a command prompt in the Python interpreter to access the interactive help utility (see Listing 4).

Listing 4. Python Interpreter: The help interpreter
>>> help()

Welcome to Python 2.4!  This is the online help utility.

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://www.python.org/doc/tut/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics".  Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".

help>

You've probably already figured this out, but entering int at the help> prompt displays the class description shown earlier for the int class.
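
The help utility is interactive, but the same information can be reached from ordinary code. The sketch below is one possible approach rather than the only one: it uses the built-in dir function and the class's __doc__ attribute, and it assumes a current Python 3 interpreter. The variable names are illustrative.

```python
# List the attribute names defined on the int class.
int_attributes = dir(int)
has_add = "__add__" in int_attributes   # the method behind x + y
has_abs = "__abs__" in int_attributes   # the method behind abs(x)

# The text shown at the top of the help page comes from the class docstring.
doc_text = int.__doc__
first_line = doc_text.splitlines()[0]

print(has_add, has_abs)
print(first_line)
```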


The container types

So far, I've talked a lot about the simple types in the Python language. But most programs are not simple and involve complex data often composed of simple types. So the question now becomes, "How do you handle complex data in Python?"

If you are familiar with an object-oriented language, like Java or C#, you probably thought that question could be easily answered: Just create a new class to handle the complex data. This answer is also true in Python because Python supports the creation of new types via classes. But for many cases, a simpler approach in Python is also possible. When your program needs to handle several objects at once, you can utilize the Python container classes:

  • tuple
  • string
  • unicode
  • list
  • set
  • frozenset
  • dictionary

These container types fall into two broad groups. The first six types are collections of objects, while the last type, dictionary, is a mapping. The difference is simple. The tuple, string, unicode, and list types are sequences: they keep their objects in order, so you can access an object by its position. The set and frozenset types also hold a collection of objects, but they are unordered, so they don't support access by position. In contrast, a mapping holds values for which the order is not important; a value is extracted from the container by providing a key that locates the value of interest.
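
The difference between access by position, access by key, and simple membership can be sketched as follows. The data and variable names are made up for illustration, and the snippet assumes a current Python 3 interpreter.

```python
point = (10, 20, 30)             # sequence: items have positions
ages = {"ann": 31, "bob": 27}    # mapping: items are located by key
tags = frozenset(["x", "y"])     # set: membership only, no positions

second = point[1]                # position 1 is the second item
bobs_age = ages["bob"]           # the key selects the value
has_x = "x" in tags              # sets answer "is it in there?"

# Asking a set for "item number 0" is an error.
try:
    tags[0]
    indexable = True
except TypeError:
    indexable = False

print(second, bobs_age, has_x, indexable)
```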

Another difference between the container types results from the nature of the data they hold. The following four container types are immutable:

  • tuple
  • string
  • unicode
  • frozenset

This means that when you create one of these container types, the data being stored can't be changed. If you need to change the data for some reason, you need to create a new container to hold the new data.

The last three container types (list, set, and dictionary) are all mutable containers, so any data they hold can be changed as needed (although the keys used in a dictionary must be hashable, which in practice usually means immutable, just like the keys to your house). While mutable containers are very flexible, their dynamic nature can carry a performance cost. For example, the tuple type, while less flexible because it's immutable, typically uses less memory and can be created faster than a list that holds the same items.
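
A short sketch of these two points: a list can be changed in place, and a dictionary accepts a tuple as a key but refuses a list. The names are illustrative, and the snippet assumes a current Python 3 interpreter.

```python
items = [1, 2, 3]
items[0] = 99            # lists can be changed in place
items.append(4)          # and can grow

lookup = {}
lookup[(1, 2)] = "tuple key"     # an immutable tuple is a valid key

# A mutable list cannot be used as a dictionary key.
try:
    lookup[[1, 2]] = "list key"
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(items, lookup, list_key_ok)
```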

These container classes provide a great deal of power and are often central to most Python programs. The rest of this article discusses the tuple type, which serves to introduce many of the fundamental concepts related to creating and using container types in Python. I'll discuss the remaining types in future articles.

The tuple

The tuple type is like a bag into which you throw everything you might need before you head out the door. You might throw your keys, driver's license, a pad of paper, and a pen into a bag, making your bag a collection of diverse items. The Python tuple type behaves in a similar manner as a bag in that it can hold different types of objects. You can create a tuple by simply assigning a sequence of objects, separated by commas, to a variable (see Listing 5).

Listing 5. Python Interpreter: Creating a tuple
>>> t = (0,1,2,3,4,5,6,7,8,9)
>>> type(t)
<type 'tuple'>
>>> t
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> tt = 0,1,2,3,4,5,6,7,8,9
>>> type(tt)
<type 'tuple'>
>>> tt
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> tc=tuple((0,1,2,3,4,5,6,7,8,9))
>>> tc
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> et = ()     # An empty tuple
>>> et
()
>>> st = (1,)   # A single item tuple
>>> st
(1,)

This sample code shows how to create a tuple in several ways. The first method creates a tuple that contains the sequence of integers from 0 to 9. The second method does the same, but this time the enclosing parentheses are omitted. When creating a tuple, the enclosing parentheses are often optional, but sometimes required, depending on the context. As a result, you should get in the habit of including the enclosing parentheses to minimize confusion. The third tuple, tc, used the actual class constructor to create a tuple. The important point here is that the constructor form takes only one argument, so you have to enclose the object sequence within its own set of parentheses. The last two examples demonstrate how to create an empty tuple (et) by placing nothing within the enclosing parentheses, and a tuple with only one item (st) by placing a single comma after the only item in the sequence.
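
The cases where the comma and the parentheses matter are easy to get wrong, so here is a small sketch of the most common ones. The count_args function is a hypothetical helper written only for this illustration, and the snippet assumes a current Python 3 interpreter.

```python
not_a_tuple = (1)             # just the integer 1 inside parentheses
one_item = (1,)               # the trailing comma is what makes a tuple
from_string = tuple("abc")    # the constructor accepts any iterable object
from_list = tuple([1, 2])

def count_args(*args):
    """Hypothetical helper: report how many arguments were passed."""
    return len(args)

two_args = count_args(1, 2)      # here the comma separates two arguments
one_arg = count_args((1, 2))     # parentheses are required to pass one tuple

print(type(not_a_tuple), one_item, from_string, from_list, two_args, one_arg)
```

The last two calls show why the parentheses are sometimes required: inside a function call, a bare comma already has another meaning.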

One of the main reasons you use a bag to carry your items around is to simplify your life. But you still need to be able to access the items in your bag quickly when you need them. Most of the container types in Python, including the tuple, allow you to access items easily from the collection using the square bracket operators. But Python is even more flexible than other languages: You can select one item or multiple sequential items using what is commonly known as slicing (see Listing 6).

Listing 6. Python Interpreter: Accessing items from a tuple
>>> t = (0,1,2,3,4,5,6,7,8,9)
>>> t[2]
2
>>> type(t[2])
<type 'int'>
>>> t[0], t[1], t[9]
(0, 1, 9)
>>> t[2:7]            # Slice out five elements from the tuple
(2, 3, 4, 5, 6)
>>> type(t[2:7])
<type 'tuple'>
>>> t[2:7:2]          # Slice out three elements from the tuple
(2, 4, 6)

After creating a simple tuple, the previous example shows how to select a single item -- in this case, the integer 2. At this point, notice that Python uses zero ordering, where the items in a collection are numbered starting with zero. If you are familiar with programming in the Java language, C#, or another language descended from C, this behavior should be familiar. If not, the concept is simple. The index used to access items just states how far past the first item in the collection, or sequence, you need to go to get what you want. So, to get the third item (in this case, the integer 2), you need to go two items past the first one. When you access the third item, Python knows it is an integer object. You can also easily extract multiple items from the collection. In this case, you create a new tuple whose values are the first, second, and 10th values from the initial tuple.

The rest of the examples show how to select multiple items from the sequence at once using Python's slicing functionality. The term slicing refers to the way in which you are slicing items from the sequence. Slicing works by stating the starting index, the ending index, and an optional step size, all separated by colons. The item at the starting index is included in the slice, but the item at the ending index is not. So, t[2:7] slices the third through the seventh items from the tuple (indices 2 through 6), while t[2:7:2] slices every other item, starting with the third item and continuing through the seventh item from the tuple.
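
The slicing rules can be sketched with a few more variations, including omitted indices and negative values, which the listing above does not show. The variable names are illustrative, and the snippet assumes a current Python 3 interpreter.

```python
t = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

middle = t[2:7]          # indices 2 through 6; index 7 is excluded
every_other = t[2:7:2]   # the same range with a step of 2
head = t[:3]             # an omitted start means "from the beginning"
tail = t[7:]             # an omitted end means "through the last item"
last = t[-1]             # negative indices count back from the end
backwards = t[::-1]      # a negative step walks the tuple in reverse

print(middle, every_other, head, tail, last, backwards)
```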

The tuple objects that I've created so far were homogeneous, in that they contained only integer objects. Fortunately, a tuple is more flexible than these examples have shown, as a tuple is actually a heterogeneous container (see Listing 7).

Listing 7. Python Interpreter: A heterogeneous tuple
>>> t = (0,1,"two",3.0, "four", (5, 6))
>>> t
(0, 1, 'two', 3.0, 'four', (5, 6))
>>> t[1:4]
(1, 'two', 3.0)
>>> type(t[2]) 
<type 'str'>
>>> type(t[3])
<type 'float'>
>>> type(t[5])
<type 'tuple'>
>>> t[5] = (0,1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support item assignment

See how easy it is to create a tuple that can hold all kinds of items, including another tuple? And you access all the items in the same way using the square bracket operator, which supports slicing different types of sequential items. A tuple, however, is immutable. So when I tried to change the sixth element (t[5]), I was told that item assignment is not allowed. In a simple analogy, after you put something into your bag, the only way to change what you take is to get out a new bag and put all your items into it.

If you need to create a new tuple that contains a subset of the items in an existing tuple, the simplest technique is to use the relevant slices and add the subsets together as necessary (see Listing 8).

Listing 8. Python Interpreter: Working with a tuple
>>> tn = t[1:3] + t[3:6]  # Add two tuples
>>> tn
(1, 'two', 3.0, 'four', (5, 6))
>>> tn = t[1:3] + t[3:6] + (7,8,9,"ten")
>>> tn
(1, 'two', 3.0, 'four', (5, 6), 7, 8, 9, 'ten')
>>> t2 = tn[:]            # Duplicate an entire tuple, a full slice
>>> t2
(1, 'two', 3.0, 'four', (5, 6), 7, 8, 9, 'ten')
>>> len(tn)               # Find out how many items are in the tuple
9  
>>> tn[4][0]              # Access a nested tuple
5

You can also combine slices of an existing tuple with a new tuple. Using the slice syntax, without specifying a starting or ending index, you can make a duplicate copy of an existing tuple. The last two examples are also interesting. The built-in len function tells you the number of items in the tuple. Accessing an item from within a nested tuple is also straightforward: You select the nested tuple, then you access the item of interest from it.
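
The same slice-and-add technique is how you "replace" an item in a tuple: because item assignment fails, you build a new tuple around the new value. The sketch below reuses the heterogeneous tuple from Listing 7; the name updated is illustrative, and the snippet assumes a current Python 3 interpreter.

```python
t = (0, 1, "two", 3.0, "four", (5, 6))

# Item assignment is refused because a tuple is immutable.
try:
    t[5] = (0, 1)
    assigned = True
except TypeError:
    assigned = False

# Build a new tuple instead: everything before index 5, followed by
# the replacement wrapped in a one-item tuple (note the trailing comma).
updated = t[:5] + ((0, 1),)

print(assigned)
print(t)
print(updated)
```

The original tuple is unchanged; the name updated refers to a separate object that holds the new value.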

You also can create a tuple from a set of existing variables in a process called packing. The opposite is also true, where the values in a tuple are assigned to variables. This latter process is known as unpacking and is an extremely powerful technique used in a number of situations, including when you want to return multiple values from a function. The only catch when you unpack a tuple is that you must have a variable for every item in the tuple (see Listing 9).

Listing 9. Python Interpreter: Packing and unpacking a tuple
>>> i = 1
>>> s = "two"
>>> f = 3.0
>>> t = (i, s, f)         # Pack the variables into a tuple
>>> t
(1, 'two', 3.0)
>>> ii, ss, ff = t        # Unpack the tuple into the named variables
>>> ii
1
>>> ii, ff = t            # Not enough variables to unpack three element tuple
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: too many values to unpack
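
As a sketch of the function use case mentioned above, the following snippet returns two values from a function by packing them into a tuple and then unpacks them at the call site. The min_max function is a hypothetical helper written for this illustration, and the snippet assumes a current Python 3 interpreter.

```python
def min_max(values):
    """Hypothetical helper: return the smallest and largest value."""
    return min(values), max(values)   # the comma packs both into one tuple

result = min_max((3, 1, 4, 1, 5))     # a single tuple comes back
low, high = result                    # unpack it into two variables

a, b = 1, 2
a, b = b, a                           # pack and unpack in one step to swap

# The number of variables must match the number of items.
try:
    x, y = (1, "two", 3.0)
    unpacked = True
except ValueError:
    unpacked = False

print(result, low, high, a, b, unpacked)
```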

Simplifying concepts

While it might seem overly confusing, the object nature of Python actually simplifies some of the more difficult concepts a newcomer to the Python language often faces. After you understand how to work with an object, the fact that everything is an object means you already have a leg up on understanding new concepts, like Python's container types. Making difficult tasks easy is one of the common benefits of using Python; another example is the built-in help utility, which is available within the Python interpreter by simply entering help() at the Python prompt. Because life isn't described by simple concepts, Python provides a rich set of container, or collection, objects. I introduced the simplest of these objects -- the tuple -- in this article. To use a tuple properly, you need to be familiar with how it works. But because many of the other container types share similar functionality, including slicing and packing or unpacking, understanding how a tuple works means you are already on your way to completely understanding the other container types available in Python.

Resources

  • Download Python from the Python home page.
  • When you have a working Python interpreter, the Python tutorial is a great place to start learning the language.
  • Read the developerWorks article "Guide to Python introspection: How to spy on your Python objects" by Patrick O'Brien, which includes a discussion of the Python help utility.
  • Python Library Reference has a nice discussion of the sequence types available within Python.
  • The Python tutorial has a section on the tuple container type.
