Connecting databases to Python with SQLObject

Say goodbye to writing (most) SQL

An object-relational mapping tool helps improve your productivity by providing classes and objects to manipulate database tables. The best object-relational mapping tool for Python is SQLObject -- an open-source project that does just about everything you might need to program a database. This article introduces SQLObject and its capabilities. After reading this article, you'll be able to connect Python to databases without writing any SQL code.

Share:

Leonard Richardson (leonardr@segfault.org), Software Engineer, CollabNet

Leonard Richardson is the author of many Python applications and libraries, including NewsBruiser and Beautiful Soup. He is a co-author of the new tome Beginning Python from Wrox.



02 August 2005

You'll often see an object-relational mapping when the object-oriented programming paradigm meets the relational paradigm of most databases. An object-relational mapping is a bridge between the two worlds. It lets you define classes that correspond to the tables of a database. You can then use methods on those classes and their instances to interact with the database, instead of writing SQL. Using an object-relational mapping doesn't mean that you don't need to know how relational databases work, but it does save you from having to write SQL -- a common source of programming errors.

You can find more than a dozen open source Python packages for working with a SQL database, not counting the special-purpose modules that connect Python to specific databases. SQLObject is the best such module. It's a full object-relational mapping package that's simple to use. SQLObject can do just about everything you might need to program a database.

This article shows you how SQLObject interacts with a database, how to use SQLObject to write database access and data-validation code, and how to use it with legacy or pre-existing databases. I assume that you have some knowledge of Python and relational databases.

Installing and setting up SQLObject

SQLObject has a setup.py file and installs in the same way as any other Python package. If you're using Python V2.2, you also need to install the mxDateTime Python package (SQLObject uses Python V2.3's built-in datetime module, if it's available).

To actually use SQLObject, you need to set up a database package and the Python interface to that type of database. SQLObject connects to several types of databases, including the three big open source products: MySQL, PostgreSQL, and the serverless SQLite.

Finally, you need to create a database for your application. For SQLite, this means creating a file in which to store the database. For other databases, it means connecting to the database server, issuing a CREATE DATABASE command, and granting some database user access to the new database, so that SQLObject can use that user account to connect.

Listing 1 shows how to create a new database in MySQL.

Listing 1. Code to create a new database in MySQL
mysql> use mysql;
Database changed
mysql> create database sqlobject_demo;
Query OK, 1 row affected (0.00 sec)
mysql> grant all privileges on sqlobject_demo to 'dbuser'@'localhost' 
identified by 'dbpassword';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Connecting to the database

The first Python code you need to write is the database connection code. This is the only place you need to write different code based on which database you're using.

For instance, if you want your application to use a SQLite database, you need to pass the path to the database file into the SQLite connection builder located in the sqlobject.sqlite package. If the database file doesn't exist, SQLObject will tell SQLite to create it with code such as:

import sqlobject
from sqlobject.sqlite import builder
conn = builder()('sqlobject_demo.db')

If you're using MySQL, or another database with a server, you pass in database connection information to the connection builder. Listing 2 provides an example for the MySQL database created in the previous section.

Listing 2. Code to pass in MySQL database connection information
import sqlobject
from sqlobject.mysql import builder
conn = builder()(user='dbuser', passwd='dbpassword',
                 host='localhost', db='sqlobject_demo')

Whatever database you're connected to, the connection code should go in a file called something like Connection.py, stored in some commonly accessible location. That way, you can import all the classes you define and use the conn object you've built. The conn variable will take care of all database-related details.

Note, however, that some features of SQLObject are not available in SQLite or MySQL. You can't totally separate your choice of database from the code you write after you've connected. (See Using SQLObject with pre-existing tables for more information.)


Defining the schema

SQLObject makes it easy to act on a database table. As a first example, consider a database schema consisting of a single table for a phone book application as shown in Table 1.

Table 1. Description of the phone_number table
FieldTypeNotes
idIntPrimary key
numberString"(###) ###-####" string format; should be unique
ownerStringWhose number is this?
last_callDateWhen did the user last call this number?

The SQL for this table would look something like this, depending on your flavor of SQL:

CREATE TABLE phone_number (
  id INT PRIMARY KEY AUTO_INCREMENT,
  number VARCHAR(14) UNIQUE,
  owner VARCHAR(255),  
  last_call DATETIME,
  notes TEXT
)

With SQLObject, you don't need to write this SQL code. You define the database table by defining a Python class. This code goes into a file called PhoneNumber.py, as shown in Listing 3.

Listing 3. PhoneNumber.py
import sqlobject
from Connection import conn

class PhoneNumber(sqlobject.SQLObject):
    _connection = conn
    number = sqlobject.StringCol(length=14, unique=True)
    owner = sqlobject.StringCol(length=255)
    lastCall = sqlobject.DateTimeCol(default=None)
PhoneNumber.createTable(ifNotExists=True)

Here's where you use the conn variable defined earlier. Each of your table classes needs to store a reference to a database connection in its _connection member. This information is used behind the scenes for all the database access to this class' table. From this point on, you don't have to worry about SQL or any particular database because your code can be expressed in terms of the abstract relational schema.

A class that defines a table also has a set of members that define the table's fields. SQLObject provides StringCol, BoolCol, and so on -- one class for each of the database field types.

The first time the createTable() method runs, SQLObject will create a table called phone_number. Afterward, it will just use that table because you call that method with ifNotExists set to True.

Finally, note that there's no need to create a field object in PhoneNumber for the id field. Because SQLObject always needs this field object, it always creates one.


Handling that old CRUD

The famous acronym CRUD represents the four things you might do to a database row: Create, Read, Update, and Delete. After you've defined a class that corresponds to a database table, SQLObject represents operations on the rows of that table as operations on the class and its instances.

Create

To create a database row, you create an instance of the corresponding class with the code:

>>> from PhoneNumber import PhoneNumber
>>> myPhone = PhoneNumber(number='(415) 555-1212', 
owner='Leonard Richardson')

Now the phone_number table has a row with my name in it. The PhoneNumber constructor takes values for the table columns as keyword arguments. It creates a row of phone_number using the data you provide. If, for some reason, that data can't go into the database, the constructor throws an exception. Here's what happens when the phone number isn't valid:

Listing 4. Invalid phone number
>>> badPhone = PhoneNumber()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
...
TypeError: PhoneNumber() did not get expected keyword argument number

And here's what you'd see if the phone number was already in the database:

Listing 5. Phone number already in database
>>> duplicatePhone = PhoneNumber(number="(415) 555-1212")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
...
TypeError: PhoneNumber() did not get expected keyword argument owner

Read (and query)

An instance of an SQLObject class has all its fields available as members. This is in contrast to some other database-mapping tools that make a database row act like a dictionary. Thus, a PhoneNumber object has a member for each of the fields in its database row:

Listing 6. Fields available as members
>>> myPhone.id
1
>>> myPhone.owner
'Leonard Richardson'
>>> myPhone.number
'(415) 555-1212'
>>> myPhone.lastCall == None
True

But how do you retrieve PhoneNumber objects that are already in the database? You need to get them by running a query against the database. This brings up SQLObject's query construction kit, one of the package's most interesting features. It lets you represent SQL queries as chains of Python objects. It's similar to what you can do with Criteria objects in Torque, if you're familiar with that object-relational package for the Java™ programming language.

Each SQLObject class you define has a select() method. This method takes an object that defines a query and returns a list of items that match that query. For instance, this method call returns a list containing the first phone number in the database:

Listing 7. The select() method
>>> PhoneNumber.select(PhoneNumber.q.id==1)
<sqlobject.main.SelectResults object at 0xb7b76cac>
>>> PhoneNumber.select(PhoneNumber.q.id==1)[0]
<PhoneNumber 1 number='(415) 555-1212' lastCall=None 
owner='Leonard Richardson'>

PhoneNumber.q.id indicates that you want to run a query on the id field of the phone_number table. SQLObject overloads the comparison operators (==, !=, <, >=, and so on) to make queries out of what would otherwise be Boolean expressions. The expression PhoneNumber.q.id==1 becomes a query that matches every row whose id field has a value of 1. Here are some more examples:

Listing 8. More examples
>>> PhoneNumber.select(PhoneNumber.q.id < 100)[0]
<PhoneNumber 1 number='(415) 555-1212' lastCall=None 
owner='Leonard Richardson'>
>>> PhoneNumber.select(PhoneNumber.q.owner=='Leonard Richardson').count()
1
>>> PhoneNumber.select(PhoneNumber.q.number.startswith('(415)')).count()
1

You can use SQLObject's AND and OR functions to combine query clauses:

Listing 9. AND and OR functions to combine query clauses
>>> from sqlobject import AND, OR
>>> PhoneNumber.select(AND(PhoneNumber.q.number.startswith('(415)'),
>>>                        PhoneNumber.q.lastCall==None)).count()
1

The following query gets everyone you haven't called in a year and everyone you've never called at all:

Listing 10. Everyone you haven't called in a year and everyone you've never called
>>> import datetime
>>> oneYearAgo = datetime.datetime.now() - datetime.timedelta(days=365)
>>> PhoneNumber.select(OR(PhoneNumber.q.lastCall==None,
...                       PhoneNumber.q.lastCall < oneYearAgo)).count()
1

Update

If you change a member of a PhoneNumber object, the change is automatically mirrored to the database:

Listing 11. Change automatically mirrored to database
>>> print myPhone.owner
Leonard Richardson
>>> print myPhone.lastCall
None
>>> myPhone.owner = "Someone else"
>>> myPhone.lastCall = datetime.datetime.now()
>>> #Fetch the object fresh from the database.
>>> newPhone = PhoneNumber.select(PhoneNumber.q.id==1)[0]
>>> print newPhone.owner
Someone else
>>> print newPhone.lastCall
2005-05-22 21:20:24.630120

There's just one caveat: SQLObject won't let you change the primary key of an object. It's usually best to let SQLObject manage the id field of your table, even if you happen to have another field you'd make the primary key if you weren't using SQLObject.

Delete

You can delete a particular row object by passing its ID into its class' delete() method:

Listing 12. Delete a row object
>>> query = PhoneNumber.q.id==1
>>> print "Before:", PhoneNumber.select(query).count()
Before: 1
>>> PhoneNumber.delete(myPhone.id)
>>> print "After:", PhoneNumber.select(query).count()
After: 0

Validating and converting data

So far, you've been passing phone numbers in 14-character U.S. format. The database schema is designed to accept numbers in that format, which implies the underlying application -- whatever it might be -- expects them in that format. As it is, though, the code doesn't do anything to make sure that clumsy users or programmer bugs don't cause incorrectly formatted data to be put into the number field.

SQLObject solves this problem by letting you define hook methods for validating and massaging incoming data. You can define one method for each field in a table. A field's hook method is named _set_[field name](), and it gets called whenever a value is about to be set for that field, whether as part of a create operation or an update operation. The hook method should (optionally) massage an incoming value into an acceptable format ,and then set it. Otherwise, it should throw an exception. To actually set a value, the method needs to call the SQLObject method _SO_set_(field name).

Listing 4 shows a _set_number() method for PhoneNumber. If a phone number is totally without formatting, as in 4155551212, the method formats the number as (415) 555-1212. Otherwise, if the number is not in the correct format, the method throws a ValueError. A correctly formatted phone number -- or one that got massaged into a correct format -- is passed right on to SQLObject's _SO_set_number() method.

Listing 13. The _set_number () method for PhoneNumber
   import re
   def _set_number(self, value):
      if not re.match('\([0-9]{3}\) [0-9]{3}-[0-9]{4}', value):
         #It's not in the format we expect.
         if re.match('[0-9]{10}', value):
            #It's totally unformatted; add the formatting.
            value = "(%s) %s-%s" % (value[:3], value[3:6], value[6:])
         else:
            raise ValueError, 'Not a phone number: %s' % value
      self._SO_set_number(value)

Defining relationships among tables

So far, everything you've seen operates on a single table: phone_number. But real database applications usually have multiple interrelated tables. SQLObject lets you define relationships among tables as foreign keys. To demonstrate, let's apply a little database normalization to the previous example, splitting the owner field of PhoneNumber into a separate person table. The code shown in Listing 14 goes into a file called PhoneNumberII.py.

Listing 14. Code for PhoneNumberll.py
import sqlobject
from Connection import conn


class PhoneNumber(sqlobject.SQLObject):
    _connection = conn
    number = sqlobject.StringCol(length=14, unique=True)
    owner = sqlobject.ForeignKey('Person')
    lastCall = sqlobject.DateTimeCol(default=None)


class Person(sqlobject.SQLObject):
    _idName='fooID'
    _connection = conn
    name = sqlobject.StringCol(length=255)
    #The SQLObject-defined name for the "owner" field of PhoneNumber
    #is "owner_id" since it's a reference to another table's primary
    #key.
    numbers = sqlobject.MultipleJoin('PhoneNumber', joinColumn='owner_id')


Person.createTable(ifNotExists=True)
PhoneNumber.createTable(ifNotExists=True)

This PhoneNumber class has the same members as the old one, but its owner member is a reference to the primary key of the person table, instead of a reference to a string column in the phone_number table. This makes it possible to represent a single individual with two phone numbers:

Listing 15. Represent a single individual with two phone numbers
>>> from PhoneNumberII import PhoneNumber, Person
>>> me = Person(name='Leonard Richardson')
>>> work = PhoneNumber(number="(650) 555-1212", owner=me)
>>> cell = PhoneNumber(number="(415) 555-1212", owner=me)

The numbers member of Person, an SQLObject MultipleJoin, makes it easy to do a query based on a join of person to phone_number:

Listing 16. Query based on join of person to phone_number
>>> for phone in me.phoneNumbers:
...     print phone.number
...
(650) 555-1212
(415) 555-1212

Similarly, SQLObject lets you do many-to-many joins, using the MultipleJoin class.


Using SQLObject with pre-existing tables

One common use of SQLObject is to give a Python interface to a database created by another application. SQLObject has several features that help do this.

Database introspection

If you're working with tables that already exist in a database, you don't need to define the columns in Python. SQLObject can extract all the information it needs through introspection on the database. For example, the code in Listing 17 could go into PhoneNumberIII.py.

Listing 17. Code for PhoneNumberlll.py
import sqlobject
from Connection import conn
class PhoneNumber(sqlobject.SQLObject):
    _connection = conn
    _fromDatabase = True

This class will take on the properties of the existing phone_number database table. You can interact with it just as if you had manually defined the class with all its columns, as in the previous examples. With SQLObject, you need to write the table definition only once -- whether you do it in SQL or in Python is up to you.

However, this feature brings your choice of database back into the picture. For instance, the feature doesn't work at all with SQLite. It works at a basic level with MySQL, but it won't pick up foreign key relationships. If you're using MySQL and want to define foreign keys for a table, you'll need to define those fields in code after loading the schema from the database.

Naming conventions

The code in the previous section assumes that the pre-existing table conforms to SQLObject's naming conventions (for instance, a table's primary key field is called id, and words in column names are separated by underscores). The naming conventions for a table are defined in a Style class.

SQLObject provides some Style classes corresponding to common database naming conventions. For instance, if your column names look likeThis, instead of like_this, you can use the MixedCaseStyle:

Listing 19. Using the MixedCaseStyle
import sqlobject
from sqlobject.styles import MixedCaseStyle
from Connection import conn
class PhoneNumber(sqlobject.SQLObject):
    _connection = conn
    _fromDatabase = True
    _style = MixedCaseStyle

If none of the prepackaged Style classes fit your needs, you can subclass the base Style class and define your own naming conventions. In the worst case, where a table's field names have been assigned with no rhyme or reason at all, you can name each field individually.


A note on SQLObject limitations

SQLObject wants you to think in object-oriented terms instead of relational terms. This is good for your comprehension and your programming productivity, but it's not so good for performance. After all, the database is still relational. How do you mark every phone number in the database as having been called? With SQL, you would use a single UPDATE command. With SQLObject, you need to iterate over the entire result set and modify the last_call member of each object, which is much less efficient.

SQLObject sacrifices processor time for developer time. This is usually a good deal, but even in simple applications, you may need to drop down a level to the Python database interface and write raw SQL for some critical-path operations.


Conclusion

This article has covered SQLObject in breadth, but not much depth. It's a versatile tool with many small, convenient features. (I haven't even mentioned result slices, just to pick one.) Its limitations are easy to understand, and you can work around them by writing the SQL you need. SQLObject is to relational database programming what Python is to application programming: a convenient way to get the work done in a lot less time.


Download

DescriptionNameSize
Sample classes definedos-sqlobject-samplecode.zip3.36 KB

Resources

Learn

  • See a comparison of database-oriented Python libraries.
  • See the SQLObject documentation and examples.
  • MySQL is the most common open source database. Its Python interface module is called MySQLdb.
  • PostgreSQL has more high-end database features (for instance, triggers) than MySQL. It has a number of Python interfaces, including PyGreSQL and pyPgSQL.
  • SQLite is a bit of a database upstart. It offers the most commonly used features of SQL in a small footprint that doesn't require a server to run.
  • Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Open source on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Open source
ArticleID=90949
ArticleTitle=Connecting databases to Python with SQLObject
publish-date=08022005