It seems everybody has their favorite persistence mechanism in Perl. It goes with the territory -- everybody has slightly different requirements and therefore everybody needs something slightly different. Tangram, Alzabo, SPOPS, and more are all valid, capable solutions to the problem of persistence in Perl, yet all share one common factor -- they store some sort of map, or schema, somewhere.
These maps simply join the correct place in the database to the field of the object that you require. From that you can recreate an object from the database, or place an object from memory into the database. A friend of mine once said "a schema is like a template, only backwards."
Having requirements set in stone doesn't hinder you -- the database schema sits with the class, and you don't need to worry. However, if your requirements are going to change (and whose aren't?), then you'll run into problems with having to change both the map and the class. Additionally, I tend not to think in terms of an RDBMS, but in terms of objects.
Pixie offers a slightly different approach in the sense that it doesn't require a schema or map from the class to the database -- it just stores objects.
For me, keeping track of things I'm supposed to do is a todo item in
itself. For that reason, and to show off Pixie's capabilities, I've been
playing around writing a todo application. The first thing I wrote was a Todo class:
Listing 1. A Todo class
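package Todo;
use strict;
use warnings;

# Constructor: a Todo is a blessed, initially empty hash.
sub new {
    my $class = shift;
    my $self  = {};
    return bless $self, $class;
}

# Get or set the text. The setter returns the object so calls can be chained.
sub text {
    my $self = shift;
    my $text = shift;
    if (defined( $text )) {
        $self->{ text } = $text;
        return $self;
    } else {
        return $self->{ text };
    }
}

# With a defined argument, stamp the item as completed at the current time.
# With no argument, return the completion time (undef if still open).
sub completed {
    my $self = shift;
    my $comp = shift;
    if (defined( $comp )) {
        $self->{ completed } = scalar(localtime(time()));
        return $self;
    } else {
        return $self->{ completed };
    }
}

1;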
This is a very basic Perl class that allows me to do three things:
instantiate a new object (the new() method), get/set the value of the
text attribute in this object, and mark the todo item as completed on a
certain date/time.
Using it is fairly simple:
Listing 2. Using Todo
use Todo;
my $todo = Todo->new();
$todo->text( "Finish writing this article" );
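Because the setters hand back the object itself, calls can be chained. The following is a minimal sketch rather than part of the original application: it repeats a condensed copy of the Todo class inline so that it runs as a single file, and the printed labels are my own.

```perl
use strict;
use warnings;

# Condensed copy of the Todo class from Listing 1, inlined so this
# snippet runs on its own without a separate Todo.pm.
package Todo;

sub new {
    my $class = shift;
    return bless {}, $class;
}

sub text {
    my ($self, $text) = @_;
    if (defined $text) {
        $self->{text} = $text;
        return $self;               # setter returns the object
    }
    return $self->{text};
}

sub completed {
    my ($self, $comp) = @_;
    if (defined $comp) {
        $self->{completed} = scalar localtime(time());
        return $self;
    }
    return $self->{completed};
}

package main;

# Chain the constructor and the setter in one expression.
my $todo = Todo->new->text("Finish writing this article");

print "Todo: ", $todo->text, "\n";
print "Done? ", (defined $todo->completed ? "yes" : "not yet"), "\n";

# Any defined argument marks the item as completed now.
$todo->completed(1);
print "Completed at: ", $todo->completed, "\n";
```

Note that completed() ignores the value it is given and always records the current time, so passing 1 is only a signal.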
What my Todo class doesn't allow me to do yet is save that todo item so
I can see it again.
This isn't a problem, however, because Pixie will allow me to do this by changing the above code to look like this:
Listing 3. Adding the Pixie
use Todo;
use Pixie;
my $pixie = Pixie->new()->connect('bdb:todos.bdb');
my $todo = Todo->new();
$todo->text( "Finish writing this article" );
my $cookie = $pixie->insert( $todo );
A lot is happening here, so I'll try to describe it line by line. First, I tell Perl that I want to use the Pixie module, and then I connect Pixie to a database. Pixie connects to a lot of different databases, but for now I'm going to be using BerkeleyDB 4.0, as the setup work is minimal. The final change is to insert the Todo object into the database, which I do in the last line by sending the insert message to the Pixie instance with the object I want to insert as the parameter.
When I call insert, I get a string back, which
we know as a "cookie." This string uniquely identifies the object that I
asked Pixie to insert, and I can use it to get the object back at any
time with the following code:
my $todo = $pixie->get( $cookie );
Of course we don't just want to store one todo item; we need lots, and those cookies would prove a bit of a nightmare to remember. To solve my problem, the first thing I'm going to write is a collection class:
Listing 4. A collection of Todos
package TodoCollection;
use strict;

sub new {
    my $class = shift;
    my $self = [];
    bless $self, $class;
}

sub add_todo {
    my $self = shift;
    push @$self, grep { defined $_ and $_->isa('Todo') } @_;
}

sub delete_todo {
    my $self = shift;
    my $index = shift;
    splice(@$self, $index, 1);
}

sub todo_at_index {
    my $self = shift;
    return $self->[$_[0]];
}

sub todolist {
    my $self = shift;
    return $self;
}
This is a fairly simple class that lets me maintain a list of todos in
another object called a TodoCollection. I can instantiate my collection:
my $todolist = TodoCollection->new();
Add a todo object:
$todolist->add_todo( Todo->new()->text('write developerworks article') );
Get the todo item back:
$todolist->todo_at_index( 0 );
And delete a todo:
$todolist->delete_todo( 0 );
I can also get a list of all the todo items:
foreach my $todo (@{$todolist->todolist}) {
    print $todo->text, "\n";
}
I know I can store my list of todos and receive a cookie that will let
me get my TodoCollection object back, but the cookie is still a bit of a
hassle to remember, and I'd like something that's a little more useful to
me, rather than to the computer. Pixie lets me do this as part of its
basic services to the user:
$pixie->bind_name( 'Todo list' => $todolist );
In this one line of code, I've given Pixie a name for my object (Todo list) and asked Pixie to store it. Pixie will remember not only the todo list object, but also all the objects contained within it.
To get my object back, I simply ask Pixie:
my $todolist = $pixie->get_object_named( 'Todo list' );
Pixie remembers that the name Todo list is associated with my
TodoCollection object, and hands it back to me. Pixie doesn't give me
back the TodoCollection in exactly the same state, however. Often when we
are getting an object that holds other objects, we're not concerned with
all of them. For instance, I may be concerned with only the number of
items I have on my Todo list, and not the actual todo items themselves.
Loading all of the stored objects would be needless overhead.
Pixie steps in again, and creates proxy objects in place of the
todo objects that are normally in the collection.
The proxy objects work without programmer interaction. When you call a method on one of Pixie's proxy objects, it magically connects to the Pixie datastore and fetches the real object for you.
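To make the idea concrete, here is a rough sketch of how a lazy proxy can be built in plain Perl with AUTOLOAD. It is not Pixie's actual implementation: the LazyProxy class, the %STORE hash standing in for the datastore, and the $FETCHES counter are all hypothetical names invented for this illustration.

```perl
use strict;
use warnings;

package Todo;

sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub text { my $self = shift; return $self->{text} }

package LazyProxy;    # hypothetical name; not Pixie's real proxy class

our $AUTOLOAD;

# Stand-in datastore: cookie => stored object. A real store would
# go to the database here instead of a hash.
our %STORE;
our $FETCHES = 0;

sub new {
    my ($class, $cookie) = @_;
    return bless { cookie => $cookie }, $class;
}

sub DESTROY { }       # defined so destruction never triggers AUTOLOAD

sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;

    # First method call: fetch the real object and turn into it.
    my $real = $STORE{ $self->{cookie} }
        or die "No object stored under cookie '$self->{cookie}'";
    $FETCHES++;
    %$self = %$real;
    bless $self, ref $real;

    # Re-dispatch the original call, now against the real class.
    return $self->$method(@_);
}

package main;

$LazyProxy::STORE{'cookie-1'} =
    Todo->new( text => 'Finish writing this article' );

my $item = LazyProxy->new('cookie-1');
print ref($item), " (fetches so far: $LazyProxy::FETCHES)\n";
print $item->text, "\n";
print ref($item), " (fetches so far: $LazyProxy::FETCHES)\n";
```

In this sketch the object reports its class as LazyProxy until the first method call, and as Todo afterwards; nothing is fetched for items whose methods are never called.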
Sometimes you need to store things that Pixie wouldn't normally be able to store. For instance, when Perl interfaces with a C library, a common trick is to store a pointer to the C structure in a scalar reference, which Perl can use as its object, and which the XS glue code can use to get the real C structure back. This is a great mechanism for writing glue code, but it causes problems if you want to store those sorts of objects in Pixie: the pointer means nothing in another process, so we need a means of recreating the link when Pixie gives the object back to Perl. Because the most common reason for using a blessed scalar reference is to perform glue magic, Pixie by default considers these objects unstorable. However, this isn't always the right call, and Pixie provides all the mechanisms needed to change its default behavior with its Complicity hooks.
To demonstrate this, I'll use an example from the Pixie
documentation: Set::Object. The Set::Object class implements a nice,
fast set. Pixie, however, can't store it directly, because the class is
based on some C code. First, we let Pixie know that Set::Object can be
stored.
Pixie calls the method px_is_storable on the object it wants to store
just before it attempts to do so. By default the answer for a Set::Object
is no, so we override px_is_storable in the Set::Object namespace:
sub Set::Object::px_is_storable { 1 }
Pixie will now know that your class is capable of being stored.
However, it won't actually be able to store it yet, as it won't be able to
get at the data. For this purpose we can override the method px_freeze,
which performs a "setup" on your data prior to it being stored, and
the method px_thaw, which will be called when your data is
extracted.
Set::Object provides a members method, which returns a Perl array of
the members of the set. We'll take advantage of this to be able to store
a meaningful representation of our class:
Listing 5. Storing the class
sub Set::Object::px_freeze {
    my $self = shift;
    return bless [ $self->members ], 'Memento::Set::Object';
}
The px_freeze method creates a new object of the class Memento::Set::Object
that holds just enough information to recreate our original object. The
recreation itself happens in the px_thaw method, which we define in the
Memento::Set::Object class:
Listing 6. Creating a new object of the class
sub Memento::Set::Object::px_thaw {
    my $self = shift;
    return Set::Object->new( @$self );
}
When Pixie fetches an object of the Memento::Set::Object class from the
database, it will call the px_thaw method on it. When this happens, our
new px_thaw method will create a Set::Object instance and place the
contents of the Memento object in it. This will have the effect of
recreating the original Set::Object instance, and it will appear to the
user as if none of the magic ever happened.
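The freeze and thaw round trip can be tried without Pixie or Set::Object installed. The sketch below is an illustration under stated assumptions: TinySet is a hypothetical stand-in for an XS-backed class (a blessed scalar reference holding only a handle into process-local data), and the two hooks are called by hand, where Pixie would call them itself around storing and fetching.

```perl
use strict;
use warnings;

package TinySet;    # hypothetical stand-in for a class like Set::Object

# Process-local data, playing the role of the C structure. The handle
# stored in the object is meaningless outside this process.
my %REGISTRY;
my $next_id = 0;

sub new {
    my ($class, @members) = @_;
    my $id = ++$next_id;
    $REGISTRY{$id} = { map { $_ => 1 } @members };
    return bless \$id, $class;      # blessed scalar ref: just a handle
}

sub members {
    my $self = shift;
    return sort keys %{ $REGISTRY{$$self} };
}

# Complicity hooks, following the pattern in Listings 5 and 6.
sub px_is_storable { 1 }

sub px_freeze {
    my $self = shift;
    return bless [ $self->members ], 'Memento::TinySet';
}

package Memento::TinySet;

sub px_thaw {
    my $self = shift;
    return TinySet->new(@$self);
}

package main;

my $set = TinySet->new(qw(apple pear plum));

# Called by hand here; Pixie would do this before storing.
my $memento = $set->px_freeze;

# Called by hand here; Pixie would do this after fetching.
my $copy = $memento->px_thaw;

print "Memento class: ", ref($memento), "\n";
print "Restored members: ", join(', ', $copy->members), "\n";
```

The memento is an ordinary blessed array, which is the kind of plain Perl data that can be stored without special help; the restored object is a new TinySet with its own handle but the same members.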
There are other methods that can help Pixie with storing or fetching
objects, but I'd recommend that you read the Pixie::Complicity
documentation to find out more.
Pixie does have its drawbacks. Although Pixie can use an RDBMS as a storage
mechanism, it doesn't decompose your object into relational data of any
sort. If you need to match on the value of a field inside an object, Pixie
probably isn't the tool for you. That said, fairly complex indexes can be
built up by simply giving objects the right names. For example, if I wanted
to use Pixie to store my user database, it would probably be a good idea to
create a collection similar to the TodoCollection I created above.
Additionally, I would want to store each of the users with names according
to how I want to search for them. When I want a user to log in, for
example, the first thing I need to be able to search on would be
the username. Therefore, it makes sense to put the user object into
a named collection and also store it inside Pixie with a name such as
user:username:<the user's username>. If I wanted to look up users by
their e-mail addresses as well, I'd also give the object a name such as
user:email:<the user's email address>. A scheme like this provides a
means of getting all the users through the collection, and individual
users by virtue of the naming convention. Through careful creation of
your collection class, you could create the ability to fetch groupings of
users, but this is a much more technical and specific case. The real
issue with all of these techniques remains: you can't add the indices
later on. Really, if you want to do ad-hoc reporting on your data, Pixie
is not the right tool.
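The naming scheme can be sketched as follows. To keep the example runnable without a database, it uses a hypothetical in-memory NameStore class that mimics only the two Pixie calls shown earlier (bind_name and get_object_named); the user data, the plain-hash user record, and the user:phone: lookup are likewise made up for illustration.

```perl
use strict;
use warnings;

package NameStore;    # hypothetical in-memory stand-in for a Pixie store

sub new {
    my $class = shift;
    return bless { names => {} }, $class;
}

sub bind_name {
    my ($self, $name, $object) = @_;
    $self->{names}{$name} = $object;
    return $self;
}

sub get_object_named {
    my ($self, $name) = @_;
    return $self->{names}{$name};
}

package main;

my $store = NameStore->new;

# A plain hash stands in for a real user object.
my $user = { username => 'alice', email => 'alice@example.com' };

# Bind the same object under one name per lookup we expect to need.
$store->bind_name( "user:username:$user->{username}" => $user );
$store->bind_name( "user:email:$user->{email}"       => $user );

my $by_name  = $store->get_object_named('user:username:alice');
my $by_email = $store->get_object_named('user:email:alice@example.com');

print "Both names reach the same user\n" if $by_name == $by_email;

# A lookup nobody planned for has no name to find.
my $by_phone = $store->get_object_named('user:phone:555-0100');
print "No phone index was ever bound\n" unless defined $by_phone;
```

The last lookup shows the limitation in miniature: every kind of search has to be anticipated with its own name at the time the object is stored.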
Pixie is the right tool for lots of other things, however. Writing schemas or maps tends to tie up programmers with doing things that aren't programming. Too often, a database schema is an artifact of wanting to write a piece of software, and, if eliminated, can speed up development. Years of RDBMS usage have made programmers view them as ubiquitous, when often they need not be. Pixie steps in to help out with Perl, but many languages have OODBMS implementations available for them that should be examined alongside more traditional mechanisms.
Pixie runs on the 5.8.0 version of Perl (and up), and is available from all good CPAN mirrors under the GNU GPL and Artistic licenses.
- Pixie and Perl are available from CPAN.
- Tangram is an object-relational mapper that can be used to persist objects in a relational database.
- Alzabo is a data-modeling tool as well as an object-relational mapper.
- SPOPS is a robust and powerful module that allows you to serialize objects.
- Learn more about Pixie from this article in Perl Advent.
- Like Perl itself, Pixie is licensed under both the GNU General Public License (GPL) and the Perl Artistic License (PAL).
- The developerWorks article "Python persistence management" gives a general overview of persistence mechanisms available to Python programmers.
- The developerWorks article "Mapping objects to relational databases" takes a look at the "impedance mismatch" problem between the object world and the relational world.
- The developerWorks article "Choosing a database management system" gives a programmer's view of the DBMS landscape.
- For a tutorial-based approach to installing and using a Perl interface to DB2, read the tutorial "Using Perl to access DB2 for Linux".
- "The Camel and the Snake, or 'Cheat the Prophet': Open Source Development with Perl, Python and DB2" gives some background on open source and covers using Perl and Python with DB2.
- Find more resources for Linux developers in the developerWorks Linux zone.
James is Chief Scientist at Fotango, which means most of his day is spent applying interesting new solutions to classic problems and applying classic solutions to interesting new problems. When not at work, he spends his time failing miserably at all kinds of DIY. You can contact James at jduncan@fotango.com.





