The busy Java developer's guide to db4o: Introduction and overview

Learn to love the OODBMS all over again

It has been said that the database wars are over and the relational database won. However, anyone who believes this state of affairs has led to peace and prosperity among programmers hasn't tried using a relational database to back Java™ objects lately. Popular author and lecturer Ted Neward launches an in-depth, multipart series introducing db4o, an object-oriented alternative to today's relational databases.

Ted Neward, Principal, Neward & Associates

Ted Neward is the principal of Neward & Associates, where he consults, mentors, teaches, and presents on Java, .NET, XML Services, and other platforms. He resides near Seattle, Washington.



20 March 2007


By the time I came of age as a programmer, it seemed the database wars were pretty much over. Oracle and other relational database vendors had made a convincing argument for the relational model and its standardized query language, SQL. In fact, I can safely say that I've never used any of the relational database's immediate ancestors, such as IMS or the ubiquitous flat file, for long-term storage. Client/server, it seemed, was here to stay.

And then one day, I discovered C++. And with that, like so many others who found this particular language at this particular time, my world view was forever changed. I made the shift from a programming model based on functions and data to one based on objects. Suddenly developers weren't talking anymore about building elegant data structures and "information hiding"; instead we were excited about polymorphism, encapsulation, and inheritance -- a whole new set of buzzwords.

About this series

Information storage and retrieval has been nearly synonymous with RDBMS for about a decade now, but recently that has begun to change. Java developers in particular are frustrated with the so-called object-relational impedance mismatch, and impatient with the solutions that attempt to resolve it. This, along with the emergence of a viable alternative, has led to a renaissance of interest in object persistence and retrieval. This series is a working introduction to db4o, an open source database that leverages today's object-oriented languages, systems, and mindset. See the db4o home page to download db4o now; you'll need it to follow the examples.

Likewise, it suddenly seemed that the relational database was over-the-hill in favor of a new kind of database, the object database. When married with an object-oriented language like C++ (or its upstart cousin, the Java language), the OODBMS would herald a utopia in programming.

Except it didn't quite work out that way. The OODBMS peaked around the late '90s and then slid back into relative obscurity. What had been exciting and glamorous became obscure and proprietary. Round 2 of the database wars was over, and once again the relational database had won. (This happened even though most RDBMS vendors embraced objects in one fashion or another.)

The only problem with this scenario is that some of the reasons developers were excited about the OODBMS never died, as evidenced by the emergence of db4o.

Of objects and relations

The object-relational impedance mismatch is a subject best suited to an academic lecture, but it boils down to the fact that an object system takes a different view than a relational system does of how entities interact with each other. On the surface, an object system and a relational system seem well-suited, but a deeper investigation reveals some fundamental differences between them.

For starters, objects have an implicit sense of identity (denoted by the hidden/implicit this pointer or reference, which is essentially a location in memory), whereas relations have an explicit sense of identity (identified by a primary key made up of the relation's attributes). Second, a relational database practices encapsulation by hiding the database-wide implementation of queries and other operations against the data, whereas an object implements new behavior on each object (modulo whatever implementation inheritance is specified in the class definition, of course). And, perhaps most interestingly, the relational model is a closed model, where the results of any operation will yield a tuple set suitable as input for another operation. This enables nested SELECTs, among other things. An object model offers no such capability; in particular, it cannot return "partial objects" to callers. Objects are all-or-nothing, so it follows that the OODBMS has nothing remotely like the RDBMS's ability to return all, or just some, of the columns from a table or a set of tables.
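The identity distinction is easy to see in plain Java. The following is a minimal sketch, not taken from db4o; the Name class is hypothetical and exists only for this illustration. It contrasts reference identity (==, the object world's notion) with value equality over attributes (equals, closer to how a relational key works):

```java
import java.util.Objects;

class IdentityDemo {
    public static void main(String[] args) {
        Name a = new Name("Brian", "Goetz");
        Name b = new Name("Brian", "Goetz");

        // Reference identity: two separate instances are never identical
        System.out.println(a == b);      // false
        // Value equality: same attributes, so "equal" in the relational sense
        System.out.println(a.equals(b)); // true
    }
}

// Hypothetical value class used only for this illustration.
final class Name {
    final String first;
    final String last;

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    // Equality defined over attributes, much like a key made of columns
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Name)) return false;
        Name other = (Name) o;
        return Objects.equals(first, other.first)
            && Objects.equals(last, other.last);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, last);
    }
}
```

An object system defaults to the first comparison; a relational system only ever has the second.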

In short, there's a wide gulf between how objects (as implemented in languages like Java, C++, and C#) and relations (as implemented by modern RDBMSes such as SQL Server, Oracle, and DB2) operate. Bridging the gap naturally falls on the programmer.


Where mapping fails

In times past, developers have tried to bridge the object-relational gap by way of manual mapping, such as writing SQL statements through JDBC and harvesting the results into fields. The reasonable question that arises from this is whether there is an easier way to proceed. Many developers resolve the issue with an automated object-relational-mapping utility or library such as Hibernate.
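To give a feel for what hand-written mapping involves, here is a minimal sketch. So that it runs without a database, a plain Map stands in for one row of a JDBC ResultSet; the Contact class and the column names are hypothetical and chosen only for illustration. The point is that every field has to be copied by hand, once in each direction:

```java
import java.util.HashMap;
import java.util.Map;

class ManualMappingSketch {
    // Hypothetical column names; a real schema would define its own.
    static final String COL_FIRST = "FIRST_NAME";
    static final String COL_LAST = "LAST_NAME";
    static final String COL_AGE = "AGE";

    // Minimal stand-in for a domain class.
    static final class Contact {
        String firstName;
        String lastName;
        int age;
    }

    // Row -> object: each column is harvested into a field by hand.
    static Contact fromRow(Map<String, Object> row) {
        Contact c = new Contact();
        c.firstName = (String) row.get(COL_FIRST);
        c.lastName = (String) row.get(COL_LAST);
        c.age = (Integer) row.get(COL_AGE);
        return c;
    }

    // Object -> row: the same mapping again, in the other direction.
    static Map<String, Object> toRow(Contact c) {
        Map<String, Object> row = new HashMap<>();
        row.put(COL_FIRST, c.firstName);
        row.put(COL_LAST, c.lastName);
        row.put(COL_AGE, c.age);
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put(COL_FIRST, "Brian");
        row.put(COL_LAST, "Goetz");
        row.put(COL_AGE, 39);

        Contact c = fromRow(row);
        System.out.println(c.firstName + " " + c.lastName + ", " + c.age);
    }
}
```

Every field added to the class means another line in both methods, plus a matching change to the schema.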

Even using Hibernate (or JPA, or JDO, or Castor JDO, or TopLink, or any of the other ORM tools available), mapping problems don't go away, however -- they just move into configuration files. What's more, there's no avoiding the feeling that you're pushing round pegs into square holes. For example, if you're trying to create a nicely stratified inheritance model, mapping it to a table or set of tables involves weighing one ugly trade-off against another. Weighing query performance against violation of normal form ends up pitting DBA against developer at some point.

The problem here is that it's hard to get really excited about building a rich domain model (a la Martin Fowler or Eric Evans's respective books) if you're then going to have to either compromise it in order to match an existing database schema, compromise the database's ability to carry out its operations to support the object model, or both.

But what if you didn't have to make any compromise at all?


Enter db4o: OODBMS redux

The db4o library is a recent player in the OODBMS world, revitalizing the notion of "pure object storage" for a new generation of object developers. (They say retro is hot these days, after all.)

Note: If you haven't already done so, now is the time to download db4o. You need it to follow the discussion (or at least compile the code) for the remainder of the series.

To get an idea of what it's like to use db4o, consider this basic class representing an individual human being:

Listing 1. The Person class
package com.tedneward.model;

public class Person
{
    public Person()
    { }
    public Person(String firstName, String lastName, int age)
    {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }
    
    public String getFirstName() { return firstName; }
    public void setFirstName(String value) { firstName = value; }
    
    public String getLastName() { return lastName; }
    public void setLastName(String value) { lastName = value; }
    
    public int getAge() { return age; }
    public void setAge(int value) { age = value; }

    public String toString()
    {
        return 
            "[Person: " +
            "firstName = " + firstName + " " +
            "lastName = " + lastName + " " +
            "age = " + age + 
            "]";
    }
    
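    public boolean equals(Object rhs)
    {
        if (rhs == this)
            return true;
        
        if (!(rhs instanceof Person))
            return false;
        
        // Null-safe comparison: the no-arg constructor leaves names unset
        Person other = (Person)rhs;
        return (java.util.Objects.equals(this.firstName, other.firstName) &&
                java.util.Objects.equals(this.lastName, other.lastName) &&
                this.age == other.age);
    }
    
    // Keep hashCode consistent with equals
    public int hashCode()
    {
        return java.util.Objects.hash(firstName, lastName, age);
    }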
    
    private String firstName;
    private String lastName;
    private int age;
}

As classes go, the Person class is pretty unremarkable; simplistic, even. Over time, however, it's fair to assume that this class will assume some more interesting and object-like properties and capabilities, such as possibly having a spouse, possibly having children, and so on. (I'll flesh out these various bits in future columns; for now I'm sticking with the overview.)

In a Hibernate-based system, getting an instance of this Person class into the database would require a few more steps:

  1. You would need to create the relational schema, describing the types to the database.
  2. You would need to create the mapping files that map the columns and tables of the database to the class and fields of your domain model.
  3. In code, you would need to open a connection to the database through Hibernate (a Session, in Hibernate terminology), and interact with the Hibernate API in order to store the object and fetch it back again.

Doing this in db4o is almost frighteningly simple, as shown in Listing 2:

Listing 2. Running INSERT in db4o
import com.db4o.*;

import com.tedneward.model.*;

public class Hellodb4o
{
    public static void main(String[] args)
        throws Exception
    {
        ObjectContainer db = null;
        try
        {
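            // Open (or create) the database file in the current directory
            db = Db4o.openFile("persons.data");

            Person brian = new Person("Brian", "Goetz", 39);
            
            // Store the object, then commit the transaction
            db.set(brian);
            db.commit();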
        }
        finally
        {
            if (db != null)
                db.close();
        }
    }
}

And that's it. No schema files to generate, no mapping configuration to create, just run the client and, when it finishes, check the local directory for the new "database" stored in persons.data.

Retrieving the stored Person is similar in some ways to how certain object-relational-mapping libraries operate, in that the simplest form of object retrieval is a query-by-example approach. Simply provide db4o with a prototype object of the same type, whose fields are set to the values you want to query by, and it will return a set of objects that match those criteria, as shown in Listing 3:

Listing 3. Running SELECT in db4o (version 1)
import com.db4o.*;

import com.tedneward.model.*;

public class Hellodb4o
{
    public static void main(String[] args)
        throws Exception
    {
        ObjectContainer db = null;
        try
        {
            db = Db4o.openFile("persons.data");

            Person brian = new Person("Brian", "Goetz", 39);
            Person jason = new Person("Jason", "Hunter", 35);
            Person brians= new Person("Brian", "Sletten", 38);
            Person david = new Person("David", "Geary", 55);
            Person glenn = new Person("Glenn", "Vanderberg", 40);
            Person neal = new Person("Neal", "Ford", 39);
            
            db.set(brian);
            db.set(jason);
            db.set(brians);
            db.set(david);
            db.set(glenn);
            db.set(neal);

            db.commit();
            
            // Find all the Brians. The result set needs its own name:
            // "brians" is already taken by the Person declared above.
            ObjectSet results = db.get(new Person("Brian", null, 0));
            while (results.hasNext())
                System.out.println(results.next());
        }
        finally
        {
            if (db != null)
                db.close();
        }
    }
}

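Run this against a fresh persons.data file and you will see two objects retrieved: Brian Goetz and Brian Sletten. The prototype's lastName (null) and age (0) are left at their default values, which db4o's query-by-example treats as "don't care," so only firstName constrains the match.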
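To make the matching rule concrete, here is a toy, in-memory version of query-by-example. It is not db4o's implementation, only a sketch under the assumption that prototype fields left at their Java defaults (null for references, 0 for ints) are ignored. The nested Person class is a simplified stand-in for the one in Listing 1, and the QbeSketch, matches, queryByExample, and sampleData names are invented for this example:

```java
import java.util.ArrayList;
import java.util.List;

class QbeSketch {
    // Simplified stand-in for the article's Person class.
    static final class Person {
        final String firstName;
        final String lastName;
        final int age;

        Person(String firstName, String lastName, int age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.age = age;
        }

        @Override
        public String toString() {
            return firstName + " " + lastName + " (" + age + ")";
        }
    }

    // A prototype field constrains the match only when it holds a
    // non-default value; null and 0 act as wildcards.
    static boolean matches(Person prototype, Person candidate) {
        if (prototype.firstName != null
                && !prototype.firstName.equals(candidate.firstName)) return false;
        if (prototype.lastName != null
                && !prototype.lastName.equals(candidate.lastName)) return false;
        if (prototype.age != 0 && prototype.age != candidate.age) return false;
        return true;
    }

    // Returns every stored object that matches the prototype, in order.
    static List<Person> queryByExample(List<Person> stored, Person prototype) {
        List<Person> hits = new ArrayList<>();
        for (Person p : stored) {
            if (matches(prototype, p)) hits.add(p);
        }
        return hits;
    }

    // The same six people stored in Listing 3.
    static List<Person> sampleData() {
        List<Person> stored = new ArrayList<>();
        stored.add(new Person("Brian", "Goetz", 39));
        stored.add(new Person("Jason", "Hunter", 35));
        stored.add(new Person("Brian", "Sletten", 38));
        stored.add(new Person("David", "Geary", 55));
        stored.add(new Person("Glenn", "Vanderberg", 40));
        stored.add(new Person("Neal", "Ford", 39));
        return stored;
    }

    public static void main(String[] args) {
        // Only firstName is set, so both Brians match
        for (Person p : queryByExample(sampleData(), new Person("Brian", null, 0))) {
            System.out.println(p);
        }
    }
}
```

One consequence of this rule is that a prototype cannot ask for "age equal to 0," because 0 is indistinguishable from "not specified."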


But, but, but ...!

Before I'm accused of blatant favoritism, let me address some of the common objections to db4o.

But db4o is hardly equivalent to what you get with Oracle, SQL Server, or DB2!
This is absolutely true. db4o is closer in scale to MySQL or HSQL, which makes it large enough for a good many projects. More importantly, overhead is something db4o's developers keep a very close eye on, making it ideal for small and embedded environments. (Remember, too, that my examples here are a simple demo; db4o is capable of much more than what you've seen here.)
But I can't query on db4o with JDBC!
This is true as well, though the db4o team had thought about creating a JDBC driver to allow for SQL syntax against the object database, a sort of "relational-object-mapping," if you will. (The team chose not to because it didn't seem really necessary and the performance was pretty bad, according to rumor.) The point, however, is that you're using objects, POJOs, and nothing else in your implementation. Why use SQL if you're not storing relations?
But how will my other programs get to the data?
Well, that depends. If by "other programs" you're referring to other Java code, then simply use the definition of the Person class in those other programs and pass them into the ObjectContainer just as I've done here. In an OODBMS, the class definition itself serves as the schema, so no other tools are necessary to fetch the Person objects. If, however, by "other programs" you mean other languages, then the story is trickier.

For languages that db4o doesn't support, such as C++ or Python, the data is essentially inaccessible except through whatever access you build in Java code. db4o is available for C# and other .NET languages, and its data format is compatible across the two platforms, making stored Java objects available to similarly defined .NET classes. And if by "other programs" you're referring to reporting tools that use SQL and a standard call-level interface (such as ODBC or JDBC) to interact with the database, then db4o (or any OODBMS, for that matter) is not likely to be a good fit. Lest the wary reader think that reporting is unavailable to an OODBMS, take heart: many products and projects are emerging with an eye toward exactly that, and db4o supports what its developers call "Replication," which allows a db4o instance to replicate data from its own storage format into an RDBMS.
But it's a file!
In this particular case, true; having said that, though, db4o is flexible in where and how it stores the data, including a lightweight client/server option. If you're expecting the kind of redundancy that a fully-featured RDBMS gives you, however, db4o isn't it (though other OODBMSes do offer those kinds of features).
But when I run the example again, I get duplicates! (In fact, I get duplicates every time I run the example.)
Here we're back to discussing the first interesting "quirk" that differentiates an object database and a relational one: identity. As I previously noted, identity in an object system is given by the implicit "this" reference a Java object uses to identify itself in memory. In an object database, this is known as an OID, or object identifier, and the OID is what serves as the primary key in an OODBMS.

When you create a new object and "set" it into the database, the new object has no OID associated with it and therefore receives its own unique OID value. The result is a duplicate, just as you would get from an RDBMS that manufactured a new primary key on every INSERT (a la a sequence counter or auto-increment field). In other words, an OODBMS behaves just as an RDBMS does with respect to primary keys, but the primary key itself is not what a traditional RDBMS (or a programmer used to the RDBMS) thinks it is.
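The effect can be imitated with a toy store that keys on object identity rather than on field values. This is only a sketch of the idea, not how db4o manages OIDs internally; the ToyStore class and its set and size methods are invented for this example:

```java
import java.util.IdentityHashMap;
import java.util.Map;

class OidSketch {
    // Toy store that hands out one OID per object *instance*,
    // deliberately ignoring equals() and hashCode().
    static final class ToyStore {
        private final Map<Object, Long> oids = new IdentityHashMap<>();
        private long nextOid = 1;

        // Returns the existing OID for a known instance,
        // or assigns a fresh one to an instance not seen before.
        long set(Object o) {
            Long existing = oids.get(o);
            if (existing != null) return existing;
            long oid = nextOid++;
            oids.put(o, oid);
            return oid;
        }

        int size() {
            return oids.size();
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();

        // new String(...) forces two distinct instances with equal contents
        String first = new String("Brian Goetz");
        String second = new String("Brian Goetz");

        System.out.println(store.set(first));  // 1
        System.out.println(store.set(second)); // 2: equal value, new identity
        System.out.println(store.set(first));  // 1: same instance, same OID
        System.out.println(store.size());      // 2
    }
}
```

Each run of the Listing 3 program is in the position of `second` here: it builds brand-new objects that the database has never seen, so each one is stored under a fresh OID.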

In short, db4o looks to solve a particular set of problems, not to become the one-stop, one-size-fits-all solution to every persistence problem. In fact, this is what makes db4o refreshingly different from the OODBMS options the first time around: it's not out to convince production IT staff that it's a good idea to completely abandon their investment in a relational database.


In conclusion

Nothing will supplant the traditional, centralized relational database as the "go-to" tool for data storage and manipulation anytime soon. Too many tools, too many years as the established default, and too many programmers who are stuck in the rut of the "we always do a database" cycle guarantee it. In fact, db4o is not technically designed or positioned to challenge the RDBMS in that role.

But an interesting phenomenon arises when you start contemplating the OODBMS in the multitiered, loosely coupled world that the "service-oriented" community is urging us to build: if the goal is a true loose coupling between components (services, tiers, whatever the nom du jour), then it follows that the only degree of coupling is between the caller of a service and the exposed API (or XML types, however you look at it) of the service. No data types, no exposed object model, no shared database -- in essence, your persistence options are an implementation detail. It follows that the persistence options available to you in a wider variety of scenarios have grown by an order of magnitude.

Resources

Get products and technologies

  • Download db4o: An open source, native Java and .NET object database.
