The busy Java developer's guide to db4o: Transactions, distribution, and security

Java enterprise development with db4o

Java™ developers can get a lot of mileage out of storing objects directly in an object-oriented database like db4o. Without support for transactions or the ability to use data in a distributed environment (and keep it secure), however, you probably won't have much use for the OODBMS. In this final installment in The busy Java developer's guide to db4o, Ted Neward shows you how db4o handles three concerns central to Java enterprise development: transactions, distributed data management, and Web application security.


Ted Neward, Principal, Neward & Associates

Ted Neward is the principal of Neward & Associates, where he consults, mentors, teaches, and presents on Java, .NET, XML Services, and other platforms. He resides near Seattle, Washington.



11 December 2007


About this series

Information storage and retrieval has been nearly synonymous with RDBMS for about a decade now, but recently that has begun to change. Java developers in particular are frustrated with the so-called object-relational impedance mismatch, and impatient with the solutions that attempt to resolve it. This, along with the emergence of a viable alternative, has led to a renaissance of interest in object persistence and retrieval. The busy Java developer's guide to db4o introduces db4o, an open source database that leverages today's object-oriented languages, systems, and mindset. See the db4o home page to download db4o now; you'll need it to follow the examples.

In this series, I have introduced the essentials of object-oriented data management with db4o. One thing I haven't done, however, is to address how the OODBMS might be used in a Web application and how that might differ from its use in a Swing or SWT application. You could say that I've ignored a whole range of issues that the practicing Java (or .NET) developer cannot afford to ignore.

In part, I've wanted to focus on what is most compelling about the OODBMS, which is object-oriented data storage, manipulation, and retrieval. Also, OODBMS vendors tend to implement core features like transaction management and security similarly to how they're handled by various RDBMSs, with a similarly wide range of options.

In this final installment of The busy Java developer's guide to db4o, I'll address three features you expect and require from any data storage system, be it object-oriented, relational, or otherwise. Get ready to learn how db4o supports application security, distribution, and transactions.

Multiple client connections

The code I've written for the series so far has assumed that there will be just one client to the database. That is, just one logical connection will be created to the database, and all interactions will go through that logical connection. This is a perfectly reasonable assumption for a Swing or SWT application accessing a configuration database or a local storage system. But for a Web application, even one where all the storage is being done within the Web presentation layer, it's not so realistic.

Within the db4o system, it is trivial to open a second logical connection to the database, even when the database resides on the local disk. I only need to add a call to create an ObjectServer first, and obtain the ObjectContainer object from that ObjectServer. Having ObjectServer listen on port 0 tells it to run in "embedded" mode and that no actual TCP/IP port will be opened (or harmed) in the making of this next exploration test.

Listing 1. Embedded connections
@Test public void letsTryMultipleEmbeddedClientConnections()
    {
        ObjectServer server = Db4o.openServer("persons.data", 0);
        
        try
        {
            ObjectContainer client1 = server.openClient();
            Employee ted1 = (Employee)
                client1.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next();
            System.out.println("client1 found ted: " + ted1);
            
            ObjectContainer client2 = server.openClient();
            Employee ted2 = (Employee)
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next();
            System.out.println("client2 found ted: " + ted2);
            
            ted1.setTitle("Lord and Most High Guru");
            client1.set(ted1);
            System.out.println("set(ted1)");

            System.out.println("client1 found ted1: " +
                client1.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());

            System.out.println("client2 found ted2: " + 
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());
                
            client1.commit();
            System.out.println("client1.commit()");

            System.out.println("client1 found ted1: " +
                client1.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());

            System.out.println("client2 found ted2: " + 
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());
                
            client2.ext().refresh(ted2, 1);
            System.out.println("After client2.refresh()");
            System.out.println("client2 found ted2: " + 
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());
            
            client1.close();
            client2.close();
        }
        finally
        {
            server.close();
        }
    }

Refreshing the object view

Note that when I run the test I am careful to put a few output lines in the middle of the test. It's important to track what's happening because, as you know, the db4o system keeps references to objects it has already opened along the way. I want to make sure I know when and where updates to the objects in memory are "passed through" to the second client.

Case in point: when I do the set(ted1) call, the db4o system modifies its internal state to know that the ted1 object is dirty and needs to be updated. It doesn't do the actual update until the implicit transaction is committed, however, using the commit() method on the ObjectContainer. At this point, the data is written to disk, but client2's view of the objects in memory remains stale compared to what's on disk. (Take a glance at the output from the exploration test for a demonstration. You have been running the code in a console window alongside the browser while reading this series, right?)

The fix is simple: client2 refreshes its view of the object graph in memory, using the refresh() method on the extension object (which is returned by ext()). Note that the issue of activation depth becomes a factor here: How deeply into the object graph do you want db4o to descend while refreshing objects? In this case, a single step down is sufficient to retrieve the modified Employee, but clearly this decision should be revisited on a case-by-case basis.

Once its view of the object has been refreshed, client2 sees the changes. A quick query reveals the new title for the company's (apparently rather egotistical) leader.
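The commit-then-refresh behavior can be modeled without db4o at all. The following is a toy sketch, not db4o code: Store, Client, and their methods are hypothetical names invented for illustration, and the caching rules are a simplified assumption about how a per-client object cache behaves.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these types are db4o classes.
class StaleViewDemo {
    /** Stands in for the committed state of the database file. */
    static class Store {
        private final Map<String, String> committed = new HashMap<>();

        Client openClient() { return new Client(this); }
    }

    /** Stands in for one logical connection with its own in-memory view. */
    static class Client {
        private final Store store;
        private final Map<String, String> cache = new HashMap<>();   // values this client has already loaded
        private final Map<String, String> pending = new HashMap<>(); // set() but not yet committed

        Client(Store store) { this.store = store; }

        String get(String key) {
            if (pending.containsKey(key)) return pending.get(key);
            // The first access loads from the store; later accesses reuse the cached value.
            return cache.computeIfAbsent(key, store.committed::get);
        }

        void set(String key, String value) { pending.put(key, value); }

        void commit() {
            store.committed.putAll(pending); // changes become visible to the store
            cache.putAll(pending);
            pending.clear();
        }

        void refresh(String key) {
            cache.put(key, store.committed.get(key)); // discard the stale cached value
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        Client seed = store.openClient();
        seed.set("ted.title", "Principal");
        seed.commit();

        Client client1 = store.openClient();
        Client client2 = store.openClient();
        client1.get("ted.title");
        client2.get("ted.title"); // both clients now cache "Principal"

        client1.set("ted.title", "Lord and Most High Guru");
        client1.commit();
        // prints "client2 before refresh: Principal"
        System.out.println("client2 before refresh: " + client2.get("ted.title"));
        client2.refresh("ted.title");
        // prints "client2 after refresh:  Lord and Most High Guru"
        System.out.println("client2 after refresh:  " + client2.get("ted.title"));
    }
}
```

The point of the model is the ordering: a commit makes the change durable, but each client keeps serving its own cached copy until it explicitly asks for a refresh.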


It will end in tiers

Most of the time, you won't have multiple clients living inside a single process, but across multiple processes. For instance, you might have a cluster of clients inside of a servlet container talking to a single server, in classic client-server style. Making this happen in db4o is almost identical to what you saw in Listing 1. The single trivial difference is that you need a non-zero port number to open the server. The port number signifies the TCP/IP port the server is listening on. As is true of all TCP/IP-based communication, clients must specify the host name and port when connecting.


Securing the fortress

Naturally, once a port is involved, security becomes an issue because you can't allow "just anybody" to connect up to the server and start issuing queries. In traditional RDBMS implementations, the vendor provides a rich and powerful security model, granting access to some or all parts of the database instance based on the credentials (username and password) sent to the database when the connection is opened.

The db4o implementation is no different, at least not in effect. The way the db4o database instance creator sets up the granted security policy is a startling change from the usual RDBMS scenario, however (shown in Listing 2):

Listing 2. Communicates well with others
@Test public void letsTryMultipleNetworkClientConnections()
    {
        ObjectServer server = Db4o.openServer("persons.data", 2771);
        server.grantAccess("client1", "password");
        server.grantAccess("client2", "password");
            // Yes, "password" is a bad password. Don't do this in production
            // code. I get to do it only because I have a special Pedagogical
            // Code License from Sun Microsystems. And you don't. So don't do
            // this. Don't make me come over there. I'm serious. 
            // Fuggedaboutit. Never. Not ever. Capice?
        
        try
        {
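            // Network clients connect through the static Db4o.openClient();
            // ObjectServer.openClient() opens an in-process client and takes no host or port.
            ObjectContainer client1 = 
                Db4o.openClient("localhost", 2771, "client1", "password");
            
            ObjectContainer client2 = 
                Db4o.openClient("localhost", 2771, "client2", "password");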
                
            Employee ted1 = (Employee)
                client1.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next();
            System.out.println("client1 found ted: " + ted1);
                
            Employee ted2 = (Employee)
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next();
            System.out.println("client2 found ted: " + ted2);
            
            ted1.setTitle("Lord and Most High Guru");
            client1.set(ted1);
            System.out.println("set(ted1)");

            System.out.println("client1 found ted1: " +
                client1.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());

            System.out.println("client2 found ted2: " + 
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());
                
            client1.commit();
            System.out.println("client1.commit()");

            System.out.println("client1 found ted1: " +
                client1.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());

            System.out.println("client2 found ted2: " + 
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());
                
            client2.ext().refresh(ted2, 1);
            System.out.println("After client2.refresh()");
            System.out.println("client2 found ted2: " + 
                client2.get(
                    new Employee("Ted", "Neward", null, null, 0, null))
                .next());
            
            client1.close();
            client2.close();
        }
        finally
        {
            server.close();
        }
    }

Defining access control in db4o consists of nothing more than the grantAccess() method, and it grants access to the entire database instance. This is both a good and a bad thing because it simplifies the security setup scenario but also leaves you unable to practice the Principle of Least Privilege.

The Principle of Least Privilege

The Principle of Least Privilege, like so many security ideas, is a simple one in theory but sometimes tricky to implement in practice. The principle states that a given user or body of code should be given only as much privilege as necessary to carry out its assigned tasks, nothing more (and obviously nothing less). So, for example, in an RDBMS scenario, the code that accesses the RDBMS should have only the basic SELECT/INSERT/UPDATE/DELETE permissions on only those tables it accesses, and nothing more. That way, if the code falls prey to an SQL-injection attack, the attacker can do only what those restricted permissions allow, which limits the damage.

Addressing this at a more granular level currently isn't an option in db4o, though other OODBMSs may be more flexible. For now, you should ensure that only the minimum number of security credentials is used. If you can't restrict the resources a login gets to use, then you can at least restrict the number of principals accessing the system.

Encryption formats

Security concerns in distributed systems, most notably the Fourth Fallacy of Enterprise Computing (see Effective Enterprise Java in Resources), mean you cannot assume that the only ones listening to the traffic on the network are trusted individuals or code. You therefore have to ensure that the data traveling across the network isn't being sent in clear text or in a well-known binary form (a binary format that anyone can decode is no better than clear text).

Developers concerned about security will also be concerned about the data stored in the db4o file because said file is "just a file": anyone who can open it can read its contents. (This same concern exists for relational databases!)

The solution is to encrypt the file, which in db4o is relatively straightforward. For most scenarios, db4o's default encryption scheme, the eXtended Tiny Encryption Algorithm (XTEA), does a passable job of obfuscating data from the casual attacker. For other instances, db4o provides a custom encryption "hook" that enables you to make use of third-party encryption providers. (Don't define your own encryption format unless you can write a paper challenging the established standards, have it workshopped by a collection of the world's finest cryptographers and mathematicians, and defend it in oral review at a major security conference. Once you've done all that, you might think about using it; even then, just because all of the above hasn't revealed a flaw doesn't mean one doesn't exist.)
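As a rough illustration of leaning on an established provider rather than inventing a format, here is a minimal sketch that seals and opens a byte array with AES-GCM from the JDK's standard javax.crypto API. The class and method names are hypothetical, and how such a routine would be wired into db4o's encryption hook is not shown here; that part depends on the db4o version and is left as an assumption.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Sketch only: shows a vetted JDK cipher, not db4o's encryption hook.
class BlockEncryptionSketch {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a fresh random IV followed by the ciphertext and tag. */
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(plain);
            byte[] out = new byte[iv.length + sealed.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(sealed, 0, out, iv.length, sealed.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(); fails if the data was modified or the key is wrong. */
    static byte[] decrypt(SecretKey key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, data, 0, IV_BYTES));
            return cipher.doFinal(data, IV_BYTES, data.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] sealed = encrypt(key, "Ted Neward".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, sealed), StandardCharsets.UTF_8));
    }
}
```

In real use the key would come from a key store or similar managed source rather than being generated on the spot, since the same key has to be available every time the data is read.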

You can't see me — or my data!

Protecting data in transit is decidedly trickier because db4o up through 6.3 has no facility for communicating across a secured transport (like SSL via an SSLSocket). This means that any sensitive data must be encrypted before it is transmitted, which implies some form of encryption inside the object itself. (It would be nicer if the db4o system implemented it directly; as of this writing, the db4o 6.4 release has plans to support passing in a SocketFactory to the openServer call, so you can use SSLSocket connections instead of wide-open Socket connections.)

Note that you can secure data across the transport in a "cross-cutting" fashion using custom marshallers. As the name implies, a custom marshaller gives you control over how data is packed (and unpacked) for its trip across the wire. Doing so looks remarkably similar to custom serialization via the Externalizable interface: write a class that implements the ObjectMarshaller interface, implement its readFields() and writeFields() methods, then tell the db4o system to use the custom marshaller for a particular class of objects by calling marshallWith() on the ObjectClass of the desired class. Registering the marshaller takes a single line:

Db4o.configure().objectClass(Item.class).marshallWith(customMarshaller);

Doing this will not secure the entire line — attackers may still be able to see the kinds of objects being preserved — but it will prevent the data from being seen as it crosses from node to node in the network.
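To make the pack-and-unpack idea concrete, here is a minimal sketch of a marshaller that writes two fields into a fixed-length byte slot and reads them back. FieldMarshaller is a hypothetical stand-in interface defined in the sketch itself, modeled loosely on the readFields()/writeFields() shape described above; it is not db4o's ObjectMarshaller, and the Item fields are invented for illustration. A marshaller protecting sensitive fields would encrypt the bytes at the write step (and decrypt at the read step) with a vetted cipher; that part is omitted here.

```java
import java.nio.ByteBuffer;

// Sketch only: FieldMarshaller is a hypothetical interface, not a db4o type.
class MarshallerSketch {
    interface FieldMarshaller<T> {
        void writeFields(T obj, byte[] slot, int offset); // pack fields into the slot
        void readFields(T obj, byte[] slot, int offset);  // unpack fields from the slot
        int marshalledFieldLength();                      // bytes reserved per object
    }

    /** Hypothetical persistent class with two fields to marshal. */
    static class Item {
        int quantity;
        long priceInCents;
    }

    static class ItemMarshaller implements FieldMarshaller<Item> {
        public void writeFields(Item item, byte[] slot, int offset) {
            // Relative puts start at 'offset' and advance through the slot.
            ByteBuffer.wrap(slot, offset, marshalledFieldLength())
                .putInt(item.quantity)
                .putLong(item.priceInCents);
        }

        public void readFields(Item item, byte[] slot, int offset) {
            // Fields must be read in the same order they were written.
            ByteBuffer buffer = ByteBuffer.wrap(slot, offset, marshalledFieldLength());
            item.quantity = buffer.getInt();
            item.priceInCents = buffer.getLong();
        }

        public int marshalledFieldLength() {
            return Integer.BYTES + Long.BYTES;
        }
    }

    public static void main(String[] args) {
        ItemMarshaller marshaller = new ItemMarshaller();
        Item original = new Item();
        original.quantity = 3;
        original.priceInCents = 1999L;

        byte[] slot = new byte[marshaller.marshalledFieldLength()];
        marshaller.writeFields(original, slot, 0);

        Item copy = new Item();
        marshaller.readFields(copy, slot, 0);
        System.out.println(copy.quantity + " @ " + copy.priceInCents);
    }
}
```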

For those scenarios where the db4o choice of "embedded versus client/server" is insufficient to meet your needs, such as storing the data to a particular file format or non-traditional data storage resource, the db4o library allows you to create a subclass of IoAdapter, which is the key abstraction to which db4o writes its data when storing objects. This permits a degree of flexibility in storage not seen in most RDBMSs.


All good things ...

There remains a great deal more to be said and explored about the OODBMS and db4o, but I've done what I set out to do and have decided to end this series for now. I believe I've made a good case for db4o and object-oriented data management and introduced all the features that make an OODBMS different from an RDBMS from a Java developer's perspective. I've shown how db4o can easily track associations, how it captures inheritance as a first-class concept within the database, and how it simplifies the retrieval of those objects using the native programming language that defined those objects.

I've also tried to point out the shortcomings of the OODBMS and where db4o runs into the same concerns an RDBMS would, such as the performance challenges associated with handling "round-trips" across a client-server network.

If you've been following the examples and experimenting with the code in this series, you have built the basic skill set necessary for using any OODBMS, not just db4o. It might be a good exercise to try applying what you've learned to Caché or Versant. Most OODBMSs follow the same basic coding conventions and idiomatic expressions, and in fact db4o's support for native queries has spawned an effort to standardize that feature as part of all OODBMSs.

Hopefully you found what you were looking for in this series, and you're ready to start using an OODBMS in your own projects. It can be a remarkably liberating experience, not having to worry about relational schema and everything associated with them. So, loosen up a bit: experiment, implement, have fun while you're at it, and don't forget to drop me a postcard with your experiences. (Oh, fine, be that way, e-mail is okay, too.)

Resources

Learn

Get products and technologies

  • Download db4o: An open source object database native to Java and .NET.
