Java development 2.0: Cloud storage with Amazon's SimpleDB, Part 2

Plain old object persistence with SimpleJPA

Modeling domain objects for almost any type of application is a breeze using a relational framework like Grails, but what about SimpleDB? In this second half of his introduction to SimpleDB, Andrew Glover shows you how to use SimpleJPA, rather than the Amazon SDK, to persist objects in SimpleDB's cloud storage. In addition to letting you use plain old Java™ objects for domain modeling (a la JPA), SimpleJPA automatically converts primitive data types into Amazon-friendly strings. You really couldn't ask for a much simpler approach to cloud storage.

Andrew Glover, Author and developer, Beacon50

Andrew Glover is a developer, author, speaker, and entrepreneur with a passion for behavior-driven development, Continuous Integration, and Agile software development. He is the founder of the easyb Behavior-Driven Development (BDD) framework and is the co-author of three books: Continuous Integration, Groovy in Action, and Java Testing Patterns. You can keep up with Andrew by reading his blog and by following him on Twitter.



03 August 2010

About this series

The Java development landscape has changed radically since Java technology first emerged. Thanks to mature open source frameworks and reliable for-rent deployment infrastructures, it's now possible to assemble, test, run, and maintain Java applications quickly and inexpensively. In this series, Andrew Glover explores the spectrum of technologies and tools that make this new Java development paradigm possible.

In the first half of this introduction to SimpleDB, I showed you how to leverage Amazon's own API to model a CRUD-style racing application. Aside from the obvious uniqueness for most Java developers of Amazon's string-only approach to data types, you might have found yourself looking at the Amazon API with some skepticism. After all, the APIs for leveraging a relational database are by now fairly standard and well-thought-out — and perhaps more important, they are familiar.

Behind the scenes, many relational frameworks today implement the Java Persistence API. This makes modeling domain objects for almost any type of Java application both easy and familiar across the range of RDBMSs. It's natural to be resistant to learning a new approach to domain modeling when you've already mastered one that works — and the good news is that with SimpleDB, you don't have to.

In this second half of my introduction to SimpleDB, I'll show you how to refactor the racing application from Part 1 to be compliant with the JPA specification. Then we'll port the application to SimpleJPA and discover some of the ways that this innovative, open source platform can make adapting to NoSQL domain modeling, and cloud-based storage, a little easier.

Why SimpleDB?

Amazon's SimpleDB is a simple, yet massively scalable and reliable cloud-based datastore. Due to its non-relational/NoSQL foundation, SimpleDB is both flexible and lightning fast. As part of the Amazon Web Services family, SimpleDB uses HTTP as its underlying communication mechanism, so it's able to support language bindings ranging from the Java language to Ruby, C#, and Perl. SimpleDB is also inexpensive: Under SimpleDB's licensing, you pay only for the resources you consume, which differs from the more traditional method of buying a license up front for predicted usage and space. As part of the emerging class of NoSQL, or non-relational, datastores, SimpleDB is related to Google's Bigtable and CouchDB, both also profiled in this series.

Hibernate and JPA: A brief history

Numerous Java developers today leverage Hibernate (and Spring) for data persistence. In addition to being a bellwether of open source success, Hibernate has changed the field of ORM for good. Before we had Hibernate, Java developers had to deal with the quagmire of EJB entity beans; before that, we basically rolled our own ORMs or bought one from a vendor like IBM®. Hibernate washed away all that complexity and cost in favor of the POJO-based modeling platform many of us take for granted today.

The Java Persistence API (JPA) was created in response to the popularity of Hibernate's innovation of leveraging POJOs for data modeling. Today, EJB 3.0 implements JPA, and so does Google App Engine. Even Hibernate itself is a JPA implementation, assuming you use the Hibernate EntityManager.

Given how comfortable Java developers have become with modeling data-centric applications using POJOs, it makes sense that a datastore like SimpleDB should give us a similar option. After all, it's kind of like a database, isn't it?


Data modeling with objects

In order to use SimpleJPA, we need to do a little work on our Racer and Runner objects, bringing them up to speed with the JPA specification. Fortunately, the basics of JPA are pretty simple: you decorate normal POJOs with annotations and an EntityManager implementation takes care of the rest — no XML required.

Two of the main annotations used by JPA are @Entity and @Id, which specify a POJO as persistent and delineate its identity key, respectively. For the purpose of converting our racing application to JPA, we'll also need two annotations used for managing relationships: @OneToMany and @ManyToOne.

In the first half of this article, I showed you how to persist runners and races. I never made use of any objects to represent those entities, however — I just used Amazon's raw API to persist the properties of both. If I wanted to model a simple relationship between a race and its runners, I could do so as shown in Listing 1:

Listing 1. A simple Race object
public class Race {
 private String name;
 private String location;
 private double distance;
 private List<Runner> runners;
	
 //setters and getters left out...
}

In Listing 1, I've specified a Race object with four properties, the last of which is a Collection of runners. Next, I can create a simple Runner object (shown in Listing 2) that holds each runner's name (I'll keep it really simple for now) and SSN along with the Race instance she or he is racing in.

Listing 2. A simple Runner related to a Race
public class Runner  {
 private String name;
 private String ssn;
 private Race race;

 //setters and getters left out...
}

As you can see in Listings 1 and 2, I've logically modeled a many-to-one relationship between runners and a race. In a real-world situation, it would probably be more appropriate to make the link many-to-many (don't runners usually run more than one race?), but I'm keeping it easy. I've also left out the constructors, setters, and getters for now. I'll show them to you later.

Annotations in JPA

Getting these two objects ready for SimpleJPA isn't terribly challenging. First, I have to signify my intent to make them persistable by adding the @Entity annotation to each one. I also need to delineate the relationships properly using @OneToMany in the Race object and @ManyToOne in the Runner object.

The @Entity annotation is attached at the class level and the relationship annotations are attached at the getter level. All of this is demonstrated in Listings 3 and 4:

Listing 3. A JPA-annotated Race
@Entity
public class Race {
 private String name;
 private String location;
 private double distance;
 private List<Runner> runners;

 @OneToMany(mappedBy = "race")
 public List<Runner> getRunners() {
  return runners;
 }

 //other setters and getters left out...
}

In Listing 3, I've decorated the getRunners method with a @OneToMany annotation. I've also specified that the relationship can be found using the race property on the entity Runner.

In Listing 4, I'll similarly annotate the getRace method in the Runner object.

Listing 4. A JPA-annotated Runner
@Entity
public class Runner  {
 private String name;
 private String ssn;
 private Race race;

 @ManyToOne
 public Race getRace() {
  return race;
 }

 //other setters and getters left out...
}

Most datastores (relational or not) need some way of delineating uniqueness among data. So if I want to make these two objects persistent in a datastore, I at least have to add IDs to them. In Listing 5, I've added an id property of type BigInteger to the Race domain object. I will also do the same for Runner.

Listing 5. Adding an ID to Race
@Entity
public class Race {
 private String name;
 private String location;
 private double distance;
 private List<Runner> runners;
 private BigInteger id;

 @Id
 public BigInteger getId() {
  return id;
 }

 @OneToMany(mappedBy = "race")
 public List<Runner> getRunners() {
  return runners;
 }

 //other setters and getters left out...
}

The @Id annotation in Listing 5 doesn't provide any information about how the ID is managed. The JPA implementation will therefore assume I'm assigning it manually, rather than asking the persistence provider to generate it (with @GeneratedValue, for example).


Enter SimpleJPA

So far, I haven't done anything specific to SimpleDB. The Race and Runner objects are generically annotated with JPA annotations and could be persisted in any datastore that is supported by a JPA implementation. Options include Oracle, DB2, MySQL, and (as you've probably guessed by now) SimpleDB.

SimpleJPA is an open source implementation of JPA for Amazon's SimpleDB. While it doesn't support the entire JPA specification (for example, you can't join in JPA queries), it supports a large enough subset to be worth exploring.

A big advantage of using SimpleJPA is that it attempts to seamlessly handle the lexicographic issues I discussed in the first half of this article. SimpleJPA does the string conversion and any subsequent padding (if required) for objects that rely on numeric types. For the most part, this means that you don't have to change your domain model to reflect String types. (There is one exception to that rule, which I'll explain in a moment.)
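To see why padding matters at all, recall that SimpleDB compares every value as a string. The following is a minimal JDK-only sketch of the problem and of the usual zero-padding fix. The class name PaddingDemo and the pad helper are illustrative assumptions and are not part of SimpleJPA's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustrative only: shows why numbers stored as strings need padding.
// PaddingDemo and pad() are hypothetical names, not SimpleJPA classes.
final class PaddingDemo {

    // Left-pads a non-negative integer with zeros to a fixed width.
    static String pad(int value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // Returns a copy of the input sorted by plain String ordering.
    static List<String> sortAsStrings(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Unpadded: "10" and "26" sort before "5" because '1' and '2' precede '5'.
        System.out.println(sortAsStrings(Arrays.asList("5", "10", "26")));
        // Padded to four digits: string order now agrees with numeric order.
        System.out.println(sortAsStrings(Arrays.asList(pad(5, 4), pad(10, 4), pad(26, 4))));
    }
}
```

With SimpleJPA you never write this kind of helper yourself; the point of the sketch is only to show what the framework is sparing you.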

Because SimpleJPA is a JPA implementation, you can easily use JPA-compliant domain objects with it. SimpleJPA only requires that you use String IDs, meaning that your id property must be a java.lang.String. To make things easier, SimpleJPA provides the base class IdedTimestampedBase, which manages the domain object's ID property, as well as the date attributes created and updated. (Under the hood, SimpleDB generates a unique Id.)
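To make the idea of a String-keyed base class concrete, here is a rough stand-in written against the JDK only. It is not SimpleJPA's source code: the names StringIdBase, SampleEntity, and touch are hypothetical, and the use of a random UUID for the ID is an assumption made purely for illustration.

```java
import java.util.Date;
import java.util.UUID;

// Hypothetical stand-in for a base class in the spirit of IdedTimestampedBase.
// It only illustrates a String identity plus created/updated timestamps;
// the real class may be implemented quite differently.
abstract class StringIdBase {
    private String id;
    private Date created;
    private Date updated;

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public Date getCreated() { return created; }
    public Date getUpdated() { return updated; }

    // Assigns an ID on the first call and refreshes the update timestamp
    // on every call. (Assumption: IDs are random UUID strings.)
    public void touch() {
        Date now = new Date();
        if (id == null) {
            id = UUID.randomUUID().toString();
            created = now;
        }
        updated = now;
    }
}

// A domain class inherits the ID handling instead of declaring its own id.
final class SampleEntity extends StringIdBase {
    public static void main(String[] args) {
        SampleEntity entity = new SampleEntity();
        entity.touch();
        System.out.println(entity.getId());
    }
}
```

The design point is that identity and timestamps live in one place, so domain classes such as Race and Runner stay focused on their own properties.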


Porting the app to SimpleJPA

To make the Race and Runner classes compliant with SimpleJPA, I could either extend SimpleJPA's handy base class or change each class's id property from BigInteger to String. I've opted for the first option, as shown in Listing 6:

Listing 6. Changing Race to use SimpleJPA's IdedTimestampedBase base class
@Entity
public class Race extends IdedTimestampedBase{
 private String name;
 private String location;
 private double distance;
 private List<Runner> runners;

 @OneToMany(mappedBy = "race")
 public List<Runner> getRunners() {
  return runners;
 }

 //other setters and getters left out...
}

I won't show you the same code for Runner, but feel free to run through it on your own: just extend IdedTimestampedBase and remove the id property from the Runner.

Updating the IDs for Race and Runner is the first step of making the racing application compliant with SimpleJPA. Next, I need to exchange primitive datatypes (like double, int, and float) for objects like Integer and BigDecimal.

I'll start with the distance property of Race. I've found BigDecimal to be more reliable than Double (in the current release of SimpleJPA), so I changed Race's distance property to a BigDecimal, as shown in Listing 7:

Listing 7. Changing distance to BigDecimal
@Entity
public class Race extends IdedTimestampedBase{
 private String name;
 private String location;
 private BigDecimal distance;
 private List<Runner> runners;

 @OneToMany(mappedBy = "race")
 public List<Runner> getRunners() {
  return runners;
 }

 //other setters and getters left out...
}

Now both Runner and Race are ready to be persisted via a SimpleJPA implementation.


Using SimpleJPA with SimpleDB

Manipulating your domain objects against SimpleDB with SimpleJPA isn't any different from going against a normal relational database with a JPA implementation. If you've ever done any application development with JPA, then nothing about it should surprise you. The only thing that might be new is configuring SimpleJPA's EntityManagerFactoryImpl, which will require your Amazon Web Services credentials and the prefix name for your SimpleDB domain. (Another option would be to provide a properties file containing your credentials on the classpath.)

The prefix name you specify when creating an instance of SimpleJPA's EntityManagerFactoryImpl results in SimpleDB domains whose names start with your prefix, followed by a dash and then your domain object's name. So, if I specify "b50" for my prefix, then when I create a Race item in SimpleDB, the domain will be "b50-Race".
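The naming rule is simple enough to express in a line of Java. The helper below is a hypothetical illustration of the rule as described here; SimpleJPA derives the domain name internally, and its actual code may differ.

```java
// Hypothetical helper mirroring the prefix-dash-name rule described above.
// DomainNames and domainFor() are illustrative names, not SimpleJPA API.
final class DomainNames {

    // Placeholder entity type used only for the demonstration.
    static final class Race { }

    // Builds "<prefix>-<SimpleClassName>", for example "b50-Race".
    static String domainFor(String prefix, Class<?> entityType) {
        return prefix + "-" + entityType.getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(domainFor("b50", Race.class));
    }
}
```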

Once you've created an instance of SimpleJPA's EntityManagerFactoryImpl, everything else is driven by the standard JPA interfaces. You'll need an EntityManager instance, which you obtain from the EntityManagerFactoryImpl, as shown in Listing 8:

Listing 8. Obtaining an EntityManager
Map<String,String> props = new HashMap<String,String>();
props.put("accessKey","...");
props.put("secretKey","..");

EntityManagerFactoryImpl factory = 
  new EntityManagerFactoryImpl("b50", props);

EntityManager em = factory.createEntityManager();
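Listing 8 hard-codes the credentials into the map. If you would rather keep them out of your source, one option is to read them from key=value text and build the same kind of Map<String,String> yourself. The sketch below uses only the JDK; CredentialLoader is a hypothetical name, the values are placeholders, and a StringReader stands in for whatever file or stream you would really read from.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: turns key=value text into the Map<String,String>
// shape used in Listing 8. Not part of SimpleJPA.
final class CredentialLoader {

    // Reads properties from the given source and copies them into a map.
    static Map<String, String> load(Reader source) {
        Properties raw = new Properties();
        try {
            raw.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> props = new HashMap<>();
        for (String key : raw.stringPropertyNames()) {
            props.put(key, raw.getProperty(key));
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; a real application would wrap a file instead.
        String text = "accessKey=EXAMPLE_ACCESS\nsecretKey=EXAMPLE_SECRET\n";
        Map<String, String> props = load(new StringReader(text));
        System.out.println(props.keySet());
    }
}
```

The resulting map can then be handed to the factory constructor exactly as props is in Listing 8.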

Manipulating domain objects

Once you have a handle to an EntityManager, you can manipulate domain objects at will. For instance, I can create a Race instance like so:

Listing 9. Creating a Race
Race race = new Race();
race.setName("Charlottesville Marathon");
race.setLocation("Charlottesville, VA");
race.setDistance(new BigDecimal(26.2));
em.persist(race);

In Listing 9, SimpleJPA handles all the HTTP requests to create Race in the cloud. Using SimpleJPA means that I can also retrieve the race using a JPA query, as shown in Listing 10. (Remember that you cannot do joins with these queries, but you can still search with numbers.)

Listing 10. Finding a race by distance
Query query = em.createQuery("select o from Race o where o.distance = :dist");
query.setParameter("dist", new BigDecimal(26.2));
		
List<Race> races = query.getResultList();
for(Race race : races){
 System.out.println(race);
}
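One detail worth noticing is that Listings 9 and 10 both build the distance with new BigDecimal(26.2). That matters for an equality query, because the double constructor captures the exact binary value of the double, which is not the same number as the decimal literal 26.2. The JDK-only sketch below shows the difference; DistanceValues and matches are hypothetical names used for illustration.

```java
import java.math.BigDecimal;

// Illustrative only: compares different ways of constructing "26.2".
final class DistanceValues {

    // True when both values are numerically equal, ignoring scale.
    static boolean matches(BigDecimal stored, BigDecimal queried) {
        return stored.compareTo(queried) == 0;
    }

    public static void main(String[] args) {
        BigDecimal fromDouble = new BigDecimal(26.2);    // exact value of the double
        BigDecimal fromString = new BigDecimal("26.2");  // exactly 26.2
        BigDecimal viaValueOf = BigDecimal.valueOf(26.2); // goes through Double.toString

        System.out.println(matches(fromDouble, fromString)); // different numbers
        System.out.println(matches(viaValueOf, fromString)); // same number
        System.out.println(matches(fromDouble, new BigDecimal(26.2))); // same number
    }
}
```

Whichever construction you choose, using the same one when persisting and when querying keeps equality searches predictable.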

From numbers to strings

SimpleJPA's under-the-hood number-to-string magic is especially nice; for instance, if you enable query printing in SimpleJPA, you can see what queries it issues to SimpleDB. The query submitted is shown in Listing 11. Note how distance is encoded.

Listing 11. SimpleJPA handles numbers nicely!
amazonQuery: Domain=b50-Race, query=select * from `b50-Race` 
  where `distance` = '0922337203685477583419999999999999928946'

Automatic padding and encoding make things a lot easier, don't you think?
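If you are curious how a string like the one in Listing 11 could come about, the following sketch shows one offset-and-pad scheme. The offset (the negation of Long.MIN_VALUE) and the 20 integer plus 20 decimal digits are assumptions inferred from the output in Listing 11, not values taken from SimpleJPA's source, and SortableNumber is a hypothetical name.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Sketch of an offset-and-pad encoding. All constants are assumptions
// inferred from Listing 11's output, not read from SimpleJPA itself.
final class SortableNumber {
    // 9223372036854775808: shifts values in the long range to non-negative.
    private static final BigDecimal OFFSET = new BigDecimal(Long.MIN_VALUE).negate();
    private static final int INTEGER_DIGITS = 20;
    private static final int DECIMAL_DIGITS = 20;

    static String encode(BigDecimal value) {
        // Shift by the offset, then fix the number of decimal places.
        BigDecimal shifted = value.add(OFFSET).setScale(DECIMAL_DIGITS, RoundingMode.HALF_UP);
        if (shifted.signum() < 0) {
            throw new IllegalArgumentException("value is below the supported range");
        }
        // The unscaled value is the number with its decimal point removed.
        String digits = shifted.unscaledValue().toString();
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < INTEGER_DIGITS + DECIMAL_DIGITS; i++) {
            padded.append('0');
        }
        return padded.append(digits).toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new BigDecimal(26.2)));
        // A shorter distance encodes to a string that sorts first.
        System.out.println(encode(new BigDecimal("3.1")).compareTo(encode(new BigDecimal(26.2))) < 0);
    }
}
```

Under these assumptions, encoding new BigDecimal(26.2) yields a 40-character string that begins with 09223372036854775834, the same leading digits as in Listing 11. The run of 9s that follows in the listing reflects the fact that new BigDecimal(26.2) is slightly less than 26.2.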


Relationships in SimpleJPA

Even though SimpleDB doesn't permit domain joins in queries, you can still have related items across domains. Like I showed you in Part 1, you can simply store the key of a related object in another and then retrieve that object when you need it. That's what SimpleJPA does, too. For instance, earlier I showed you how to link Runners to a Race using JPA annotations. Thus, I can create an instance of a Runner, add the existing race instance to it, and then persist the Runner instance, as shown in Listing 12:

Listing 12. Relationships with SimpleJPA
Runner runner = new Runner();
runner.setName("Mark Smith");
runner.setSsn("555-55-5555");
runner.setRace(race);
race.addRunner(runner);
		
em.persist(runner);
em.persist(race); //update the race now that it has a runner

Note from Listing 12 that I have to update the Race instance to reflect the fact that I added a Runner instance to it. (I also added an addRunner method to Race that simply adds a Runner to the internal Collection of Runners.)
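The addRunner method isn't shown in the listings, so here is a minimal sketch of what it might look like. The JPA annotations and the other properties are left out so that the snippet compiles with the JDK alone, and the lazy creation of the list is my own assumption rather than something the original code is known to do.

```java
import java.util.ArrayList;
import java.util.List;

// Trimmed-down Runner: only what the example needs.
class Runner {
    private String name;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Trimmed-down Race showing one possible addRunner implementation.
class Race {
    private List<Runner> runners;

    public List<Runner> getRunners() { return runners; }
    public void setRunners(List<Runner> runners) { this.runners = runners; }

    // Adds a runner to the internal collection, creating it on first use.
    public void addRunner(Runner runner) {
        if (runners == null) {
            runners = new ArrayList<>();
        }
        runners.add(runner);
    }

    public static void main(String[] args) {
        Race race = new Race();
        Runner runner = new Runner();
        runner.setName("Mark Smith");
        race.addRunner(runner);
        System.out.println(race.getRunners().size());
    }
}
```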

Once again, if I search for a race by its distance, I can also get a listing of its runners, as shown in Listing 13:

Listing 13. More relationship fun!
Query query = em.createQuery("select o from Race o where o.distance = :dist");
query.setParameter("dist", new BigDecimal(26.2));
		
List<Race> races = query.getResultList();
		
for(Race race : races){
 System.out.println(race);
 List<Runner> runners = race.getRunners();
 for(Runner rnr : runners){
  System.out.println(rnr);
 }
}

Using an EntityManager instance enables me to delete entities via the remove method, shown in Listing 14:

Listing 14. Removing a class instance
Query query = em.createQuery("select o from Race o where o.distance = :dist");
query.setParameter("dist", new BigDecimal(26.2));
		
List<Race> races = query.getResultList();
		
for(Race race : races){
 em.remove(race);
}

While I've removed a Race instance in Listing 14, any related Runners aren't removed. (I can, of course, handle this by using JPA's EntityListeners annotation, which means I can hook into a removal event and use it to remove Runner instances.)


In conclusion

This whirlwind tour of SimpleDB has shown you how to manipulate objects in the non-relational datastore using both the Amazon Web Services API and SimpleJPA. SimpleJPA implements a subset of the Java Persistence API to make object persistence in SimpleDB easier. One of the conveniences of using SimpleJPA, as you've seen, is that it automatically converts primitive types to the string objects that SimpleDB recognizes. SimpleJPA also handles SimpleDB's no-join rules for you automatically, making it easier to model relationships. SimpleJPA's extensive listener interfaces also make it possible to implement logical data integrity rules, which you've probably come to expect from the relational world.

The bottom line on SimpleJPA is that it can help you access significant, inexpensive scalability quickly and easily. With SimpleJPA, you can leverage the knowledge you already have from years of working with frameworks like Hibernate in a non-relational, cloud-based storage environment.

Resources

Get products and technologies

  • SimpleJPA: A Java Persistence API (JPA) implementation for Amazon's SimpleDB. In other words, an object-relational mapping (ORM) framework for Amazon's database in the cloud.
