Java development 2.0: Play-ing with Amazon RDS

Relational data management as a service? Why not?

Amazon's Relational Database Service (RDS) offloads the work of maintaining a database to Amazon Web Services, which makes it exceptionally easy to increase or swap out your application's data storage. This month, Andrew Glover revisits his location-based cloud-to-mobile application, swapping the original NoSQL datastore for a traditional RDBMS. It's a breeze using the Play framework and the AWS console.

Share:

Andrew Glover, Author and developer, Beacon50

Andrew GloverAndrew Glover is a developer, author, speaker, and entrepreneur with a passion for behavior-driven development, Continuous Integration, and Agile software development. He is the founder of the easyb Behavior-Driven Development (BDD) framework and is the co-author of three books: Continuous Integration, Groovy in Action, and Java Testing Patterns. You can keep up with him at his blog and by following him on Twitter.



19 July 2011

Also available in Chinese Russian Japanese Portuguese

About this series

The Java development landscape has changed radically since Java technology first emerged. Thanks to mature open source frameworks and reliable for-rent deployment infrastructures, it's now possible to assemble, test, run, and maintain Java applications quickly and inexpensively. In this series, Andrew Glover explores the spectrum of technologies and tools that make this new Java development paradigm possible.

A couple of months ago in this series, I gave you a hands-on introduction to Amazon’s Elastic Beanstalk, a platform-as-a-service (PaaS) made for Java application development. As I demonstrated, Beanstalk is extremely versatile, permitting developers to use almost any combination of tools in order to get the job done. In addition to doing rapid web development with Play, I was able to use another PaaS, MongoHQ, to manage my MongoDB instances. Combining the two PaaS infrastructures meant that most of the application's maintenance was handled for me, so I could focus on building a terrific cloud-to-mobile app.

The MongoHQ PaaS worked out great for that project, but what if I preferred to store my data in a traditional RDBMS? After all, most Java developers are more comfortable with coding to a relational database. It's also true that not every project can afford to forego ACID, as many NoSQL datastores require.

DB2 pureScale

Scaling relational databases on-demand is by no means new. IBM's DB2 pureScale, which in many ways is like RDS in that you can auto-scale distributed instances of DB2 while receiving a high degree of reliability of your data. pureScale enables rapid data growth through unlimited capacity, continuous availability, and application transparency (that is, no changes are required to your application). Learn more about pureScale in Resources.

This month, I'll get you started with Amazon Relational Database Service (RDS), another excellent and versatile addition to the Amazon Web Services family. Using Amazon RDS is about as easy as running MongoDB instances hosted by MongoHQ. Its relationalness isn't at all hard to manage, although it will present some interesting scaling issues. We'll just need to spend a few cycles up front defining a schema, and then we'll be good to go.

Amazon RDS

Amazon Relational Database Service is a PaaS that offers an on-demand, cloud-based, scalable MySQL instance for application development. If RDS was just an instance of MySQL running in the cloud, however, I wouldn't be writing about it here. After all, I explained how to leverage an Amazon EC2 image running MySQL back in 2009 (see Resources).

Relational scaling

Scalability is one of the primary reasons the NoSQL movement has gained momentum. One reason for NoSQL's rise is that traditional relational systems can make it difficult to scale horizontally across nodes, although efforts have been made using techniques like sharding. It's not that you can't distribute data across nodes in a relational system, it's just that doing so adds complexity and/or cost. Scaling in RDS, a relational database system, involves copying the entire database across nodes, rather than sections of it. This in turn can lead to issues with data redundancy. (See Resources for more about NoSQL datastores and RDBMS sharding.)

What makes RDS worth a look is that it is PaaS managed and operated by Amazon, offering much of the same service and flexibility found in Elastic Beanstalk. Amazon provides backups, replication, and even patches. What’s more, RDS is fully scalable — you can increase your application's storage capacity with a few clicks. Amazon also allows you to replicate RDS across availability zones, so if one zone goes down or there is a maintenance window scheduled in a zone, you still can serve data. You can even provision read-only instances of your database, ensuring increased read speed for high-volume applications or periods.

Applications already built to work with MySQL can instantly take advantage of RDS, so while the database is in the clouds, nothing else about your application will have to change. Finally, much like everything else with AWS, RDS is a pay-as-you-go model. There is no up-front cost for hardware or licenses. You pay for capacity, storage, and bandwidth as you use them.


Set up and configure RDS

AWS lets you provision various services either via the command line or via the AWS management console. I tend to use the console because it specifies the options available for various aspects of RDS. So, we'll start by selecting the RDS tab in the AWS management console and clicking the Launch Instance button.

Sign up to use RDS

Like all things in Amazon Web Services, you need to create an account in order to play. Once you've got that, you can sign up for Amazon RDS. Just remember that Amazon's services are all pay-as-you-go!

On your screen, you should see a dialog allowing you to specify MySQL database instance details like machine size, MySQL version, and how much storage is allocated to the database instance. Note that AWS has an option to deploy the database to multiple availability zones. Doing so would essentially create a cluster, so that if a particular zone goes down, others will ensure coverage.

As shown in Figure 1, you need to specify the database's schema name, admin user, and a password:

Figure 1. Set up the RDS instance
Configuring the MySQL database instance.

After you hit Continue, you are presented with another dialog that configures the database at a high-level — starting with its name, which plays a part in its JDBC URL. You can also change the port that MySQL will listen to and select the availability zone where the database will live, as in Figure 2:

Figure 2. Configure the RDS instance
A screen for selecting the database availability zones.

Next up are the management options related to how you will have AWS back up your data and when you can schedule maintenance, shown in Figure 3:

Figure 3. RDS management options
Launching and configuring database management via the AWS console.

After you click Continue and review your configuration, launch your RDS instance. This can take a few minutes to complete, so it's a good time to grab a coffee or catch up on Twitter. Once RDS is up and running, things will move quickly!

RDS security

Once your instance is live, you need to do one more thing before accessing it with your favorite SQL management tool. If you've worked with AWS before, it should come as no surprise that by default, things are somewhat locked down, so you need to explicitly allow access.

RDS's security constraints are fairly powerful, allowing you to specify a single IP or IP ranges that can communicate with RDS; however, for the purpose of this article, I'm going to make things simple and allow any IP to talk to my instance. I do this by going into the RDS DB Security Group pane and editing the CIDR/IP constraint to 0.0.0.0/0, which basically means all IPs are permitted. That step is shown in Figure 4:

Figure 4. RDS security settings
A screenshot of the RDS DB Security Group pane on AWS.

Once you've completed this step, reboot your instance of RDS. (Right-click on the instance in your AWS Management Dashboard, and you can see the option to reboot.)

If you select your running RDS instance, you'll see a Description pane (shown in Figure 5) that provides a few important details. Of most interest right now is the endpoint, which is the URL you'll use to connect to your MySQL instance.

Figure 5. The RDS dashboard
A screenshot of the RDS dashboard on the AWS console.

Now you can take your favorite database management tool (or use the command-line if you prefer) and point it at your RDS instance. I'm going to use Sequel Pro, which like most GUI-oriented tools provides a nifty interface to view tables and data, and provides a query console. To connect, you need to know your database's user name, password, and endpoint URL.


Magnus, RDBMS style

If you read my introduction to Amazon's Elastic Beanstalk (see Resources), you're familiar with Magnus, the cloud-to-mobile app I built for that article. For Magnus, I created two collections in MongoDB: an Account and a Location. (Note, though, that I could just as easily have had one collection dubbed Account with each document containing an embedded document of Locations.) Having two collections enabled me to persist account locations, which in my application scenario would have been coming in from mobile devices around the world.

I'll model the same relationship to demo Amazon RDS, but this time I'll do it the old fashioned way. An application like Magnus, with frequent location updates spread across the globe, could benefit from switching to RDS, especially given that RDS enables clustering in numerous availability zones. Remember, there are other techniques as well in RDBMS land, such as sharding, which breaks up data into logical partitions; consequently, you could shard accounts and locations by geographic location as well. Regardless, clustering, replication, and sharding each have pros and cons that should be weighed heavily before embarking on an individual strategy.

In order to introduce Magnus to SQL-land, I need to first define my relational tables, which (predictably) I'll call account and location. That's shown in Listing 1:

Listing 1. An account table
CREATE TABLE `account` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(30) DEFAULT NULL,
  PRIMARY KEY (`id`)
)

In my location table in Listing 2, I provide a foreign key back to account. In this way, I set up a one-to-many relationship between accounts and their various locations.

Listing 2. A location table
CREATE TABLE `location` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) DEFAULT NULL,
  `latitude` double DEFAULT NULL,
  `longitude` double DEFAULT NULL,
  `date_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `account_id` (`account_id`),
  FOREIGN KEY (`account_id`) REFERENCES `account` (`id`)
)

Once the tables are defined, I need to create them in my RDS instance. Creating tables in an RDS instance is just like creating them in a MySQL instance running on your local machine. Simply open up a terminal or point your favorite GUI at your RDS endpoint and create the tables. It's that easy.


Configuring Play

The Play web application framework (see Resources) works just as well in Amazon RDS as in Elastic Beanstalk, so I'll again use it to develop my application. Play makes it pretty easy to switch in and out of various datastores, and it supports JPA out of the box. Switching from the MongoHQ implementation to one that uses Amazon RDS requires just a few changes to my model. For one thing, Play's built-in JPA support means that I need to alter Magnus's application.conf file and point things at my RDS instance instead, as shown in Listing 3:

Listing 3. A location table
db.url=jdbc:mysql://magnus.cp3pl5vineyp.us-east-1.rds.amazonaws.com/magnus_locations
db.driver=com.mysql.jdbc.Driver
db.user=admin
db.pass=g3tf0kl

Nothing special to see here — Listing 3 is 100 percent JDBC!

Modeling relationships

As with pretty much all RDBMS ORM libraries, I'm going to model my tables in a one-to-one relationship with top-level Java objects: one Account type and one Location type. My Location will link back to Account.

Extending from Play's Model type gives me a few built-in freebies. For one thing, I get finder-like methods, as well as the typical CRUD methods save and delete.

My Account class leverages two JPA annotations. The Table annotation shown in Listing 4 is required to point to my renamed (lowercased) account table.

Listing 4. An Account type
package models;

import play.db.jpa.Model;
import javax.persistence.Entity;
import javax.persistence.Table;

@Entity
@Table(name = "account") //required otherwise table is Account
public class Account extends Model {
 public String name;
}

My Location POJO, up next in Listing 5, is slightly more involved. I need to add two additional annotations to it, to create a many-to-one relationship between Locations and Accounts.

Listing 5. A Location class defined with JPA
package models;

import play.db.jpa.Model;
import javax.persistence.*;
import java.math.BigDecimal;
import java.sql.Timestamp;

@Entity
@Table(name = "location")
public class Location extends Model {

 public BigDecimal latitude;
 public BigDecimal longitude;
 public String name;
 @Column(name = "date_time")
 public Timestamp timestamp;

 @ManyToOne
 @JoinColumn(name="account_id", nullable = false)
 public Account account;

}

Finally, in Listing 6 I update the saveLocation method of my Application controller. (Remember that in the previous Magnus implementation, I routed all HTTP PUTs to /location/ to this method.) saveLocation simply creates a new Location object (and therefore a corresponding record) and links this Location instance to an existing Account. Because I extended Play's Model object earlier, I also get a handy findById method.

Listing 6. Location-based updates with RDS
public static void saveLocation(String id, JsonObject body) throws Exception {
 String eventname = body.getAsJsonPrimitive("name").getAsString();
 double latitude = body.getAsJsonPrimitive("latitude").getAsDouble();
 double longitude = body.getAsJsonPrimitive("longitude").getAsDouble();

 Location loc = new Location();
 loc.longitude = new BigDecimal(longitude);
 loc.latitude = new BigDecimal(latitude);
 loc.name = eventname;
 loc.account = Account.findById(new Long(id));

 loc.save();

 renderJSON(getSuccessMessage());
}

Testing with RESTClient

I can use RESTClient to find out whether my updated location service is working. Just as I did with Magnus, I'll fashion some JSON documents and send them along.

Figure 6. Testing with RESTClient
A screenshot of the RESTClient interface.

When my method successfully persists a new Location record, a JSON response is sent back indicating success. Because my handy-dandy GUI lets me view the data in an RDBMS, I just do a check of my location table, and what do you know? I see new Magnus records sitting in the cloud via RDS!

Figure 7. Oh, the places you'll go!
Magnus in the cloud, as displayed on the AWS console.

In conclusion

PaaS is a friend to software development teams looking to rapidly develop and deploy web applications. In this article, I've introduced Amazon RDS, a PaaS solution that puts the relational database (in this case MySQL, though RDS supports the Oracle Database too) in the cloud. Amazon RDS is super easy to provision and acts no different from the various RDBMS systems you've probably built over the years. The important difference is that Amazon handles maintenance for you.

Resources

Learn

Get products and technologies

Discuss

  • Get involved in the developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Java technology on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Java technology, Cloud computing
ArticleID=733279
ArticleTitle=Java development 2.0: Play-ing with Amazon RDS
publish-date=07192011