Java development 2.0: Redis for the real world

How Redis beats memcached in read-heavy applications

Redis has a lot in common with memcached but it boasts a richer set of features. In this month's Java development 2.0, Andrew experiments with adding Redis (by way of Java™-based variant Jedis) to his location-based mobile application. Learn how Redis works as a simple data store, then try repurposing it for ultra-fast, lightweight caching.

Andrew Glover, CTO, App47

Andrew Glover is a developer, author, speaker, and entrepreneur with a passion for behavior-driven development, Continuous Integration, and Agile software development. He is the founder of the easyb Behavior-Driven Development (BDD) framework and is the co-author of three books: Continuous Integration, Groovy in Action, and Java Testing Patterns. You can keep up with him at his blog and by following him on Twitter.



13 December 2011

About this series

The Java development landscape has changed radically since Java technology first emerged. Thanks to mature open source frameworks and reliable for-rent deployment infrastructures, it's now possible to assemble, test, run, and maintain Java applications quickly and inexpensively. In this series, Andrew Glover explores the spectrum of technologies and tools that make this new Java development paradigm possible.

I've discussed the concept of NoSQL before in this series and introduced a variety of NoSQL data stores that are compatible with the Java platform, including Google's Bigtable and Amazon’s SimpleDB. I've also discussed more conventional server-based data stores like MongoDB and CouchDB. Every data store has strengths and weaknesses, especially as applied to a particular domain scenario.

This month's Java development 2.0 spotlight is on Redis, a lightweight key-value data store. Most NoSQL implementations are essentially key-value, but Redis supports an unusually rich set of values, including strings, lists, sets, and hashes. As such, Redis is often labeled a data structure server. Redis also has a reputation for being exceptionally fast, which makes it an optimum choice for a certain class of use cases.

When trying to understand something new, it can be helpful to compare it to something you are already familiar with, so we'll start our exploration of Redis by considering its similarity to memcached. I'll then demonstrate key features of Redis that could give it an edge over memcached in some application scenarios. And finally, I'll show you how to use Redis as a traditional datastore for model objects.

Redis and memcached

Memcached is a well-known, in-memory object caching system that works by putting a target key and value into a memory cache. Memcached thus sidesteps the I/O cost that is incurred when a read hits the disk. Placing memcached between a web application and a database can yield better read performance. Memcached is, therefore, a good choice for applications that require fast data lookups. One example would be a stock look-up service that would otherwise hit a database for fairly static data, such as ticker-to-name or even pricing information.

MemcacheDB

Comparing Redis to memcached isn't exactly fair; far better to stack it up against MemcacheDB, which is a distributed key-value storage system designed for data persistence. MemcacheDB is pretty similar to Redis, and it has the added advantage of effortlessly communicating with client implementations of memcached.

But memcached has some limitations, including the fact that all its values are simple strings. Redis, as an alternative to memcached, supports a richer feature set. Some benchmarks also indicate that Redis is much faster than memcached. Redis's rich data types make it possible to store far more sophisticated data in memory than you could with memcached. And unlike memcached, Redis can persist its data.

Redis makes a great caching solution, but its rich feature set leads to other uses. Because Redis is capable of storing data on disk and replicating data across nodes, it can be leveraged as a data repository for traditional data models (that is, you can use Redis much like you would an RDBMS). Redis is also often employed as a queuing system. In this use case, Redis is the basis of a backing, persistent store of work queues that leverage Redis's list type. GitHub is one example of a large-scale infrastructure that uses Redis this way.


Get Redis and go!

To get started with Redis, you'll need access to it, which you can get via a local install or a hosted provider. If you're on a Mac, the install process couldn't be easier. If you're using Windows®, you'll need to have Cygwin installed. If you're looking at hosted providers, Redis4You has a free plan. Regardless of how you access Redis, you will be able to follow the examples later in the article. I should point out, though, that a hosted Redis provider might not be a great caching solution, because network latency could undo any performance gains.

You interact with Redis via commands; there is no SQL-like query language. Working with Redis is very much like working with a traditional map data structure: everything has a key and a value, and a value can be any one of several rich data types. Every data type also has its own set of commands. For instance, if you planned on using simple data types, say in some sort of caching scheme, you could use the commands set and get.

You can interact with an instance of Redis via a command-line shell. There also are multiple client implementations for programmatically working with Redis. Listing 1 shows a simple command-line shell interaction using basic commands:

Listing 1. Using basic Redis commands
redis 127.0.0.1:6379> set page registration
OK
redis 127.0.0.1:6379> keys *
1) "foo"
2) "page"
redis 127.0.0.1:6379> get page
"registration"

Here, I've associated the key "page" with the value "registration" via the set command. Next, I've issued the keys command (the trailing * signifies that I want to see all instance keys available). The keys command shows that there is a page key as well as a foo one — I can retrieve the value associated with a key via the get command. Keep in mind that the value retrieved from a get can only be a string. If a key's value is a list, for example, you must use a list-specific command to retrieve the list's elements. (Note that there are commands to query a value's type.)


Java integration with Jedis

For programmers wanting to integrate Redis into Java applications, the Redis team recommends a project called Jedis. Jedis is a lightweight library that maps native Redis commands to simple Java methods. For instance, Jedis lets me get and set simple values, as in Listing 2:

Listing 2. Basic Redis commands in Java code
JedisPool pool = new JedisPool(new JedisPoolConfig(), "localhost");
Jedis jedis = pool.getResource();

jedis.set("foo", "bar");
String foobar = jedis.get("foo");
assert foobar.equals("bar");

pool.returnResource(jedis);
pool.destroy();

In Listing 2, I configure a connection pool and grab a connection (much like you would in a typical JDBC scenario), which I then return at the bottom of the listing. Between the connection-pool calls, I set the value "bar" with the key "foo", which I then retrieve via the get command.

Similar to memcached, Redis allows you to associate an expiration time with a value. So I can set a value (say a stock's temporary trading price) that eventually will be purged from the Redis cache. If I want to set an expiration time in Jedis, I do it after issuing my set call, by calling expire on the same key, as shown in Listing 3:

Listing 3. A Redis value can be set to expire
jedis.set("gone", "daddy, gone");
jedis.expire("gone", 10);
String there = jedis.get("gone");
assert there.equals("daddy, gone");

Thread.sleep(10500); // wait longer than the 10-second expiry

String notThere = jedis.get("gone");
assert notThere == null;

In Listing 3, I've used an expire call to set the value of "gone" to expire in 10 seconds. After Thread.sleep has been invoked, a get for "gone" will return null.
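
The expiry behavior in Listing 3 can also be modeled without a running Redis server. The following is a minimal sketch, not Jedis code: ExpiringStore and its injectable clock are hypothetical names introduced here for illustration, and the class only imitates how set, expire, and get interact.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** In-memory stand-in (not Jedis) that imitates SET, EXPIRE, and GET with a time-to-live. */
final class ExpiringStore {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> deadlines = new HashMap<>(); // key -> expiry time in ms
    private final LongSupplier clock; // injectable, so no real sleeping is needed

    ExpiringStore(LongSupplier clock) {
        this.clock = clock;
    }

    void set(String key, String value) {
        values.put(key, value);
        deadlines.remove(key); // a plain SET clears any earlier time-to-live
    }

    /** Attaches a deadline to an existing key; returns false if the key is absent. */
    boolean expire(String key, long seconds) {
        if (get(key) == null) {
            return false;
        }
        deadlines.put(key, clock.getAsLong() + seconds * 1000L);
        return true;
    }

    /** Returns the value, or null once the deadline has passed. */
    String get(String key) {
        Long deadline = deadlines.get(key);
        if (deadline != null && clock.getAsLong() >= deadline) {
            values.remove(key);
            deadlines.remove(key);
            return null;
        }
        return values.get(key);
    }
}
```

Passing System::currentTimeMillis as the clock gives wall-clock behavior, while a test can pass a fake clock and advance it by hand. With a 10-second expiry, a get at 4.5 seconds still returns the value, and a get at 10.5 seconds returns null.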

Data types in Redis

Working with Redis data types such as lists and hashes requires specialized command usage. For instance, I can create lists by appending values to a key. In the code in Listing 4, I issue an rpush command, which appends a value to the right or tail of a list. (A corresponding lpush command prepends a value to the front of a list.)

Listing 4. Redis lists
jedis.rpush("people", "Mary");
assert jedis.lindex("people", 0).equals("Mary");

jedis.rpush("people", "Mark");

assert jedis.llen("people") == 2;
assert jedis.lindex("people", 1).equals("Mark");
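
To make the list semantics concrete, here is a minimal sketch in plain Java of what rpush, lpush, lindex, and llen do to a list stored under a key. ListStore is a hypothetical in-memory stand-in, not part of Jedis, and it leaves out details such as the negative indexes that Redis also accepts. The lpop method is included to show why pairing rpush with lpop gives first-in, first-out queue behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in (not Jedis) for the list commands used in Listing 4. */
final class ListStore {
    private final Map<String, List<String>> lists = new HashMap<>();

    /** Appends to the tail, creating the list on first use; returns the new length. */
    long rpush(String key, String value) {
        List<String> list = lists.computeIfAbsent(key, k -> new ArrayList<>());
        list.add(value);
        return list.size();
    }

    /** Prepends to the head, creating the list on first use; returns the new length. */
    long lpush(String key, String value) {
        List<String> list = lists.computeIfAbsent(key, k -> new ArrayList<>());
        list.add(0, value);
        return list.size();
    }

    /** Returns the element at a zero-based index, or null if it is out of range. */
    String lindex(String key, int index) {
        List<String> list = lists.get(key);
        if (list == null || index < 0 || index >= list.size()) {
            return null;
        }
        return list.get(index);
    }

    /** Returns the list length, or 0 when the key does not exist. */
    long llen(String key) {
        List<String> list = lists.get(key);
        return list == null ? 0 : list.size();
    }

    /** Removes and returns the head, or null if the key does not exist. */
    String lpop(String key) {
        List<String> list = lists.get(key);
        if (list == null) {
            return null;
        }
        String head = list.remove(0);
        if (list.isEmpty()) {
            lists.remove(key); // an emptied list disappears, as it does in Redis
        }
        return head;
    }
}
```

Pushing "Mary" and then "Mark" with rpush and draining with lpop returns them in the same order, which is the property a work queue relies on.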

Redis supports a wide variety of commands for working with data types; moreover, each data type has its own set of commands. Rather than going over them individually, I'll show you some of them at work in a realistic application development scenario.


Redis as a caching solution

I've mentioned that Redis is easily employed as a caching solution, and it just happens that I have need of one of those! In this application example, I'm going to integrate Redis with my location-based mobile web service, called Magnus.

If you haven't been following this series, I first implemented Magnus using the Play framework, and I've developed or refactored it in various implementations since then. Magnus is a simple service that takes JSON documents via HTTP PUT requests. These documents describe the location of a particular account, which means a person holding a mobile device.

Now I want to integrate caching into Magnus — that is, I want to reduce I/O traffic in the form of a look-up by storing, in memory, data that doesn't often change.

Magnus caches!

My first step in Listing 5 will be to find out if an incoming account name (which is the key) is in Redis via a get call. A call to get will either return the account ID as a value or it will return null. If a value is returned, I'll use that as my acctId variable. If null is returned (indicating that the account's name isn't in Redis as a key), then I'll look up the account value in MongoDB and add it to Redis via a set command.

The advantage here is speed: The next time a requested account submits a location, I will be able to obtain its ID from Redis (acting as an in-memory cache) rather than having to go to MongoDB and incur a read I/O cost.

Listing 5. Using Redis as an in-memory cache
"/location/:account" {
  put {
    def jacksonMapper = new ObjectMapper()
    def json = jacksonMapper.readValue(request.contentText, Map.class)
    def formatter = new SimpleDateFormat("dd-MM-yyyy HH:mm")
    def dt = formatter.parse(json['timestamp'])
    def res = [:]
    
    def jedis = null
    try{
      jedis = pool.getResource()
      def acctId = jedis.get(request.parameters['account'])

      if(!acctId){
        // cache miss: look the account up in MongoDB, then cache its ID
        def acct = Account.findByName(request.parameters['account'])
        jedis.set(request.parameters['account'], acct.id.toString())
        acctId = acct.id
      }

      new Location(acctId.toString(), dt, json['latitude'].doubleValue(),
      json['longitude'].doubleValue() ).save()
      res['status'] = 'success'
    }catch(exp){
      res['status'] = "error ${exp.message}"
    }finally{
      // return the connection to the pool even when an exception is thrown
      if(jedis != null){
        pool.returnResource(jedis)
      }
    }
   response.json = jacksonMapper.writeValueAsString(res)
  }
}

Note that the Magnus implementation (written in Groovy) in Listing 5 still uses a NoSQL implementation for data model storage; it just uses Redis as a cache implementation for look-up data. Because my primary account data lives in MongoDB (in fact, it resides at MongoHQ.com) and my Redis data store runs locally, Magnus will get a significant speed boost when looking up subsequent account IDs.
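
The look-up logic in Listing 5 is an instance of what is commonly called the cache-aside pattern: check the cache, fall back to the slower store on a miss, then remember the answer. The sketch below shows the same flow in plain Java under stated assumptions: AccountIdCache is a hypothetical name, a HashMap stands in for the Redis get and set calls, and a Function stands in for the MongoDB query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Cache-aside look-up: try the cache, fall back to the slow store, then cache the result. */
final class AccountIdCache {
    private final Map<String, String> cache = new HashMap<>(); // stands in for Redis get/set
    private final Function<String, String> slowLookup;         // stands in for the MongoDB query
    private int misses = 0;

    AccountIdCache(Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    /** Returns the account ID for a name, or null if the slow store has no such account. */
    String accountIdFor(String accountName) {
        String id = cache.get(accountName);
        if (id == null) {
            misses++;
            id = slowLookup.apply(accountName);
            if (id != null) {
                cache.put(accountName, id); // only cache accounts that actually exist
            }
        }
        return id;
    }

    /** Number of look-ups that had to go to the slow store. */
    int misses() {
        return misses;
    }
}
```

Unlike Listing 5, this sketch guards against an unknown account name by not caching a null result. Whether to cache such negative results is a design choice that depends on how often unknown names are requested.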

But wait! Why do I need both MongoDB and Redis? Can't I get away with using just one?

Node.js for ORM

A number of projects provide an ORM-like mapping for Redis, including a highly influential Ruby-based alternative called Ohm. I checked out a Java-based derivative of that project (called JOhm) but eventually settled on Nohm, a variation written for Node. The beauty of Ohm and its derivative projects is that they allow you to map an object model into a Redis-based data structure. Thus, your model objects are both persistent and (in most cases) extremely fast in read situations.
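
To give a rough idea of what mapping an object model into a Redis-based data structure can mean, the sketch below flattens a model object into field/value strings, which is the shape a Redis hash stores, and rebuilds the object from them. It does not show how Ohm, JOhm, or Nohm lay out their data internally. LocationRecord and the "Location:<id>" key scheme are hypothetical and used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain Java model with the same three properties as the article's Location. */
final class LocationRecord {
    final double latitude;
    final double longitude;
    final String timestamp;

    LocationRecord(double latitude, double longitude, String timestamp) {
        this.latitude = latitude;
        this.longitude = longitude;
        this.timestamp = timestamp;
    }

    /** Hypothetical key layout; real mappers choose their own scheme. */
    static String keyFor(long id) {
        return "Location:" + id;
    }

    /** A simple not-empty check on the string property. */
    boolean isValid() {
        return timestamp != null && !timestamp.isEmpty();
    }

    /** Flattens the object into field/value strings suitable for a hash. */
    Map<String, String> toHash() {
        Map<String, String> hash = new LinkedHashMap<>();
        hash.put("latitude", Double.toString(latitude));
        hash.put("longitude", Double.toString(longitude));
        hash.put("timestamp", timestamp);
        return hash;
    }

    /** Rebuilds the object from the field/value strings read back from a hash. */
    static LocationRecord fromHash(Map<String, String> hash) {
        return new LocationRecord(
            Double.parseDouble(hash.get("latitude")),
            Double.parseDouble(hash.get("longitude")),
            hash.get("timestamp"));
    }
}
```

With Jedis, a map like this could be written with hmset and read back with hgetAll, which is roughly the bookkeeping an ORM-style library takes off your hands.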

Using Nohm, I was able to quickly rewrite my Magnus app in JavaScript and persist Location objects in a snap. In Listing 6, I've defined a Location model that includes three properties. (Note that I've kept my example simple by making timestamp a string rather than a true timestamp.)

Listing 6. Redis ORM in Node.js
var Location = nohm.model('Location', {
  properties: {
    latitude: {
      type: 'float',
      unique: false,
      validations: [
        ['notEmpty']
      ]
    },
    longitude: {
      type: 'float',
      unique: false,
      validations: [
        ['notEmpty']
      ]
    },
    timestamp: {
      type: 'string',
      unique: false,
      validations: [
        ['notEmpty']
      ]
    }
  }
});

Node's Express framework makes using my new Nohm Location object really easy. In my application's PUT implementation, I grab the incoming JSON values and put them into an instance of Location, via Nohm's p call. I then check to see whether the instance is valid. If it is, I persist it.

Listing 7. Using Nohm in Node's Express.js
app.put('/', function(req, res) {
  res.contentType('json');

  var location = new Location;
  location.p("timestamp", req.body.timestamp);
  location.p("latitude", req.body.latitude);
  location.p("longitude", req.body.longitude);

  if (location.valid()) {
    location.save(function (err) {
      if (!err) {
        res.send(JSON.stringify({ status: "success" }));
      } else {
        res.send(JSON.stringify({ status: location.errors }));
      }
    });
  } else {
    res.send(JSON.stringify({ status: location.errors }));
  }
});

As Listing 7 shows, Redis pretty easily steps up to being an in-memory, blazingly fast datastore. And in some cases, it might even be a better cache than memcached!


In conclusion

Redis is useful for a wide variety of data storage scenarios, and because it can persist data to disk (and because it supports a rich set of data types), it's sometimes a worthy competitor to memcached. In cases where it makes sense for your domain, you can use Redis as a backing store for data models and queues. Redis client implementations have been ported to just about every programming language there is.

Redis isn't a total replacement for an RDBMS, nor is it a heavyweight store, rich with query features like MongoDB. In many cases, however, it can live side-by-side with these technologies. As I've shown in this article, Redis can be a good stand-alone data storage solution for applications that run heavy on data lookups, or where real-time statistics could be computed via Redis's speedy atomic operations.

Resources

Get products and technologies

  • Download Redis and Jedis: Redis is an open source key-value store and data structure server; Jedis is the current recommended client for Java-based development.
  • Get Nohm: A Node.js implementation of the Redis object-relational mapper, Ohm.
