Crossing borders: Domain-specific languages in Active Record and Java programming

Master your domains

The Java™ programming world is full of domain-specific languages (DSLs), but options in the Java language for building DSLs are limited. Not so with Ruby. In this article, you'll learn some nifty ways Ruby lets you integrate clean DSLs, giving you a new frame of reference for examining your Java options with open eyes.

Bruce Tate (bruce.tate@j2life.com), President, RapidRed

Bruce Tate is a father, mountain biker, and kayaker in Austin, Texas. He's the author of three best-selling Java books, including the Jolt winner Better, Faster, Lighter Java. He recently released Spring: A Developer's Notebook. He spent 13 years at IBM and is now the founder of the J2Life, LLC, consultancy, where he specializes in lightweight development strategies and architectures based on Java technology and Ruby.



04 April 2006

A DSL is a language dedicated to solving a domain-specific problem. By operating closer to the problem, a DSL can deliver benefits that you might not find in a general-purpose language. The Java world is full of DSLs. Properties files, Spring contexts, certain uses of annotations, and Ant tasks are all examples of DSLs.

As I've started to crack open other languages such as Ruby, I am starting to understand that the Java language doesn't have a very good grasp of DSLs these days. In this article, you'll see four tricks Ruby uses to integrate clean DSLs. Then, you'll see what your options might be in the Java language.

About this series

In the Crossing borders series, author Bruce Tate advances the notion that today's Java programmers are well served by learning other approaches and languages. The programming landscape has changed since Java technology was the obvious best choice for all development projects. Other frameworks are shaping the way Java frameworks are built, and the concepts you learn from other languages can inform your Java programming. The Python (or Ruby, or Smalltalk, or ... fill in the blank) code you write can change the way that you approach Java coding.

This series introduces you to programming concepts and techniques that are radically different from, but also directly applicable to, Java development. In some cases, you'll need to integrate the technology to take advantage of it. In others, you'll be able to apply the concepts directly. The individual tool isn't as important as the idea that other languages and frameworks can influence developers, frameworks, and even fundamental approaches in the Java community.

A world of hidden languages

Though you might not know it, you encounter DSLs everywhere, from your everyday life to the applications you use to the programs you write. In a courtroom, you see a stenographer use a DSL to take notes rapidly. Music uses several different notations to describe the volume, pitch, and duration of each note, in a format that's friendly for a given instrument. (I use guitar tablature, which has a line for every string on my guitar.) You use DSLs because they solve problems more effectively than the spoken or written word.

You also use DSLs when you use everyday applications. The best example is a spreadsheet. It's easier to write a spreadsheet than even the simplest accounting program. The spreadsheet DSL radically changes the nature of programming for a specific problem.

DSLs in Java programming

Closer to home, Java code uses DSLs everywhere:

  • JSP tags make it easier to build custom user interfaces.
  • SQL represents database operations.
  • Properties files represent program configuration.
  • XML describes data.
  • XML describes program configuration, such as within EJB, Hibernate, or Spring.
  • XML describes action, such as Ant tasks or business rules in certain engines.

The Java language is not especially good at domain-specific languages because the language is difficult to extend in ways that are most attractive to a DSL developer. That's one of the reasons that you see such a proliferation of XML. It's malleable, Java integrates with it well, you can easily build tools to interpret it, and it doesn't need to be compiled with Java classes. But XML is not friendly for humans to read. As a result, you're starting to see broader complaints about the overuse of XML with the Java language.

DSLs with Ruby and Active Record

You saw Active Record, the persistence engine behind Ruby on Rails, in the first article in the Crossing borders series. I come back to Active Record for this article because it makes fantastic use of DSL concepts in several places:

  • A domain-specific sentence structure and vocabulary. Active Record builds a vocabulary for wrapping relational database tables with Ruby objects. Within a database-backed object, you can use has_many :people, for example, to build a one-to-many relational mapping to another database-backed object.
  • Extending the behavior of a class. Based on naming conventions, declaring an Active Record class called Person creates a class that has an attribute for every column in the people table.
  • Embellishing existing types. Rails often embellishes classes like Fixnum to provide a domain-friendly experience.
  • Dynamically extending your vocabulary. Active Record provides some pleasant surprises, such as adding custom finders based on the structure of the database.
  • Modeling English. Active Record changes the pluralization of a class based on context.

As you read on, you'll see the Ruby features that make these tricks possible. You'll really notice a difference between the Ruby way and the Java way of doing things. To code along in this article, you need to install Ruby and Ruby on Rails, which includes Active Record (see Resources).


Vocabulary in Ruby

The open structure of Ruby's syntax and the inclusion of symbols make it relatively easy to define a vocabulary. You use methods, symbols, and classes to shape your vocabulary. Fire up the Ruby interpreter by typing irb. Enter the code in Listing 1. (Listing 1 shows both what you type and the result in Ruby. Enter just the code that follows each irb prompt; the lines that start with => are Ruby's responses.)

Listing 1. Creating a Ruby class
irb(main):001:0> class Person
irb(main):002:1> attr_accessor :name, :email
irb(main):003:1> end
=> nil
irb(main):004:0> person = Person.new
=> #<Person:0x2b61a80>
irb(main):005:0> person.name = "Elvis"
=> "Elvis"
irb(main):006:0>

In Listing 1, you created a class called Person, with two instance variables called name and email. Pay special attention to the line with attr_accessor :name, :email. Two concepts should capture your attention:

  • Method invocations in a class definition
  • The use of symbols

Method invocations

The attr_accessor :name, :email statement in Listing 1 creates two attributes, with getter and setter accessors. attr_accessor is actually a method call -- a wonderful example of metaprogramming within the Ruby language itself. Java developers are used to seeing method declarations within a class body, but not method invocations. This method invocation adds the accessor methods, and the instance variables behind them, to the Person class.

Without attr_accessor :name, :email, you'd need to type the code in Listing 2 for each attribute you want:

Listing 2. Ruby accessors
def name=(value)
 @name = value
end

def name
 return @name
end

Listing 2 -- Ruby's version of getters and setters -- should look somewhat familiar. name= is actually a method name, and @ prefixes all instance variables, but the rest is pretty similar to Java getters and setters.

Instead of the code in Listing 2, you can use the different versions of attr to create attributes with getters (attr_reader), setters (attr_writer), or both (attr_accessor).
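As a quick sketch of those variants, the class below uses one of each. The Account class and its attribute names are invented for illustration and are not part of the original listings.

```ruby
# Hypothetical class, used only to illustrate the three attr_* variants.
class Account
  attr_reader   :id        # getter only
  attr_writer   :password  # setter only
  attr_accessor :email     # getter and setter

  def initialize(id)
    @id = id
  end
end

account = Account.new(42)
account.email = "elvis@example.com"
account.password = "secret"

puts account.id                       # reading works
puts account.email                    # reading and writing both work
puts account.respond_to?(:id=)        # false: no setter was generated
puts account.respond_to?(:password)   # false: no getter was generated
```

Each variant is itself just a method call in the class body, so the choice of which accessors to expose reads like a declaration.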

Symbols

The second noteworthy concept is the symbol. Think of :email as the thing named email. A Ruby symbol is like a string, but one that is immutable and has a single instance: every occurrence of :email in a program refers to the same object.
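A small sketch makes the single-instance idea visible. It compares two separately built strings with two occurrences of the same symbol; the variable names are arbitrary.

```ruby
# Two separately built strings with the same characters are distinct objects.
first  = String.new("email")
second = String.new("email")
puts first == second        # true: same contents
puts first.equal?(second)   # false: different objects

# Every occurrence of a symbol refers to one and the same object.
puts :email.equal?(:email)                  # true
puts :email.object_id == :email.object_id   # true
puts "email".to_sym.equal?(:email)          # true
```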

Now Active Record code like this should make some sense to you:

class Manager < ActiveRecord::Base
 has_one :department
end

has_one is a method, and :department is a symbol, which Active Record simply interprets as the name of a class. Because Ruby doesn't force parentheses around method arguments and because Rails can use symbols and method names designed specifically for Active Record, the vocabulary flows nicely.

Optional extensions

Active Record makes good use of another Ruby feature. You'll often see Ruby methods with an optional parameter that's a hash map that defaults to empty. You can simulate named parameters that way. For example, the definition of the Active Record method called belongs_to looks like this:

def belongs_to(association_id, options = {})

You can now pass options to belongs_to to refine the behavior:

class Manager < ActiveRecord::Base
 belongs_to :department, :foreign_key => "department_number"
end

In Ruby, you specify a hash map entry with key => value. The meaning is clear. You want Active Record to override the default (department_id, based on naming conventions) and use department_number instead. Because you can tailor the names of the options to fit your grammar, the DSL gets yet another powerful feature: optional extensions. Next, you need the ability to use your vocabulary to extend the Ruby language.
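To make the options-hash idea concrete, here is a minimal sketch of a method that merges caller-supplied options over convention-based defaults. It is not the Rails implementation: the ToyAssociations module, the stored settings hash, and the associations reader are assumptions made for illustration.

```ruby
# A toy stand-in for an Active Record-style class method (NOT Rails code).
module ToyAssociations
  def belongs_to(association_id, options = {})
    # Convention: the foreign key defaults to "<association>_id".
    defaults = { :foreign_key => "#{association_id}_id" }
    settings = defaults.merge(options)   # caller-supplied options win
    (@associations ||= {})[association_id] = settings
  end

  # Hypothetical reader so the recorded settings can be inspected.
  def associations
    @associations || {}
  end
end

class Manager
  extend ToyAssociations
  belongs_to :department, :foreign_key => "department_number"
  belongs_to :office      # no options: the convention applies
end

p Manager.associations[:department]
p Manager.associations[:office]
```

Because the trailing key => value pairs are collected into one hash, new options can be added later without changing the method signature.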


Embellishing existing types

Ruby is a dynamic language, and adding behavior to an existing class (or even an instance of a given class) is easy. You'll use the technique first to embellish an existing class for a domain, and later to extend existing classes based on a vocabulary.

Roman numerals aren't frequently used, but they are useful in some contexts. You wouldn't want to add Roman numerals directly to Ruby's base Fixnum class, but they might make a useful addition to a domain-specific language. You can add a to_roman method to your Fixnum class that converts a fixnum to a Roman numeral. It's surprisingly easy to do. You simply open the class definition again and define your new method. Listing 3 shows a crude Roman numeral processing method:

Listing 3. A Roman numeral processing method
# Fixnum was the integer class when this was written; on Ruby 2.4 and later,
# reopen Integer instead.
class Fixnum
 # Crude conversion: handles symbols up to C only (no D or M).
 def to_roman
  value = self
  str = ""
  # Work from the largest symbol down, appending each one while it still fits.
  (str << "C"; value = value - 100) while (value >= 100)
  (str << "XC"; value = value - 90) while (value >= 90)
  (str << "L"; value = value - 50) while (value >= 50)
  (str << "XL"; value = value - 40) while (value >= 40)
  (str << "X"; value = value - 10) while (value >= 10)
  (str << "IX"; value = value - 9) while (value >= 9)
  (str << "V"; value = value - 5) while (value >= 5)
  (str << "IV"; value = value - 4) while (value >= 4)
  (str << "I"; value = value - 1) while (value >= 1)
  str
 end
end

Listing 3 is straightforward, once you understand that semicolons separate two distinct Ruby statements. I often use them when I want two distinct ideas to hang together. You can add to or change the definition of any Ruby class with this technique. The nice thing about this particular implementation is the usage model. You can paste it into a file and use it from the Ruby interpreter, as in Listing 4:

Listing 4. Using the to_roman extension
irb(main):001:0> load 'to_roman.rb'
=> true
irb(main):002:0> 10.to_roman
=> "X"
irb(main):003:0> 199.to_roman
=> "CXCIX"
irb(main):004:0>

Rails uses this capability to handle things like time measurements. For example, in a Rails application, you can say 10.days, or 2.hours.ago, or 5.minutes.from_now. With this technique, you can extend the existing Ruby vocabulary into your domain to handle things like measurements, conversions, or other syntactic sugar. The end result is a clean and lean Ruby core class, with little extensions that provide domain-specific classes that do exactly the right thing in the context of the domain.
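The sketch below shows how helpers in the spirit of 10.days could be built with the same open-class technique. It is a rough approximation, not Rails code: Rails returns richer duration objects, whereas these hypothetical helpers simply return a number of seconds. The sketch reopens Integer because modern Ruby no longer has a separate Fixnum class.

```ruby
# Hypothetical duration helpers that return plain seconds (NOT Rails code).
class Integer
  def minutes
    self * 60
  end

  def hours
    self * 60 * 60
  end

  def days
    self * 24 * 60 * 60
  end
end

puts 5.minutes              # 300
puts 2.hours                # 7200
puts 10.days                # 864000
puts Time.now + 5.minutes   # a Time five minutes from now
```

Returning seconds keeps the sketch short, because Ruby's Time already accepts a number of seconds in addition.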


Building a class dynamically

After you've got a vocabulary and the ability to extend a class, the next step is to extend a class based on your vocabulary dynamically. You saw an example of this technique in Listing 1 with attr_accessor. Now I'll show you how to implement it (with thanks to Glenn Vanderburg; see Resources). Listing 5 shows a first attempt:

Listing 5. Initial attempt to extend a class dynamically
class Person
 def my_attr
  self.class.class_eval "def name; @name; end"
  self.class.class_eval "def name=(val); @name = val; end"
 end
end

This example gets a little more complicated. Inside the instance method my_attr, self.class returns the Person class. Next, class_eval evaluates the string that follows it in the context of that class. The first line defines the getter, and the second line defines the setter. Calling my_attr therefore adds the name attribute to Person.
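To see what Listing 5 does in practice, the sketch below repeats it and then exercises it. The had_name_before variable is only there to record the state before my_attr runs.

```ruby
class Person
  def my_attr
    self.class.class_eval "def name; @name; end"
    self.class.class_eval "def name=(val); @name = val; end"
  end
end

person = Person.new
had_name_before = person.respond_to?(:name)
puts had_name_before     # false: the accessors do not exist yet

person.my_attr           # an explicit call on an instance defines them
person.name = "Elvis"
puts person.name         # Elvis

# The methods were added to the class, so other instances have them too.
puts Person.new.respond_to?(:name)   # true
```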

Listing 5 has two major problems. First, you need to call my_attr explicitly on an instance. You can't invoke it from the class body, because it is an instance method rather than a method of the class itself. Second, the hard-coded name should be a symbol that the caller supplies. You can solve the first problem by defining my_attr on Module, which makes it available inside every class definition. You can solve the second by passing in a symbol. Listing 6 shows the second attempt:

Listing 6. Second attempt to extend a class dynamically
class Module
 def my_attr(symbol)
  class_eval "def #{symbol}; @#{symbol}; end"
  class_eval "def #{symbol}=(value); @#{symbol} = value; end"
 end
end

Listing 6 looks a little more cryptic, but don't worry. You can crack the code with a little more help. You've changed only three things:

  • Instead of declaring a new Person class, you're opening Module, the superclass of Ruby's Class. Every class is an instance of Class, so every class can now call my_attr.
  • Instead of hard-coding a name, you're passing in a parameter called symbol. You replaced occurrences of name with #{symbol}. Ruby substitutes #{symbol} with a string representing the symbol.
  • You use class_eval instead of self.class.class_eval. Your code is already operating in the class, so you don't need to get self.class.
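The interpolation step is easy to check on its own. This short sketch builds the same two strings that Listing 6 hands to class_eval, using :name as the symbol; the getter_source and setter_source variable names are just for illustration.

```ruby
symbol = :name

# Interpolation calls to_s on the symbol, so the result is ordinary Ruby source.
getter_source = "def #{symbol}; @#{symbol}; end"
setter_source = "def #{symbol}=(value); @#{symbol} = value; end"

puts getter_source   # def name; @name; end
puts setter_source   # def name=(value); @name = value; end
```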

See if it works by entering the code that follows each irb prompt in Listing 7 in the Ruby interpreter:

Listing 7. Defining custom attributes
irb(main):001:0> require "./my_attr.rb"
=> true
irb(main):002:0> class Person
irb(main):003:1> my_attr :name
irb(main):004:1> end
=> nil
irb(main):005:0> person = Person.new
=> #<Person:0x2b5fb90>
irb(main):006:0> person.name = "Bruce"
=> "Bruce"
irb(main):007:0> person.name
=> "Bruce"

As expected, my_attr now works inside any class body, because every class is an instance of Class, and Class inherits from Module. This is the same basic technique that Active Record uses to add high-level concepts such as belongs_to and has_many. The difference is that Active Record doesn't add them to Module. It makes them available as class methods of ActiveRecord::Base, so only the classes that inherit from ActiveRecord::Base pick them up.
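A minimal sketch of that base-class pattern follows. ToyRecord, its has_many method, and the in-memory array storage are assumptions for illustration, not Active Record's real implementation, and the sketch uses define_method rather than evaluating a string.

```ruby
# Hypothetical base class standing in for ActiveRecord::Base (NOT Rails code).
class ToyRecord
  # A class method: subclasses can call it inside their class bodies.
  def self.has_many(name)
    # define_method builds the reader without evaluating a string.
    define_method(name) do
      @collections ||= {}
      @collections[name] ||= []
    end
  end
end

class Department < ToyRecord
  has_many :people
end

department = Department.new
department.people << "Elvis"
department.people << "Bruce"
p department.people                       # ["Elvis", "Bruce"]
puts ToyRecord.new.respond_to?(:people)   # false: only Department got it
```

Scoping the vocabulary to one base class keeps it out of unrelated classes, which is the trade-off against opening Module.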

You've already seen some fairly sophisticated capabilities in action, but Ruby can do still more to enable DSLs.


method_missing and dynamic behavior

Sometimes you want to add methods to a class based on external conditions. For example, suppose you want to express Roman numerals in Ruby. To distinguish them from strings, you use the form Roman.III to represent the number 3 as a Roman numeral. It's not really practical to add a class method to Roman for every possible Roman numeral, and with Ruby you don't need to. You can use a little trickery.

In Ruby, when a method is missing, Ruby simply calls the method_missing method. You can override it to provide your Roman numerals, as shown in Listing 8:

Listing 8. Overriding the method_missing method
class Roman
 def self.method_missing name, *args
  roman = name.to_s
  if(roman =~ /^[IVXLC]*$/)
   roman.gsub!("IV", "IIII")
   roman.gsub!("IX", "VIIII")
   roman.gsub!("XL", "XXXX")
   roman.gsub!("XC", "LXXXX")
   return(roman.count("I") +
       roman.count("V") * 5 +
       roman.count("X") * 10 +
       roman.count("L") * 50 +
       roman.count("C") * 100)
  else
   super(name, *args)
  end
 end
end

This code is relatively simple, but it does use some Ruby features that are unfamiliar to Java programmers. You override method_missing, so whenever a client of this class tries to invoke a method that doesn't exist, Ruby calls this method. Here are the details:

  1. You take two arguments:
    • a name for the name of the method
    • *args for the arguments for the missing method
  2. The name is a symbol, so you first translate it to a String with to_s.
  3. You use a regular expression to make a reasonable guess that the method name is a Roman numeral.
  4. If the name is a Roman numeral, you make a series of substitutions to make the Roman numeral easier to process. IV is four and IX is nine, so you can't derive their values simply by counting the occurrences of X, V, and I.
  5. You count the occurrences of I (1), V (5), X (10), L (50), and C (100), multiply each count by the letter's value, and add up the results.
  6. If the name is not a Roman numeral, you invoke your superclass, which reports a missing method.
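The steps above can be sketched in a slightly more compact form. This variant is an adaptation rather than the article's listing: it uses lookup tables (the EXPANSIONS and VALUES constants are my own names), requires at least one letter, avoids mutating the string, and adds respond_to_missing? so that respond_to? agrees with method_missing.

```ruby
class Roman
  # Subtractive pairs rewritten into purely additive form.
  EXPANSIONS = { "IV" => "IIII", "IX" => "VIIII", "XL" => "XXXX", "XC" => "LXXXX" }
  VALUES     = { "I" => 1, "V" => 5, "X" => 10, "L" => 50, "C" => 100 }

  def self.method_missing(name, *args)
    roman = name.to_s
    # Anything that is not made of Roman letters goes to the default handler.
    return super unless roman =~ /\A[IVXLC]+\z/
    EXPANSIONS.each { |pair, expanded| roman = roman.gsub(pair, expanded) }
    # Count each letter, multiply by its value, and add up the results.
    VALUES.sum { |letter, value| roman.count(letter) * value }
  end

  # Keeps respond_to? consistent with method_missing.
  def self.respond_to_missing?(name, include_private = false)
    name.to_s.match?(/\A[IVXLC]+\z/) || super
  end
end

puts Roman.III                  # 3
puts Roman.XIV                  # 14
puts Roman.XCIX                 # 99
puts Roman.respond_to?(:XIV)    # true

begin
  Roman.hello
rescue NoMethodError
  puts "hello is not a Roman numeral"
end
```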

This technique is incredibly powerful when it comes to DSLs. Active Record uses this capability to implement dynamic finders. Instead of physically adding a finder for each column, Active Record uses method_missing. With this strategy, Active Record can match not just one column, but combinations of columns. For example, adding name and email columns to a people table enables the Person.find_by_name_and_email finder on the Person class. Details like this make the user experience of Active Record much more pleasant. They also keep the implementation of Active Record lean and mean, so you can often implement your own patches when it doesn't do exactly what you want.
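As a rough sketch of how such finders could work, the class below answers find_by_... calls over an in-memory list of rows. ToyTable, its constructor, and the row hashes are assumptions for illustration; Active Record's real finders query the database instead.

```ruby
# A toy in-memory "table" with dynamic finders (NOT Active Record code).
class ToyTable
  def initialize(rows)
    @rows = rows
  end

  def method_missing(name, *args)
    columns = finder_columns(name)
    # Not a finder, or wrong number of values: fall back to the default.
    return super if columns.nil? || columns.size != args.size
    # Pair each column with its argument, then return the first matching row.
    wanted = columns.zip(args)
    @rows.find { |row| wanted.all? { |column, value| row[column] == value } }
  end

  def respond_to_missing?(name, include_private = false)
    !finder_columns(name).nil? || super
  end

  private

  # "find_by_name_and_email" => [:name, :email]; nil for any other name.
  def finder_columns(name)
    match = /\Afind_by_(\w+)\z/.match(name.to_s)
    match && match[1].split("_and_").map(&:to_sym)
  end
end

people = ToyTable.new([
  { :name => "Elvis", :email => "elvis@example.com" },
  { :name => "Bruce", :email => "bruce@example.com" }
])

p people.find_by_name("Bruce")
p people.find_by_name_and_email("Elvis", "elvis@example.com")
p people.find_by_name_and_email("Elvis", "bruce@example.com")   # nil: no row matches both
```

The method name itself carries the query, so no finder has to be written for any particular combination of columns.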


DSLs in Java programming, revisited

When you use the Java language, your options are much more limited. Metaprogramming is harder, so you can't always capture the Active Record experience. But if you're in dire need of a DSL, you do have options. And you don't need to reach for XML or annotations in all cases. Here are some common approaches:

  • For less-demanding DSLs, you can use Java classes, methods, and names to build an English-friendly vocabulary and do what you need through ordinary method invocations.
  • For typical Java audiences, you can build your language in XML. XML is hard to read, but it might be useful in some cases, and it's certainly common in the Java space.
  • For solutions already requiring XML, you can use a derivative of XML to simplify. Craig Walls has a post on doing just this for the Spring context with XBean (see Resources).
  • You can use an alternative representation of XML, such as Relax NG, to simplify your XML (see Resources).
  • When Java code and XML are not enough, you can embed a language in the JVM. The best way is through BeanShell (see Resources).
  • For solutions requiring dynamic scripting within a Java application, you can leverage a more dynamic language that already has Bean Scripting Framework (BSF) integration. Good examples are Jython, JRuby, and Groovy (see Resources).
  • You can build a DSL from scratch. That's hard to do in the Java language, but it might be worth it for some applications.

Each one of these ideas alone could represent a series of developerWorks articles, so I won't go into great detail for them here, but I'll drill a bit deeper. If you need to consume a DSL from the Java language, you need to ask yourself four questions:

  • Do you really need a DSL? Through some clever use of Java technology, you might be able to do what you need.
  • Is XML or an XML derivative enough? Java developers are often a little overzealous with XML, but some of the derivatives simplify things somewhat.
  • Can you use another language within Java? JRuby is getting much better, Groovy is settling down, and Jython is getting much more solid.
  • Could it be worth it to build a DSL from scratch? It's hard to do with the Java language -- you'll need a lexer, a parser, and a grammar. But it can be done, and it might be worth doing, depending on your application.

Conclusion

I don't have many definitive answers this time. The Java language simply doesn't do DSLs very well, but it pays to know what's possible in other languages should you need to build a DSL. You should also be aware of a new breed of research. The people who brought you the IDEA IDE and a few other companies are working on a set of products called language workbenches (see Resources). They have a chance to revolutionize completely the way that we code. These ideas -- many of them happening beyond Java programming -- are expanding the borders of DSL.

Next time I'll talk about problems with concurrent programming. You'll take a look at Erlang as a possible solution for soft, real-time distributed systems.

Resources

Learn

Get products and technologies

  • Ruby on Rails: Download the open source Ruby on Rails Web framework. Ruby on Rails with Active Record makes significant use of domain-specific languages throughout.
  • Ruby: Get Ruby from the project Web site.
