Evolutionary architecture and emergent design: Building DSLs in JRuby

Leverage Ruby's expressiveness by using JRuby atop Java code

Ruby is the current state-of-the-art language for constructing internal domain-specific languages (DSLs). One of the best Ruby implementations is the one that runs on the JVM: JRuby. In this installment of Evolutionary architecture and emergent design, you'll learn how to leverage Ruby's expressiveness yet keep the benefits of your existing (and future) Java™ code. You'll see how to construct internal DSLs in Ruby as a way of capturing domain idiomatic patterns.

Neal Ford, Software Architect / Meme Wrangler, ThoughtWorks Inc.

Neal Ford is a software architect and Meme Wrangler at ThoughtWorks, a global IT consultancy. He also designs and develops applications, instructional materials, magazine articles, courseware, and video/DVD presentations, and he is the author or editor of books spanning a variety of technologies, including the most recent The Productive Programmer. He focuses on designing and building large-scale enterprise applications. He is also an internationally acclaimed speaker at developer conferences worldwide.



28 September 2010

A few installments ago, I began covering the harvesting of domain idiomatic patterns (solutions to emergent business problems) using domain-specific languages. DSLs work nicely for this task because they are concise (containing as little noisy syntax as possible) and readable (even by nondevelopers), and they stand out from more API-centric code. In the last installment, I showed how to build DSLs in Groovy, taking advantage of several of its features. In this installment, I'll wrap up this discussion of using DSLs to harvest idiomatic patterns by showing how to build more-sophisticated DSLs in Ruby, leveraging JRuby.

About this series

This series aims to provide a fresh perspective on the often-discussed but elusive concepts of software architecture and design. Through concrete examples, Neal Ford gives you a solid grounding in the agile practices of evolutionary architecture and emergent design. By deferring important architectural and design decisions until the last responsible moment, you can prevent unnecessary complexity from undermining your software projects.

Ruby is currently the most popular language for building internal DSLs. Most of the infrastructure you think about when developing in Ruby is DSL-based — Ruby on Rails, RSpec, Cucumber, Rake, and many others (see Resources) — because it is amenable to hosting internal DSLs. And the trendy technique of behavior-driven development (BDD) required a strong DSL base to achieve its popularity. This installment will help you understand why Ruby is so popular among DSL aficionados.

Open classes in Ruby

Using open classes to add new methods to a built-in class is a common technique for adding expressiveness to DSLs. In the last installment, I showed two different syntaxes in Groovy for open classes. In Ruby, you have the same mechanism but with a single syntax. For example, to create a recipe DSL, you need a way to capture quantities. Consider the DSL fragment in Listing 1:

Listing 1. Target syntax for my Ruby-based recipe DSL
recipe = Recipe.new "Spicy bread"
recipe.add 200.grams.of "flour"
recipe.add 1.lb.of "nutmeg"

To make this code executable, I must add the gram and pound methods (along with aliases such as grams and lb) to numbers by opening the Numeric class, as shown in Listing 2:

Listing 2. Open class definitions in Ruby
class Numeric
  def gram
    self
  end
  alias_method :grams, :gram

  def pound
    self * 453.59237
  end
  alias_method :pounds, :pound
  alias_method :lb, :pound
  alias_method :lbs, :pound
end

In Ruby, class names must start with a capital letter, which is also the rule for Ruby constants, meaning that every class name is also a constant. When Ruby "sees" a class definition, it checks whether a class with that name has already been defined. Because class names are constants, you can have only one class of a given name in a given namespace. If the class is already defined, the class definition reopens it, allowing me to make changes. In Listing 2, I reopen the Numeric class (the superclass of both integer and floating-point numbers) to add the gram and pound methods. Unlike Groovy, Ruby doesn't require empty parentheses when calling methods that accept no parameters, meaning that Ruby doesn't need to distinguish between properties and methods.

Ruby also includes another handy DSL mechanism: the alias_method class method. You want to enhance the fluency of your DSLs as much as possible, which means handling cases like pluralization. (If you want to see elaborate efforts to achieve this result, check out the code in Ruby on Rails that pluralizes model class names.) I don't want to form grammatically clumsy sentences like recipe.add 2.gram.of("flour") in my DSL when I'm clearly adding more than one gram. The alias_method mechanism in Ruby makes it easy to create alternate names for methods to enhance readability. To that end, Listing 2 adds a pluralized method for gram, and both abbreviations and pluralized versions for pound.
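Listing 1 also calls an of method that Listing 2 does not define. The following is a minimal sketch of how the pieces could fit together, assuming a hypothetical Ingredient struct and a hypothetical of method on Numeric; neither appears in the listings above, so treat them as one possible design rather than the article's implementation.

```ruby
# Hypothetical value object: the article does not show what `of` returns.
Ingredient = Struct.new(:name, :grams)

class Numeric
  # Grams are the base unit, so the number is returned unchanged.
  def gram
    self
  end
  alias_method :grams, :gram

  # Pounds are converted to grams.
  def pound
    self * 453.59237
  end
  alias_method :pounds, :pound
  alias_method :lb, :pound
  alias_method :lbs, :pound

  # Assumed helper: pairs a quantity (already in grams) with an ingredient name.
  def of(name)
    Ingredient.new(name, self)
  end
end

flour  = 200.grams.of "flour"
nutmeg = 1.lb.of "nutmeg"

puts "#{flour.name}: #{flour.grams} g"
puts "#{nutmeg.name}: #{nutmeg.grams} g"
```

Because every unit method returns a plain number in grams, of only needs to attach a name, and the aliases cost nothing beyond one line each.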


Building fluent interfaces

One of the goals of using a DSL to capture idiomatic patterns is the ability to eliminate noisy syntax from the programming-language version of your abstractions. Consider the snippet of noisy recipe DSL code in Listing 3:

Listing 3. Noisy recipe definition
recipe = Recipe.new "Spicy bread"
recipe.add 200.grams.of "flour"
recipe.add 1.lb.of "nutmeg"
recipe.directions << "mix ingredients"
recipe.directions << "cook for 30 minutes at 250 degrees"

Although the syntax in Listing 3 for adding recipe ingredients and directions is fairly concise, it still repeats the host variable name (recipe) on every line. A cleaner version appears in Listing 4:

Listing 4. Contextualized recipe definition
alternate_recipe = Recipe.new("Milky Gravy")
alternate_recipe.consists_of {
  add 1.lb.of "flour"
  add 200.grams.of "milk"
  add 1.gram.of "nutmeg"
  
  steps(
    "mix ingredients",
    "cook for some amount of time"
  )
}

The addition of the consists_of method to the fluent interface allows me to use containership (embodied in Ruby via closure blocks delimited with curly braces, {}) to eliminate the noisy host-object repetition. The implementation of this method is trivial in Ruby, as shown in Listing 5:

Listing 5. The Recipe class definition, including the consists_of method
class Recipe
  attr_reader :ingredients
  attr_accessor :name
  attr_accessor :directions

  def initialize(name="")
    @ingredients = []
    @directions = []
    @name = name
  end

  def add ingredient
    @ingredients << ingredient
    return self
  end
  
  def steps *direction_list
    # The splat parameter already gathers the arguments into an Array.
    @directions = direction_list
  end
  
  def consists_of &block
    instance_eval &block
  end
end

The consists_of method accepts a code block. (That's the syntax you see with the ampersand before the parameter name. The ampersand identifies the parameter as the holder of a code block.) The method executes the code block using the instance_eval method, one of the built-in methods in Ruby. The instance_eval method executes the code passed to it with a different host object. In other words, when you execute code via instance_eval, self (Ruby's version of the Java language's this) becomes the object on which instance_eval was called. Thus, you can call the add and steps methods without naming the recipe host object if you call them inside recipe.instance_eval, which is what the consists_of method does.
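The effect of instance_eval on self can be seen in isolation with a small sketch. The Checklist class and its item and fill methods are hypothetical names chosen for illustration; only instance_eval itself is a Ruby built-in.

```ruby
# Hypothetical class used only to illustrate how instance_eval rebinds self.
class Checklist
  attr_reader :items

  def initialize
    @items = []
  end

  def item(text)
    @items << text
    self
  end

  # Runs the block with self set to this Checklist instance.
  def fill(&block)
    instance_eval(&block)
    self
  end
end

list = Checklist.new
list.fill do
  # No receiver is needed: inside the block, self is the Checklist.
  item "preheat oven"
  item "grease pan"
end

puts list.items.inspect
```

Had fill invoked the block with block.call instead, self would remain the object where the block was written, and the bare item calls would fail with a NoMethodError.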

Regular readers will recognize this concept in the guise of Java syntax from the "Leveraging reusable code, Part 2" installment, reproduced here in Listing 6:

Listing 6. Fluentizing code blocks in Java code using instance initializers
MarketingDescription desc = new MarketingDescriptionImpl() {{
    setType("Box");
    setSubType("Insulated");
    setAttribute("length", "50.5");
    setAttribute("ladder", "yes");
    setAttribute("lining type", "cork");
}};

Although the syntax is passingly similar, the Java version suffers from a couple of serious limitations. First, it is unusual syntax in the Java language. (Most developers never encounter the instance initializer in everyday coding.) Second, because it uses anonymous inner classes (the only code-block-like mechanism in Java), any variables from the outer scope must be declared final, which places serious limitations on the kinds of things you can do inside the code block. In Ruby, the instance_eval method is a standard (and unexotic) language feature, meaning that it is more commonly used.


Polishing

One common technique many DSLs use (especially those targeting nondevelopers) is to leverage spoken languages. Molding computer syntax toward a spoken language is possible if your base computer language is flexible enough. Consider the recipe DSL I have created thus far. Creating an entire DSL just to hold simple data structures (like lists of ingredients and directions) seems like a bit of overkill; why not just keep this information in standard data structures? By encoding the operations in a DSL, I can take extra actions (like beneficial side effects) in addition to populating data structures. For example, perhaps I want to capture nutrition information for each ingredient as I define it in the DSL, allowing me to provide an aggregate value of the nutrition of the recipe when done. The NutritionProfile class is a simple data holder, shown in Listing 7:

Listing 7. Recipe nutrition record
class NutritionProfile
  attr_accessor :name, :protein, :lipid, :sugars, :calcium, :sodium

  def initialize(name, protein=0, lipid=0, sugars=0, calcium=0, sodium=0)
    @name = name
    @protein, @lipid, @sugars =  protein, lipid, sugars
    @calcium, @sodium = calcium, sodium
  end
  
  def self.create_from_hash(name, h)
    new(name, h['protein'], h['lipid'], h['sugars'], h['calcium'], h['sodium'])
  end

  def to_s()
    "\tProtein: " +   @protein.to_s       +
    "\n\tLipid: " +   @lipid.to_s         +
    "\n\tSugars: " +  @sugars.to_s        +
    "\n\tCalcium: " + @calcium.to_s       +
    "\n\tSodium: " +  @sodium.to_s
  end

end

To populate a database of these nutrition records, I create a text file that contains one record on each row:

ingredient "flour" has protein=11.5, lipid=1.45, sugars=1.12, calcium=20, and sodium=0

As you can probably guess, each line of this definition file is a Ruby-based DSL. Rather than think of its syntax as just a line of text, consider what it "looks" like from a computer-language standpoint, as shown in Figure 1.

Figure 1. Ingredient text definition as a method call

Each line starts with ingredient, which is the method name. The first parameter is the name of the ingredient. The word has is called a bubble word — a word that makes the DSL more readable but doesn't contribute to the final definition. The rest of the line consists of name/value pairs, separated by commas. Given that this is not yet legal Ruby syntax, how do I translate it into Ruby? That job is called polishing: taking almost-legal syntax and polishing it into actual syntax. The job of polishing this DSL is handled by the NutritionProfileDefinition class, shown in Listing 8:

Listing 8. NutritionProfileDefinition class
class NutritionProfileDefinition
  
  def polish_text(definition_line)
    polished_text = definition_line.clone
    polished_text.gsub!(/=/, '=>')
    polished_text.sub!(/and /, '')
    polished_text.sub!(/has /, ',')
    polished_text
  end

  def process_definition(definition)
    instance_eval polish_text(definition)
  end

  def ingredient(name, ingredients)
    NutritionProfile.create_from_hash name, ingredients
  end    
   
end

The entry point of this class is the process_definition method, shown in Listing 9:

Listing 9. The process_definition method
def process_definition(definition)
  instance_eval polish_text(definition)
end

This method passes the string returned by polish_text to instance_eval, which evaluates that string as Ruby code in the context of the NutritionProfileDefinition instance. The polish_text method, shown in Listing 10, does the necessary substitutions and translations to convert the almost-code to code:

Listing 10. The polish_text method
def polish_text(definition_line)
  polished_text = definition_line.clone
  polished_text.gsub!(/=/, '=>')
  polished_text.sub!(/and /, '')
  polished_text.sub!(/has /, ',')
  polished_text
end

The polish_text method consists of simple string substitutions to convert the definition syntax into Ruby syntax: converting each equals sign to the hash key/value separator (=>), removing the word and, and converting has to a comma. This polished line of code is passed to instance_eval, which executes it as a call to the ingredient method of the NutritionProfileDefinition class.
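To trace the polishing end to end, here is a compact sketch. One detail is an assumption on my part: after polishing, bare words such as protein are still evaluated as method calls, and the listings above don't show how they become the string keys that create_from_hash expects. The sketch resolves them with a method_missing restricted to known nutrient names. The PolishingSketch class and NUTRIENTS constant are hypothetical, and ingredient returns a plain hash instead of a NutritionProfile to keep the example self-contained.

```ruby
# Hypothetical stand-in for NutritionProfileDefinition.
class PolishingSketch
  NUTRIENTS = %w[protein lipid sugars calcium sodium].freeze

  # Same three substitutions as Listing 10, written non-destructively.
  def polish_text(line)
    line.gsub(/=/, '=>').sub(/and /, '').sub(/has /, ',')
  end

  def process_definition(line)
    instance_eval polish_text(line)
  end

  # Receives the ingredient name plus a hash of nutrient values.
  def ingredient(name, nutrients)
    { name => nutrients }
  end

  # Assumption (not in the listings): turn a bare nutrient word into a String key.
  def method_missing(name, *args)
    NUTRIENTS.include?(name.to_s) && args.empty? ? name.to_s : super
  end

  def respond_to_missing?(name, include_private = false)
    NUTRIENTS.include?(name.to_s) || super
  end
end

line = 'ingredient "flour" has protein=11.5, lipid=1.45, sugars=1.12, calcium=20, and sodium=0'
sketch = PolishingSketch.new

puts sketch.polish_text(line)
puts sketch.process_definition(line).inspect
```

The first puts shows the polished text, which reads as an ordinary call to ingredient with a name and a hash. Limiting method_missing to a whitelist keeps typos in the definition file from silently becoming keys.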

You could write this code in the Java language, but Java's syntactic limitations would add so much noise that you would lose the benefits of the fluent interface, rendering the exercise moot. Ruby offers enough syntactic sugar to make it feasible (and desirable) to cast abstractions as DSLs.


Method missing

Unlike the preceding example, the next one cannot be done in Java code, even with cumbersome syntax. One convenient mechanism in languages that commonly host DSLs is method missing. When you call a method that doesn't exist in Ruby, it doesn't immediately generate an exception. You have an opportunity to add a method_missing method to your class that will handle any missing method calls. This is used heavily in DSLs that build internal data structures. Consider this example from the Builder library in Ruby (see Resources), shown in Listing 11:

Listing 11. Using XMLBuilder in Ruby
xml = Builder::XmlMarkup.new(:indent => 2)
xml.person {
  xml.name("Neo")
  xml.catch_phrase("Whoa")
}
puts xml.target!

This code outputs an XML document with the structure shown in the DSL. Builder works its magic via method_missing. When you call a method on the xml variable, that method doesn't already exist, so it falls into method_missing, which constructs the corresponding XML. This makes the code for the Builder library very small; most of its mechanics rely on underlying language features of Ruby. One problem remains with this approach, however, as illustrated in Listing 12:

Listing 12. Method missing collisions with built-in methods
xml = Builder::XmlMarkup.new(:indent => 2)
xml.person {
  xml.name("Neo")
  xml.catch_phrase("Whoa")
  xml.class("pod-born")
}
puts xml.target!

Builders in Groovy vs. Ruby

The inspiration for the Builder class in Ruby came from similar builder classes in Groovy. Jim Weirich, the creator of Builder in Ruby, liked the concept but not the implementation in Groovy because it uses an elaborate mapping strategy between the XML tags and the generated XML. Weirich created Builder (and BlankSlate) as a simpler, more elegant solution to the problem. This is interesting because it offers a glimpse into how language communities tend to solve problems. In general, the Java community tends to build structural elements (such as frameworks and design patterns) to solve problems, building up abstraction layers upon layers. In Ruby, developers tend to use metaprogramming to build downward, using the simplest underlying mechanism they can leverage. Contrast any Java web framework with Ruby on Rails; or compare builders, where Weirich used metaprogramming to strip out what he didn't need, allowing him to leverage a language feature.

If you rely solely on method_missing, the code in Listing 12 won't work because the class method is already defined in Ruby as part of Object, which (as in the Java language) is the base class for all classes. Obviously, method_missing is never consulted for methods that already exist. This would seem to doom this approach. However, Jim Weirich (the creator of Builder) came up with an elegant solution: he created BlankSlate. BlankSlate is a class that inherits from Object but programmatically removes almost all of the methods normally found in Object. This allows him to leverage the method_missing infrastructure without any annoying side effects.

This BlankSlate mechanism is so powerful and useful that it was built into Ruby itself. In Ruby 1.9, BasicObject becomes the very top of the object hierarchy, with Object as its immediate descendant. Having BasicObject makes building builder DSLs much easier because you no longer need BlankSlate.
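The combination of a nearly empty base class and method_missing can be sketched in a few lines. The TinyXml class below is a hypothetical miniature written for illustration, not the Builder library's actual implementation. It inherits from BasicObject, the root of the class hierarchy since Ruby 1.9, so that names such as class and name are not already taken.

```ruby
# Hypothetical miniature builder; not how the Builder library is implemented.
class TinyXml < BasicObject
  def initialize
    @out = ::String.new   # constants need the :: prefix inside BasicObject
    @depth = 0
  end

  # Returns the markup accumulated so far.
  def target!
    @out
  end

  # Every undefined method name becomes an XML tag.
  def method_missing(tag, text = nil, &block)
    indent = "  " * @depth
    if block
      @out << "#{indent}<#{tag}>\n"
      @depth += 1
      block.call
      @depth -= 1
      @out << "#{indent}</#{tag}>\n"
    else
      @out << "#{indent}<#{tag}>#{text}</#{tag}>\n"
    end
    self
  end
end

xml = TinyXml.new
xml.person {
  xml.name("Neo")
  xml.catch_phrase("Whoa")
  xml.class("pod-born")   # reaches method_missing: BasicObject defines no `class`
}
puts xml.target!
```

If TinyXml inherited from Object instead, xml.class("pod-born") would raise an ArgumentError, because the existing class method takes no arguments and method_missing would never be reached.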

The ability to create a DSL like Builder illustrates why expressiveness and power in languages are so critical. The amount of code in Ruby's Builder is much smaller than similar libraries from other languages because it was written atop a more flexible design medium: Ruby.


Conclusion

I've been making the case since the beginning of this series that the design of software systems encompasses its complete source code, which implies that you have a broader design palette if you use more-expressive languages. This applies not only to your choice of general-purpose language (Java, Ruby, Groovy, Clojure), but also to the languages you can write atop your base language using DSLs. Building a language that expresses your business concepts exactly becomes a valuable asset to your organization: you are capturing important ways of solving real problems in a language highly suited to the purpose.

Even if your organization won't switch to a language like Ruby or Groovy for most development, you can "sneak in" these languages by using tools implemented in them, such as RSpec and easyb (see Resources). By bringing these alternate languages in through the back door, you can help those who are needlessly wary of introducing new languages understand that they offer significant benefits.

Resources

Learn

  • The Productive Programmer (Neal Ford, O'Reilly Media, 2008): Neal Ford's most recent book expands on a number of the topics in this series.
  • JRuby: JRuby is one of the best implementations of Ruby on any platform.
  • Ruby on Rails: Rails is a popular web development platform that uses many DSLs.
  • RSpec: RSpec is a BDD testing framework written in Ruby using DSL techniques.
  • Rake: Rake is the build tool for the Ruby platform.
  • Cucumber: Cucumber is a powerful BDD testing framework that demonstrates many powerful DSL techniques in Ruby.
  • easyb: easyb is a Groovy-based BDD framework for the Java platform.
  • Builder: Builder is a Ruby library that makes it easy to generate XML documents programmatically.
  • developerWorks Java technology zone: Find hundreds of articles about every aspect of Java programming.
