Evolutionary architecture and emergent design

Using DSLs

Capture idiomatic domain patterns using domain-specific languages



Idiomatic patterns can be either technical or domain. Technical patterns represent solutions to common technical software problems, such as how you handle validations, security, and transactional data within your application (or suite of applications). Previous installments have focused on harvesting technical idiomatic patterns using techniques such as metaprogramming. Domain patterns concern how you abstract common business problems. Whereas technical patterns appear in virtually all kinds of software, your domain patterns differ as much as one business differs from another. However, a rich set of techniques exists for harvesting them, which is the subject of this and the next few installments of this series.

This article provides motivation for using DSL techniques as an abstraction style for harvesting domain patterns. DSLs offer a wealth of options, including their own pattern nomenclature. Martin Fowler's most recent book is a deep-dive investigation of DSL techniques (see Related topics). I'll use many of his pattern names and a mix of his and my own examples in subsequent installments as I start showing specific techniques.

Motivation for DSLs

Why go to all the trouble of creating a DSL just to harvest an idiomatic pattern? As I pointed out in "Leveraging reusable code, Part 2," one of the best ways to differentiate an idiomatic pattern from the rest of your code is to make it look different. This visual differentiation is an immediate clue that you're not looking at a regular API. Similarly, one of the goals of using DSLs is to write code that looks less like source code and more like the problem you're trying to solve. If you can achieve this goal (or even get closer to it than you are now), you bridge an important gap in most software projects: communication between developers and business stakeholders. Allowing users to read your code is a huge benefit because it eliminates the need to translate your code into the vernacular, a task that is prone to error. By making your code readable to the nontechnical people who know what the software is supposed to do, you can have a more engaged conversation with them.

By way of motivation for using this technique, I'll borrow one of Fowler's examples from his DSL book (see Related topics). Let's say I work for a company that makes software-controlled secret compartments (think James Bond). One of my company's clients, Mrs. H., wants us to install a secret compartment in her bedroom. However, my company uses Java™-powered toasters left over from the dot-com bust to run the software. Although the toasters are cheap, it is expensive to reflash the software on them. So, I need to create the basic secret-compartment code and put it permanently on the toasters, then figure out a way to configure each individual client's secret-compartment needs. As you recognize, this is a common problem in the modern software world: general behavior that doesn't change often, coupled with configuration that you can change for individual circumstances.

Mrs. H. wants a secret compartment that opens when you first close the bedroom door, then open the second drawer of her dresser, and finally turn on the bedside light. These activities must happen in sequence, and if anything breaks the sequence, you must start over from the beginning. You can imagine the software that controls her secret compartment as the state machine illustrated in Figure 1:

Figure 1. Mrs. H.'s secret compartment as a state machine
state machine diagram
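
To make the sequence-and-reset behavior concrete, here is a minimal table-driven sketch in Java. It is not the API used in the listings that follow: the class `CompartmentSketch` and its methods are hypothetical names chosen for illustration. The state and event names are the ones used in the configuration listings in this article, and `doorOpened` is treated as the reset event, as in the XML configuration. Ignoring an event that has no transition from the current state is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a table-driven state machine for Mrs. H.'s compartment.
public class CompartmentSketch {
  // state name -> (event name -> next state name)
  private final Map<String, Map<String, String>> transitions = new HashMap<>();
  private final String start;
  private final String resetEvent;
  private String current;

  public CompartmentSketch(String start, String resetEvent) {
    this.start = start;
    this.resetEvent = resetEvent;
    this.current = start;
  }

  public void addTransition(String from, String event, String to) {
    transitions.computeIfAbsent(from, k -> new HashMap<>()).put(event, to);
  }

  // Follows a matching transition; the reset event returns to the start state.
  // Assumption: events with no transition from the current state are ignored.
  public String handle(String event) {
    if (event.equals(resetEvent)) {
      current = start;
    } else {
      Map<String, String> outgoing = transitions.get(current);
      if (outgoing != null && outgoing.containsKey(event)) {
        current = outgoing.get(event);
      }
    }
    return current;
  }

  // Builds the machine for Mrs. H. using the names from the article's listings.
  public static CompartmentSketch mrsH() {
    CompartmentSketch m = new CompartmentSketch("idle", "doorOpened");
    m.addTransition("idle", "doorClosed", "active");
    m.addTransition("active", "drawerOpened", "waitingForLight");
    m.addTransition("active", "lightOn", "waitingForDrawer");
    m.addTransition("waitingForLight", "lightOn", "unlockedPanel");
    m.addTransition("waitingForDrawer", "drawerOpened", "unlockedPanel");
    m.addTransition("unlockedPanel", "panelClosed", "idle");
    return m;
  }

  public static void main(String[] args) {
    CompartmentSketch m = mrsH();
    m.handle("doorClosed");
    m.handle("drawerOpened");
    System.out.println(m.handle("lightOn"));    // unlockedPanel
    System.out.println(m.handle("doorOpened")); // idle
  }
}
```

Notice that the general behavior (`handle`) never changes, while everything specific to Mrs. H. lives in `mrsH()`. That split is exactly the part worth moving out of compiled code.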

The underlying state machine API is simple. I create an abstract event class, shown in Listing 1, that handles both events and commands within the state machine:

Listing 1. Abstract events for state machine
public class AbstractEvent {
  private String name, code;

  public AbstractEvent(String name, String code) {
    this.name = name;
    this.code = code;
  }

  public String getCode() { return code; }
  public String getName() { return name; }
}

I can model the states in the state machine with another simple class called States, shown in Listing 2:

Listing 2. The start of the state-machine class
public class States {
  private State content;
  private List<TransitionBuilder> transitions = new ArrayList<TransitionBuilder>();
  private List<Commands> commands = new ArrayList<Commands>();

  // The superclass declaration and imports are not shown in this excerpt.
  public States(String name, StateMachineBuilder builder) {
    super(name, builder);
    content = new State(name);
  }

  State getState() {
    return content;
  }

  public States actions(Commands... identifiers) {
    // Statements that record the commands are omitted from this excerpt.
    return this;
  }

  public TransitionBuilder transition(Events identifier) {
    return new TransitionBuilder(this, identifier);
  }

  void addTransition(TransitionBuilder arg) {
    transitions.add(arg);
  }

  void produce() {
    for (Commands c : commands) {
      // Per-command work is omitted from this excerpt.
    }
    for (TransitionBuilder t : transitions) {
      // Per-transition work is omitted from this excerpt.
    }
  }
}

Listing 1 and Listing 2 are just here for reference. The interesting problem to solve is the representation of the configuration of the state machine. This representation is an idiomatic pattern for the business of installing secret compartments. Listing 3 shows a Java-based configuration for the state machine:

Listing 3. One configuration option: Java code
Event doorClosed = new Event("doorClosed", "D1CL");
Event drawerOpened = new Event("drawerOpened", "D2OP");
Event lightOn = new Event("lightOn", "L1ON");
Event doorOpened = new Event("doorOpened", "D1OP");
Event panelClosed = new Event("panelClosed", "PNCL");

Command unlockPanelCmd = new Command("unlockPanel", "PNUL");
Command lockPanelCmd = new Command("lockPanel", "PNLK");
Command lockDoorCmd = new Command("lockDoor", "D1LK");
Command unlockDoorCmd = new Command("unlockDoor", "D1UL");

State idle = new State("idle");
State activeState = new State("active");
State waitingForLightState = new State("waitingForLight");
State waitingForDrawerState = new State("waitingForDrawer");
State unlockedPanelState = new State("unlockedPanel");

StateMachine machine = new StateMachine(idle);

idle.addTransition(doorClosed, activeState);

activeState.addTransition(drawerOpened, waitingForLightState);
activeState.addTransition(lightOn, waitingForDrawerState);

waitingForLightState.addTransition(lightOn, unlockedPanelState);

waitingForDrawerState.addTransition(drawerOpened, unlockedPanelState);

unlockedPanelState.addTransition(panelClosed, idle);


Listing 3 highlights several problems with using Java code for the state machine's configuration. First, it's not obvious by reading it that this is the configuration for a state machine. Like most Java APIs, it is a pile of undifferentiated code. Second, it is verbose and repetitive. For example, variable names are used over and over again as I set more and more states and transitions for each part of the state machine. All this duplication makes the code harder to read. Third, this code doesn't meet the original goal of being able to configure secret compartments without recompiling code.

Actually, you almost never see code like this anymore in the Java world, which tends to prefer XML for configuration code. Writing the configuration in XML is simple, as shown in Listing 4:

Listing 4. State-machine configuration in XML
<stateMachine start="idle">
  <event name="doorClosed" code="D1CL"/>
  <event name="drawerOpened" code="D2OP"/>
  <event name="lightOn" code="L1ON"/>
  <event name="doorOpened" code="D1OP"/>
  <event name="panelClosed" code="PNCL"/>

  <command name="unlockPanel" code="PNUL"/>
  <command name="lockPanel" code="PNLK"/>
  <command name="lockDoor" code="D1LK"/>
  <command name="unlockDoor" code="D1UL"/>

  <state name="idle">
    <transition event="doorClosed" target="active"/>
    <action command="unlockDoor"/>
    <action command="lockPanel"/>
  </state>

  <state name="active">
    <transition event="drawerOpened" target="waitingForLight"/>
    <transition event="lightOn" target="waitingForDrawer"/>
  </state>

  <state name="waitingForLight">
    <transition event="lightOn" target="unlockedPanel"/>
  </state>

  <state name="waitingForDrawer">
    <transition event="drawerOpened" target="unlockedPanel"/>
  </state>

  <state name="unlockedPanel">
    <action command="unlockPanel"/>
    <action command="lockDoor"/>
    <transition event="panelClosed" target="idle"/>
  </state>

  <resetEvent name="doorOpened"/>
</stateMachine>

The code in Listing 4 has several advantages over the Java version. First, I have late binding, which means that I can make changes to the configuration code and drop it in the toaster, allowing an XML parser to read the new configuration. Second, for this particular problem, this code is much more expressive because XML includes the concept of containership: States include their configuration as child elements. This helps remove the annoying redundancy that appears in the Java version. Third, this code is inherently declarative. Often, declarative code is much easier to read if you're just making statements and don't need if and while syntax.
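
As a rough illustration of the late-binding point, the sketch below reads transitions out of an XML document at run time using the JDK's built-in DOM parser. The class `XmlConfigSketch` and the method `readTransitions` are hypothetical names, not part of the framework described here, and the sketch assumes the document is well-formed, with each `state` element enclosing its own `transition` elements. A real toaster would read the XML from a file; a string is used here only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: load state-machine transitions from XML at run time.
public class XmlConfigSketch {

  // Returns each transition as "state --event--> target".
  public static List<String> readTransitions(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      List<String> result = new ArrayList<>();
      NodeList states = doc.getElementsByTagName("state");
      for (int i = 0; i < states.getLength(); i++) {
        Element state = (Element) states.item(i);
        // Containership: transitions are found as children of their state.
        NodeList ts = state.getElementsByTagName("transition");
        for (int j = 0; j < ts.getLength(); j++) {
          Element t = (Element) ts.item(j);
          result.add(state.getAttribute("name") + " --" + t.getAttribute("event")
              + "--> " + t.getAttribute("target"));
        }
      }
      return result;
    } catch (Exception e) {
      throw new IllegalStateException("could not read configuration", e);
    }
  }

  public static void main(String[] args) {
    String xml = "<stateMachine start=\"idle\">"
        + "<state name=\"idle\">"
        + "<transition event=\"doorClosed\" target=\"active\"/>"
        + "</state>"
        + "</stateMachine>";
    System.out.println(readTransitions(xml)); // [idle --doorClosed--> active]
  }
}
```

Changing Mrs. H.'s configuration now means changing the XML text, not this class.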

Take a step back for a moment and realize the implications. Externalizing configuration is such a common pattern in the modern Java world that we don't even think of it as a distinct entity anymore. However, it is a feature of virtually every Java framework. Configuration is an idiomatic pattern, and we need ways of capturing it that separate and differentiate it from the general behavior of the surrounding framework. By using XML for configuration, I am writing the code in an external DSL (the syntax is XML and the grammar is defined by the schema associated with this XML document) so that I don't need to recompile my framework code to make changes to it.

I don't need XML itself to get the advantages of the XML version. Consider the version of the configuration code shown in Listing 5:

Listing 5. A custom-grammar state-machine configuration
events
  doorClosed   D1CL
  drawerOpened D2OP
  lightOn      L1ON
  doorOpened   D1OP
  panelClosed  PNCL
end

resetEvents
  doorOpened
end

commands
  unlockPanel PNUL
  lockPanel   PNLK
  lockDoor    D1LK
  unlockDoor  D1UL
end

state idle
  actions {unlockDoor lockPanel}
  doorClosed => active
end

state active
  drawerOpened => waitingForLight
  lightOn      => waitingForDrawer
end

state waitingForLight
  lightOn => unlockedPanel
end

state waitingForDrawer
  drawerOpened => unlockedPanel
end

state unlockedPanel
  actions {unlockPanel lockDoor}
  panelClosed => idle
end

This version of the code has many of the benefits of the XML version: it is declarative, has containership, and is concise. It has advantages over both the XML and Java versions because it has far fewer noise characters (such as < and >) that are required for the technical implementation but harm readability.

This version of the configuration code is a custom external DSL written using ANTLR, an open source tool that makes it easy to write your custom languages (see Related topics). Those of you who still have nightmares about compiler class in university (including such classic tools as Lex and YACC) will be glad to know the tools have gotten much better. This example is from Fowler's book, and he says that building the XML version and building the custom-language version took about the same amount of time.
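
To give a feel for how little machinery the transition syntax needs, here is a hand-written line scanner in Java. To be clear, this is not ANTLR and not Fowler's grammar: it is a swapped-in, regular-expression approach that understands only `state` lines and `event => target` lines and skips everything else. The names `CustomSyntaxSketch` and `readTransitions` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: scan the custom syntax for states and transitions only.
public class CustomSyntaxSketch {
  private static final Pattern STATE = Pattern.compile("^state\\s+(\\w+)$");
  private static final Pattern TRANSITION =
      Pattern.compile("^(\\w+)\\s*=>\\s*(\\w+)$");

  // Returns each transition as "state --event--> target".
  public static List<String> readTransitions(String script) {
    List<String> result = new ArrayList<>();
    String current = null; // name of the state block being read, if any
    for (String raw : script.split("\\R")) {
      String line = raw.trim();
      Matcher s = STATE.matcher(line);
      Matcher t = TRANSITION.matcher(line);
      if (s.matches()) {
        current = s.group(1);
      } else if (t.matches() && current != null) {
        result.add(current + " --" + t.group(1) + "--> " + t.group(2));
      }
      // Every other line (event and command tables, actions) is skipped here.
    }
    return result;
  }

  public static void main(String[] args) {
    String script = "state active\n"
        + "  drawerOpened => waitingForLight\n"
        + "  lightOn    => waitingForDrawer\n";
    // [active --drawerOpened--> waitingForLight, active --lightOn--> waitingForDrawer]
    System.out.println(readTransitions(script));
  }
}
```

A scanner like this stops scaling as soon as the language grows nested or optional constructs, which is where a grammar-driven tool such as ANTLR earns its keep.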

Listing 6 contains another alternative, written in Ruby:

Listing 6. State-machine configuration in JRuby
event :doorClosed, "D1CL"
event :drawerOpened, "D2OP"
event :lightOn, "L1ON"
event :doorOpened, "D1OP"
event :panelClosed, "PNCL"

command :unlockPanel, "PNUL"
command :lockPanel, "PNLK"
command :lockDoor, "D1LK"
command :unlockDoor, "D1UL"

resetEvents :doorOpened

state :idle do
  actions :unlockDoor, :lockPanel
  transitions :doorClosed => :active
end

state :active do
  transitions :drawerOpened => :waitingForLight,
              :lightOn => :waitingForDrawer
end

state :waitingForLight do
  transitions :lightOn => :unlockedPanel
end

state :waitingForDrawer do
  transitions :drawerOpened => :unlockedPanel
end

state :unlockedPanel do
  actions :unlockPanel, :lockDoor
  transitions :panelClosed => :idle
end

This is a good example of an internal DSL: a DSL that uses the syntax of a base language, which means this DSL must be syntactically legal Ruby code. (Because it's written in Ruby, you can run it via JRuby, which means that all your toaster needs is the JRuby JAR file.)

Listing 6 has many of the same advantages as the custom language. Notice heavy use of Ruby blocks to act as containers, which gives you the same kind of containership semantics as the XML and custom-language versions. It does use a few more noise characters than the custom language. For example, the : prefix in Ruby indicates a symbol, which in this case is basically an immutable string used as an identifier.

Implementing this kind of DSL in Ruby is quite simple, as shown in Listing 7:

Listing 7. Partial class definition for the JRuby DSL
class StateMachineBuilder
  attr_reader :machine, :events, :states, :commands

  def initialize
    @events = {}
    @states = {}
    @state_blocks = {}
    @commands = {}
  end

  def event name, code
    @events[name] = Event.new(name.to_s, code)
  end

  def state name, &block
    @states[name] = State.new(name.to_s)
    @state_blocks[name] = block
    # The first state declared becomes the start state.
    @start_state ||= @states[name]
  end

  def command name, code
    @commands[name] = Command.new(name.to_s, code)
  end
end

Ruby has flexible rules about syntax, which makes it well suited for this type of DSL. For example, when declaring an event, you're not forced to include the parentheses as part of the method call. In this version, you don't need to write your own language or hurt yourself with angle brackets. This helps illustrate why this approach is so popular in the Ruby world.
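
For comparison, Java can approximate an internal DSL through method chaining, the style that the `actions` and `transition` methods in Listing 2 hint at by returning builder objects. The sketch below is hypothetical throughout: `FluentSketch`, `state`, `on`, `go`, and `target` are names invented for illustration, and the sketch assumes callers always call `state` before `on` and `on` before `go`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a method-chaining (fluent) configuration in Java.
public class FluentSketch {
  // state name -> (event name -> target state name)
  private final Map<String, Map<String, String>> table = new LinkedHashMap<>();
  private String currentState;
  private String pendingEvent;

  // Starts (or resumes) the definition of a state.
  public FluentSketch state(String name) {
    currentState = name;
    table.putIfAbsent(name, new LinkedHashMap<>());
    return this;
  }

  // Remembers the event for the transition being defined.
  public FluentSketch on(String event) {
    pendingEvent = event;
    return this;
  }

  // Completes the transition: currentState --pendingEvent--> target.
  public FluentSketch go(String target) {
    table.get(currentState).put(pendingEvent, target);
    return this;
  }

  // Looks up a configured transition; returns null when none exists.
  public String target(String state, String event) {
    Map<String, String> outgoing = table.get(state);
    return outgoing == null ? null : outgoing.get(event);
  }

  public static void main(String[] args) {
    FluentSketch machine = new FluentSketch()
        .state("idle").on("doorClosed").go("active")
        .state("active").on("drawerOpened").go("waitingForLight")
                        .on("lightOn").go("waitingForDrawer");
    System.out.println(machine.target("active", "lightOn")); // waitingForDrawer
  }
}
```

The mandatory dots, parentheses, and quotation marks are the kind of noise that Ruby's looser syntax lets you leave out.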

Characteristics of DSLs

DSLs offer a nice alternative syntax for capturing idiomatic patterns. As defined by Martin Fowler, DSLs have five key characteristics.

Computer programming language

To be a DSL, a language must be a computer programming language. Without this restriction in place, it's easy to get on a slippery slope where everything you encounter is a DSL. If you define the term DSL too broadly, then all contextualized conversation would be DSLs. For example, I have colleagues who are cricket enthusiasts. When I walk by them while they are talking about cricket, I can't understand what they are saying even though they are using English words. I lack the appropriate context to understand the way they are using those words. Thus, you could argue that cricket and other sports have DSLs in their terminology. But leaving the definition this broad makes it hard to narrow it to useful constraints — hence Fowler's insistence on restricting it to computer programming languages.

Language nature

Fowler's second criterion for a DSL is that it have a "language nature," which means that your DSL should be at least vaguely readable by nonprogrammers. This language nature can take a variety of forms, many of which I will show you in subsequent installments as I continue investigating the use of DSLs as a way to capture idiomatic patterns.

Domain focus

To be a proper DSL, a language must be narrowly focused on a particular problem domain. One of the hazards of trying to create DSLs is making them too broad. DSLs are an abstraction mechanism, and creating an abstraction that is too broad decreases the benefits of the abstraction.

Limited expressiveness

It is also typical of DSLs to have limited expressiveness. For example, it is quite rare to find a DSL that includes control structures such as looping and decisions. The DSL should be focused particularly and solely on the domain it is trying to describe. As a result, quite a few DSLs are declarative rather than imperative.

Not Turing complete

The preceding two criteria suggest this characteristic, but I'll formalize it here. Your DSL should not be Turing complete (see Related topics). In fact, it is considered an antipattern in DSLs for them to become Turing complete accidentally. For example, the classic UNIX® sendmail configuration file is accidentally Turing complete. You could write an operating system into the sendmail configuration file if you are so inclined and have way too much free time on your hands.

It is surprisingly easy to become Turing complete accidentally. Some familiar infrastructure tools have made that transition accidentally — XSLT, for example. The determination whether or not a language is a DSL sometimes depends on the context in which it's being used. When using XSLT to transform one version of text to another version of text, you are using it as a DSL. If you use XSLT to solve the Towers of Hanoi problem, you are using it as a Turing complete language (and you should probably try to find a new hobby).


Conclusion

In this installment, I laid the groundwork for using DSLs as an extraction mechanism for harvesting idiomatic patterns. DSLs work quite nicely for this purpose because they are easily differentiated from regular APIs, they tend to be declarative in nature, and they improve the communication feedback loop between developers and nondevelopers on projects. In the next few installments, I'll demonstrate several techniques for constructing DSLs that you can leverage in your quest for discovery and design in code.

Related topics

  • The Productive Programmer (Neal Ford, O'Reilly Media, 2008): Neal Ford's most recent book expands on a number of the topics in this series.
  • ANTLR: ANTLR is a powerful open source tool for building languages and grammars.
  • Domain Specific Languages (Martin Fowler, Addison-Wesley, 2010): Fowler's new book is available in beta form.
  • Turing completeness: Read Wikipedia's article on this concept.

