In architecture design and software engineering, you need a clear understanding of the domain for the architecture, and you need to be able to communicate that understanding effectively to other people. You can use various techniques and tools to approach this challenge, such as using a domain-specific language (DSL) and domain-specific modeling (DSM). DSM acts as the front end of DSL, allowing the user to express constructs through a visual representation.
This article focuses on using Eclipse Modeling Framework (EMF) and Graphical Modeling Framework (GMF) technologies to show how to produce DSM tooling aid for DSLs.
To develop DSL and DSM tools using EMF and GMF, you need skills in the following areas:
- Modeling and metamodeling concepts and techniques
- Modeling with EMF
- Using the GMF models for tool development
- Model to model transformation concepts and techniques
- Model to artifact transformation concepts and techniques
- Software engineering and programming:
  - Java programming
  - Understanding the EMF API
Domain-specific languages and modeling
There are many areas in everyday life where DSLs are applicable or used. Some real-life DSLs are: traffic and road signs, flat-pack furniture assembly plans, a chess game (and most other table games), and electronic circuits design. IT-related DSL examples include HTML, SQL, WS-* standards, Business Process Execution Language (BPEL), and POV scene description language.
DSM examples related to these IT examples include WYSIWYG HTML editors, BPEL editors, and POV Modeler.
Defining a DSL is valuable because DSLs help to focus on a specific area by setting a very specific scope. On one end, a carefully set scope ensures that knowledge and expertise are captured in a structured and detailed manner by the domain experts. On the other end, the expertise and skills are extracted and reused by the users of the domain.
Another important aspect of a DSL is that it enables you to specify granularity, which means finding the middle ground between a verbose description and a vague description of a domain. Granularity is covered in the Defining DSLs section.
Why should you use DSLs instead of a general-purpose language (GPL)? GPLs, compared with DSLs, use a vocabulary that is simple and basic enough to describe any domain without the specifics. The same level of expression and understanding of a domain is possible using GPLs, but the expected level of knowledge regarding the domain and the general language is considerably higher compared to a DSL approach. A major advantage of a DSL is that it requires significantly less time to understand and to communicate details of a domain. It also requires less time to learn to use the pertinent tooling.
For example, business applications are often implemented using complex software solutions, but most of the solutions use the same building blocks (patterns) to deliver business functions. A DSL enables you to abstract the software solution and hide the implementation details. A DSL can also use the vocabulary from the business domain and provide the translation for the IT domain.
Unfortunately, there are instances of badly used DSLs (or anti-patterns). In such cases, a DSL is used in the wrong context and often for the wrong purpose. A few examples of anti-patterns are:
- DSL with the wrong context, mixing the levels of abstractions (for example, mixing use cases and user interface details)
- DSL with the wrong context, with failure to separate concerns (such as mixing persistence details and user interface details)
- DSL with too many constraints, a result of rigid structure (for example, configuration details for a complex application)
- DSL with too many elements (often a result of not identifying patterns), or a DSL in which model element instances end up in the metamodel
- DSLs with redundant content (for example, mixing user interaction flow for different types of user interfaces, including Web browser, and Web page interaction flow)
The domain is a well-defined subject area, described using a limited set of concepts. Domains in DSL and DSM are often captured in the form of a metamodel, as shown in Figure 1. Setting the right scope for a domain and defining the boundaries properly are key success factors (discussed later).
Domains for DSL can be defined in many different ways using different notations, techniques, and tools. Most often the domain specified for DSL is described as a metamodel. It is a popular approach because:
- Metamodels are self-describing: a DSL is used to capture the metamodel (domain) for the DSL.
- Metamodels are relatively easy to understand; they are similar to an entity-relationship model.
- Metamodels are easy to use and reuse for further processing (model-driven development).
- Metamodels fit well with the Model-Driven Architecture approach.
Figure 1. Concepts in the meta-instance quadrant

Using an entity-relationship model is one form of defining a DSL. There are other approaches and forms, but this article focuses on the entity-relationship modeling technique because it is more closely related to the technologies described later. Eclipse's EMF project also follows the entity-relationship modeling technique to capture DSLs. Entity-relationship models are relatively easy to capture, easy to maintain, easy to support with tooling, and the concepts are generally well understood.
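To make the two levels concrete, the following is a minimal sketch in plain Python, not the EMF API. The names MetaClass, Element, conforms, and the Topic concept are all assumptions introduced for illustration. It shows a metamodel expressed as entities with attributes and references, and model instances that are checked against it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MetaClass:
    """An entity in the metamodel: a concept with attributes and references."""
    name: str
    attributes: set[str] = field(default_factory=set)
    # reference name -> name of the target MetaClass
    references: dict[str, str] = field(default_factory=dict)


@dataclass
class Element:
    """A model element: an instance of exactly one MetaClass."""
    meta: MetaClass
    values: dict[str, object] = field(default_factory=dict)
    links: dict[str, list[Element]] = field(default_factory=dict)


def conforms(element: Element, metamodel: dict[str, MetaClass]) -> bool:
    """Return True if the element uses only features its metaclass declares."""
    meta = element.meta
    if metamodel.get(meta.name) is not meta:
        return False
    if not set(element.values) <= meta.attributes:
        return False
    for ref_name, targets in element.links.items():
        target_type = meta.references.get(ref_name)
        if target_type is None:
            return False
        if any(t.meta.name != target_type for t in targets):
            return False
    return True


# Metamodel level (hypothetical domain with a single concept)
topic = MetaClass("Topic", attributes={"name"}, references={"subtopics": "Topic"})
metamodel = {"Topic": topic}

# Instance level
child = Element(topic, values={"name": "GMF"})
root = Element(topic, values={"name": "Modeling"}, links={"subtopics": [child]})
bad = Element(topic, values={"colour": "red"})  # 'colour' is not declared
```

In this sketch, root and child conform to the metamodel, while bad does not, because it uses an attribute the metamodel never declared.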
The secret to successful domain specifications is having a good metamodel describing the domain. (Elements of a good metamodel are discussed later.) Capturing metamodels is not a trivial task and requires a fair amount of skill and experience. The skills can be acquired with practice. Capturing a metamodel is a type of abstraction, which is a skill often found with architects. The practical skill required for this activity is to be able to flatten a hierarchical (tree-like) structure.
There are different approaches, methods, or techniques to capturing a metamodel. The following example might help you get started.
- With paper and pencil, start to draw an instance of the model you want to have as a result. Keep in mind the following simple concepts:
  - You are drawing a graph, not a tree (although trees are also graphs).
  - Use boxes to represent elements.
  - Use directed connections between the boxes.
  - Use boxes inside boxes for containment.
  - Don't worry too much about the look and feel at this point, though it's good to have an idea of how to represent the elements later. The same goes for other elements, such as connections.
  - Start thinking about the attributes of the various boxes.
  - Keep the visual placement of the elements irrelevant; for example, ignore details such as "box on the left" or "box(x,y) at (10,10)".
  - Consider capturing and drawing more than one instance.
  - Consider capturing different representations (structures) of the same model. This approach can help you to understand the domain better, optimize the metamodel, and enable you to develop a tool-friendly metamodel.
- As soon as you have a good set of model instances that seem to represent the concepts you want to capture and communicate, start using a modeling tool (EMF here) to capture the metamodel and to flatten the hierarchy of elements. The constraints of the modeling environment and tooling will guide you to develop a valid metamodel.
- There is no easy way to test your metamodels, so the best option is trial and error until the model (based on the metamodel you have built) closely matches what you envisioned. Fortunately, creating the tools is quick and easy, so the trial process is relatively rapid.
  Do not build diagram editors right away from the first instance of the metamodel; use simpler model editors, such as the EMF tree editor, first.
- Once the tree editor works as expected, it's time to build the first diagram editor. It will likely be an iterative process as well, using the trial and error approach again.
  The metamodel may not be "tool friendly," meaning the constraints of the metamodel don't allow you to build the visual editor as originally planned. In this case, the only option is to go back to the original metamodel and try to capture the same concepts using another structure and constructs. Designing tool-friendly metamodels is a skill acquired from experience.
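One way to picture "flattening" a paper sketch is to turn nested boxes into a flat table of elements plus explicit, directed connections. The following plain Python sketch is only an illustration under that assumption; the dictionary layout, the flatten function, and the relatesTo connection are hypothetical and are not part of EMF.

```python
def flatten(tree, parent_id=None, nodes=None, edges=None):
    """Flatten nested boxes into a node table and a list of directed edges."""
    if nodes is None:
        nodes, edges = {}, []
    node_id = len(nodes)
    nodes[node_id] = {"type": tree["type"], "name": tree["name"]}
    if parent_id is not None:
        # "Boxes inside boxes" become explicit containment edges.
        edges.append((parent_id, "contains", node_id))
    for child in tree.get("children", []):
        flatten(child, node_id, nodes, edges)
    return nodes, edges


# A paper sketch captured as nested boxes (hypothetical example)
sketch = {
    "type": "Map", "name": "Project",
    "children": [
        {"type": "Topic", "name": "Scope",
         "children": [{"type": "Topic", "name": "Granularity"}]},
        {"type": "Topic", "name": "Tooling"},
    ],
}

nodes, edges = flatten(sketch)

# A directed, non-containment connection is easy to add once the structure
# is a graph; it could not be expressed in the nested (tree) form.
ids = {node["name"]: node_id for node_id, node in nodes.items()}
edges.append((ids["Tooling"], "relatesTo", ids["Scope"]))
```

The flat form is closer to what an entity-relationship metamodel describes: element types, their attributes, and the kinds of connections allowed between them.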
Consider the following items when trying to capture a good metamodel.
- Defining the scope
  - Make sure the level of information captured in the model is consistent and relevant.
  - Capture only relevant information; do not build or use complex general models.
  - Capture the appropriate level of detail in the models rather than mixing levels of abstraction.
  - Apply the separation of concerns principle while capturing the metamodel (componentize your models).
  - Make sure the information captured in the model is not duplicated in other models in the development process (DSM). Ideally, all information has one authoritative view or model, and other models simply reflect its content.
  For example, modeling an infrastructure would start with capturing conceptual nodes, focusing on location and possibly function, but without concern for performance, availability, or similar qualities. Next, you might build a more specific model, including the concerns discarded previously (performance, availability, and so on). At a very detailed level, the model might include physical details of the infrastructure, including network addresses, machine types, specs, and so forth.
- Granularity
  - There is no magic formula to calculate or decide the right level of granularity. Ask the following questions to help you decide:
    - How much information should be captured in the model?
    - How much is enough to have a good comprehension of the domain and is easy to communicate?
    - When is it too much?
  - Capturing too much detail is like developing code using a model or domain-specific language. In this case, not much is left for the transformation.
  - Not capturing enough information in the model prevents the right level of comprehension and communication. In this case, most of the details have to be included in the transformation.
- Identifying domains
  - With DSLs, it's valid to question how many domains should be identified. It depends on the context of the domain. Like many IT terms, domain is an overloaded term. Usually, you need to differentiate between two things when considering the number of domains to be specified.
  - First, there is always a larger domain that is correctly referred to as a domain. It is always applicable to a larger context where some of the elements are only remotely related. Java™ 2 Platform, Enterprise Edition (J2EE) applications are an example. This large domain includes Enterprise JavaBeans (EJBs), servlets, JavaServer Pages (JSPs) components, and other related artifacts.
  - Second, the context for domains in DSL is much smaller and more focused than in the preceding case. If the term domain is not specific enough, call these domains subdomains or micro-domains. EJB modules in J2EE applications are an example. This specific domain only considers the various types of EJBs and the artifacts related to them.
- Taking care of the semantics
  - All the possible arrangements of the elements (vocabulary) have a specific interpretation in the domain. There is no ambiguity about how to arrange the language elements or how to interpret them.
  - All the things that need to be expressed in the defined scope of the domain can be expressed using the language (DSL).
  - Constraints defined in the metamodel are a key enabler for semantics. Constraints can ensure that invalid arrangements will not occur in the models.
- Tool friendliness
  - With the metamodel, tool friendliness refers to how easy it is to design and develop a tool for a specific metamodel (DSL).
  - Being "tool friendly" is not a scientific measure; it's a way to express the quality of a model from a tool development aspect.
  - Deciding on tool friendliness is a simple test: If it is easy to develop tooling (diagram editor in this case) for a model, the model is tool friendly. Employing tool friendliness is a good practice for enabling:
    - Easier and faster tool development
    - Rapid prototype development
    - Increased tooling consumability
    - Better understanding of the domain
    - Better construction and a higher chance of reuse of the models
In engineering sciences like IT, experts use models, diagrams, and sketches to describe specific details of a problem or a solution. The need for visual representation arises from the high degree of complexity surrounding the industry. Abstraction and automation justify the need for visual modeling.
Engineers in IT work with various inputs, outputs, work products, and deliverables. One output may become the input for another work product. Models and diagrams are often part of or are the actual work products. The flow of information, the reuse of previous results, and the automation of the workflow justify the use of models over simple diagrams.
Building and using a tooling approach should enable:
- Better understanding of the domain for the architect or programmer
- A tool-friendly metamodel
- A high possibility for reuse
- Information about consumability and usability
- The design to get closer to a metamodel
- Semantics in the captured model that are potentially shareable and reusable in other situations
In the spirit of "A picture is worth a thousand words," a visual representation of a model using elements defined for a specific domain can tell the whole story of a solution. Although there are many techniques and technologies available to provide tooling support for DSM, this article discusses one option: DSM using EMF and GMF.
Before diving into the details of using EMF and GMF, let's look at a brief comparison between DSL/DSM and Unified Modeling Language (UML) paths. It is important to understand that the different paths are not there to compete with each other.
Modeling with DSL and UML are at the opposite ends of the spectrum, in some respects. UML is a unified (or general) modeling language; it can support literally any model. DSM is a domain-specific modeling language. It can only support specific types of models.
Using LEGO as an analogy, think of UML as a pile of basic, standard building blocks in a few different colors and a few different sizes. DSL is like a set of large-size building blocks from the medieval knights castle kit, rock colored and textured -- large elements resembling a whole stone wall segment of a medieval castle. You can imagine how one could build a very attractive, functional castle from the kit. But, the castle built from the basic blocks would lack the proper coloring, textures, and medieval characteristics.
There are two separate skill areas: skills required to define a DSL and the skills required to use the language.
- Developing the language (DSL) or the modeling tool (DSM)
  - Using EMF and GMF requires knowledge of EMF, which is nothing more than an entity-relationship modeling tool. GMF itself is a DSL you have to learn in order to develop the graphical diagram editor.
  - Development with UML2 requires a good understanding of the UML2 elements, their relationships, and how to define stereotypes to customize them. The graphical diagram editor is given. If the capabilities of the editor are not sufficient, there is a certain degree of customization available that requires specific development skills.
- Developing with the language (DSL) or the modeling tool (DSM) itself
  - With EMF and GMF, the development is straightforward as soon as the specific domain is understood.
  - In the case of UML2, the developer has to be familiar with the specific domain and UML2 itself.
UML2 has a larger user base than EMF and GMF, but the EMF/GMF camp is growing (judging by activities on mailing lists and the increasing number of articles on the topic). On the other hand, when it comes to developing a DSL/DSM, EMF and GMF are often the technology of choice. The main reason for choosing EMF and GMF is the lack of support for some of the more complex constructs in UML diagrams.
Team development is an important aspect when comparing UML2 in IBM® Rational® Software Architect with EMF and GMF in Rational Software Architect. Beyond the standard team development support based on the Eclipse workbench, the UML2 tools in Rational Software Architect have advanced modeling capabilities to support team development. These capabilities include model comparison, difference highlighting, and model merging. EMF currently does not have the same level of support for team development.
Exploring the EMF and GMF path
EMF is an established technology from the Eclipse workbench. Details about using EMF are outside the scope of this article. See Resources for an overview and information about advanced features.
GMF, a relatively new project in Eclipse, provides a framework for developing graphical tools and diagram editors for the Eclipse workbench. The framework consists of two parts:
- GMF run time (part of Eclipse and provides the basic, common elements and services for graphical tooling)
- GMF tooling (which helps the development of the Eclipse plug-ins that are using the GMF run time to deliver the final modeling tool)
The tooling for GMF itself is a set of DSLs specializing in the development of graphical modeling tools. Essentially, GMF consolidates all the functions, capabilities, and patterns from various graphical modeling tools (superset), extracts the variability from the superset of tools, and finally, presents the development patterns with a modeling front end.
GMF is ideal for developing tools for DSL/DSM because it:
- Is easy and fast
- Follows a model-driven approach and produces results without writing a single line of code
- Provides the level of sophistication to meet most of the complex and demanding requirements for graphical modeling tool development
- Produces high-quality tools
Figure 2 shows the most basic approach for building the tool for modeling, modeling itself, and producing various artifacts from the models.
Figure 2. Basic modeling approach

Essentially, there are two points to start from, as noted in the figure:
- Creating the metamodel first, without concern for the final artifacts
- Using the final artifacts to drive the development of the metamodel, or generating the metamodel from them
Starting from the metamodel, the developer has to establish the metamodel or language for the specific domain. Constraints can enrich the metamodel to ensure the semantic correctness and validation of the model instances. Most of the graphical editor can be generated from the metamodel, but other parts have to be manually defined. A new set of constraints can be specified for the graphical editor because the graphical representation may use different constructs for modeling than the original metamodel. Model instances can be created and edited using the graphical editor. These models are the results of model-driven development. Models become the input to transformations to generate the final artifacts (code, for example).
The two entry points mentioned earlier relate to the two fundamental approaches to modeling: top-down and bottom-up.
The top-down modeling approach starts with capturing the metamodel, then developing the transformations and code generators to produce the artifacts. Table 1 shows the advantages and disadvantages of top-down modeling:
Table 1. Advantages and disadvantages of top-down modeling
| Advantages | Disadvantages |
|---|---|
|
|
The bottom-up modeling approach starts from the artifacts, building a model based on the instances, then finally defining the metamodel from the models. Table 2 shows the advantages and disadvantages of bottom-up modeling.
Table 2. Advantages and disadvantages of using bottom-up modeling
| Advantages | Disadvantages |
|---|---|
|
|
A combination of the previous two is the "meet in the middle" approach, which uses the best of both worlds. Development starts from both directions, creating the tooling from a conceptual metamodel and at the same time, creating a metamodel based on the artifacts. The link between the two resulting metamodels is established by a set of transformations. Table 3 shows the advantages and disadvantages of this combination approach.
Table 3. Advantages and disadvantages of using a "meet in the middle" approach
| Advantages | Disadvantages |
|---|---|
|
|
In reality, the top-down approach rarely produces the metamodel to support the generation of development artifacts. This approach is excellent to capture concepts and their relationships. Similarly, bottom-up approaches rarely produce the metamodel that is effective in communicating or describing a specific domain. Though it is possible to follow a purely top-down or bottom-up approach, it is challenging.
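As a rough illustration of the bottom-up direction, the following plain Python sketch derives a candidate metamodel from a handful of sample instances. The artifact dictionaries, the infer_metamodel function, and the required/optional split are assumptions made for this example; a real bottom-up effort would start from actual artifacts such as code or descriptors.

```python
from collections import defaultdict


def infer_metamodel(instances):
    """Derive candidate classes and attributes from sample instances.

    An attribute is marked required only if every sample of that type has it.
    """
    seen = defaultdict(list)
    for inst in instances:
        attrs = {key for key in inst if key != "type"}
        seen[inst["type"]].append(attrs)

    metamodel = {}
    for type_name, attr_sets in seen.items():
        all_attrs = set().union(*attr_sets)
        required = set.intersection(*attr_sets)
        metamodel[type_name] = {
            "required": sorted(required),
            "optional": sorted(all_attrs - required),
        }
    return metamodel


# Hypothetical samples extracted from existing artifacts
artifacts = [
    {"type": "Servlet", "name": "Login", "urlPattern": "/login"},
    {"type": "Servlet", "name": "Logout", "urlPattern": "/logout", "loadOnStartup": 1},
    {"type": "SessionBean", "name": "Cart"},
]

candidate = infer_metamodel(artifacts)
```

The result mirrors the structure of the artifacts closely, which is why a metamodel obtained this way tends to be good for generation but less effective for communicating the domain.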
Constraints and validation are often discussed together in relation to metamodels and DSLs. Constraints ensure that the model instances are semantically correct. Validation is the process of checking the constraints on model instances. Validation can also provide feedback to the modeler, helping to correct the model.
In DSL terms, constraints ensure that the sentences constructed, using the vocabulary of the language, are meaningful. Constraints for a metamodel can be defined in different forms, including using:
- A constraint language, such as OCL
- A programming language, such as Java, at specific extension points
- Regular expressions (regexp)
Constraints not only restrict certain constructs in a model, some of them can also calculate and populate the value for elements. Some of the simpler constraints can also be captured in the metamodel without the need for extensions to support complex constraints. These constraints can be mandatory attributes, cardinality, or predefined (default) values.
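The relationship between constraints and validation can be sketched in a few lines of plain Python. This is not OCL and not the EMF validation framework; the Constraint class, the validate function, the element layout, and the specific rules are hypothetical. The sketch covers a mandatory attribute, a regular expression, a cardinality limit, and a default value.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Predefined (default) values, applied before the checks run
DEFAULTS = {"priority": "normal"}


@dataclass
class Constraint:
    description: str
    holds: Callable[[dict], bool]  # returns True when the element is valid


CONSTRAINTS = [
    # Mandatory attribute
    Constraint("name is mandatory", lambda e: bool(e.get("name"))),
    # Regular expression on an attribute value
    Constraint(
        "name is a valid identifier",
        lambda e: re.fullmatch(r"[A-Za-z_]\w*", e.get("name") or "") is not None,
    ),
    # Cardinality of a reference
    Constraint("at most 3 subtopics", lambda e: len(e.get("subtopics", [])) <= 3),
]


def validate(element):
    """Apply defaults, then return the descriptions of violated constraints."""
    for key, value in DEFAULTS.items():
        element.setdefault(key, value)
    return [c.description for c in CONSTRAINTS if not c.holds(element)]


ok = {"name": "Scope", "subtopics": ["a", "b"]}
broken = {"name": "2 many", "subtopics": ["a", "b", "c", "d"]}

ok_problems = validate(ok)
broken_problems = validate(broken)
```

The list returned by validate plays the role of the feedback to the modeler: an empty list means the element is semantically correct with respect to these rules.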
You use transformation for mapping the data and the structure, representing the source instance, into a new structure and data representing the target instance. Both source and target in a transformation can be a model or any other artifact. In DSL terms, transformation is a mapping operation between different domains or between a domain and other artifacts. Transformation is often required because the instances (model or artifact) are in a different domain, in a different format, or in a different version. Practically, transformations can be used for various purposes, such as:
- To generate code from a model
- To map one or more models into another (one-to-one or many-to-one)
- To generate a model from code artifacts
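As a minimal sketch of the first purpose, generating code from a model, the following plain Python function renders one model element as Java source text. The model layout and the generate_java_class function are assumptions for illustration; a template engine would normally do this work in a real project.

```python
def generate_java_class(entity):
    """Render one model element as Java source text (model to artifact)."""
    lines = [f"public class {entity['name']} {{"]
    for attr_name, attr_type in entity["attributes"].items():
        lines.append(f"    private {attr_type} {attr_name};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical model element: a name plus typed attributes
topic_model = {"name": "Topic", "attributes": {"name": "String", "priority": "int"}}

java_source = generate_java_class(topic_model)
```

The source of the transformation is a model and the target is a text artifact; the mapping rules live entirely in the function.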
Transformations can also operate between model instances that share the same metamodel. A few examples include:
- Enriching or decorating the source model
- Initializing the source model
- Changing the data in the source model (including version updates)
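Transformations where source and target share a metamodel can be sketched as functions that take a model and return a modified copy. The following plain Python example is an assumption-laden illustration: the model layout and the initialize and enrich functions are hypothetical.

```python
import copy


def initialize(model):
    """Return a copy of the model with missing values filled in."""
    result = copy.deepcopy(model)
    result.setdefault("version", 1)
    for topic in result["topics"]:
        topic.setdefault("priority", "normal")
    return result


def enrich(model):
    """Return a copy of the model decorated with a derived label per topic."""
    result = copy.deepcopy(model)
    for index, topic in enumerate(result["topics"], start=1):
        topic["label"] = f"{index}. {topic['name']}"
    return result


# Hypothetical source model
mindmap = {"topics": [{"name": "Scope"}, {"name": "Tooling", "priority": "high"}]}

initialized = initialize(mindmap)
enriched = enrich(initialized)
```

Working on a deep copy keeps the source model untouched, so the same source can feed several transformations.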
The development process for GMF applications itself is a good example of a set of transformations. The GMF dashboard (Eclipse view) even gives a visual representation of the transformation steps, the artifacts, and their relationships.
Developing a tool for a DSL/DSM is a highly iterative process, described below.
1. Develop the metamodel. The metamodel can be captured in various formats, such as XML schema (XSD), annotated Java code, Ecore model, or Rational Rose® UML model.
2. Generate the basic tooling, which results in a simple tree-like editor. This editor is already a usable tool, without fancy features, for basic editing needs.
3. Test the model using the basic tooling. Repeat from step 1 until the expected results are achieved.
4. Create and specify the tooling models by:
   - Creating the tool palette (.gmftool).
   - Creating the graphical elements together with their visual representations (.gmfgraph).
   - Creating the mapping between the domain model, the graphical and visual elements, and the tool palette (.gmfmap).
5. Generate the tooling from the models specified in the previous steps.
6. Test the graphical modeling tool. Repeat from step 4 until the expected results are achieved.
At this point a graphical modeling tool is available based on some metamodel. If the modeling tool produces the output, which can be simply a persisted model instance, then there is nothing more to do. If the model is expected to be the basis for artifacts generated, most likely a set of transformations has to be developed.
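The role of the mapping model can be pictured as a table that ties each domain element to a graphical element and a palette tool. The following plain Python sketch only illustrates that idea. Real .gmftool, .gmfgraph, and .gmfmap files are models edited in Eclipse, and every name below (the sets, the dictionary keys, check_mappings) is hypothetical.

```python
# Hypothetical stand-ins for the definition models
domain_classes = {"Topic", "Relationship"}         # role of the domain metamodel
palette_tools = {"TopicTool", "RelationshipTool"}  # role of .gmftool
figures = {"RoundedRectangle", "DashedLine"}       # role of .gmfgraph

mappings = [                                       # role of .gmfmap
    {"domain": "Topic", "figure": "RoundedRectangle", "tool": "TopicTool"},
    {"domain": "Relationship", "figure": "DashedLine", "tool": "RelationshipTool"},
]


def check_mappings(mappings, domain_classes, figures, palette_tools):
    """Report mapping entries that point at undefined items, and unmapped classes."""
    problems = []
    for entry in mappings:
        if entry["domain"] not in domain_classes:
            problems.append(f"unknown domain element: {entry['domain']}")
        if entry["figure"] not in figures:
            problems.append(f"unknown figure: {entry['figure']}")
        if entry["tool"] not in palette_tools:
            problems.append(f"unknown tool: {entry['tool']}")
    mapped = {entry["domain"] for entry in mappings}
    for name in sorted(domain_classes - mapped):
        problems.append(f"no mapping for domain element: {name}")
    return problems


mapping_problems = check_mappings(mappings, domain_classes, figures, palette_tools)
```

A check like this reflects why the process is iterative: a change to the metamodel usually means revisiting the palette, the figures, and the mapping together.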
The following figures are from the Mindmap example shipped with the GMF Eclipse project. They show a typical set of models and project artifacts for developing GMF tools. Figure 3 shows the .ecore model for the Mindmap tool.
Figure 3. .ecore model for the Mindmap tool

Figure 4 shows the tool palette definition.
Figure 4. Tool palette definition

Figure 5 shows the graphical element and shape definitions.
Figure 5. Graphical element and shape definitions

Figure 6 shows the mapping definition between the .ecore model, the tool palette, and the graphical nodes and shapes.
Figure 6. Mapping definition between the .ecore model, the tool palette, and the graphical nodes and shapes

Figure 7 shows project details and artifacts.
Figure 7. Project details and artifacts

Figure 8. GMF Dashboard with all the models

Development and run-time environment
Eclipse is the development environment for DSLs and DSMs using EMF and GMF. IBM's Rational Software Architect includes all the Eclipse projects necessary for developing tooling for DSLs and DSMs. The relevant projects include EMF, GEF, GMF, UML2, and EMFT. Eclipse Modeling Project has details for these projects. The results of the tool development described earlier are Eclipse plug-ins. In essence, the run-time environment for the tools is the Eclipse workbench.
The plug-ins, developed primarily for modeling, can be used in two forms:
- Installed on top of the existing Eclipse workbench, making them an integral part of the development environment.
- Installed as a standalone plug-in, still using the Eclipse workbench, but running separately from the rest of the development tools on top of the Rich Client Platform.
Domain-specific languages and domain-specific modeling are powerful instruments in the toolbox of many different roles in IT, and even beyond. Through the use of DSL and DSM, IT architects can achieve a clear comprehension of the domain they work with and can communicate the details of the domain more effectively. Keep in mind, though, that DSLs and DSM are not the answer to every problem; their applicability depends on how well the language and the tool are built.
Resources
Learn
- The IBM Systems Journal article "Rational Software Architect: A tool for domain-specific modeling" provides both an overview and details.
- "Improving Developer Productivity with Lightweight Domain Specific Modeling" (April 2006) explains how to improve developer productivity.
- EclipseCon has all the materials for the premier technical and user conference focusing on the power of the Eclipse platform.
- The Eclipse Modeling Framework (EMF) overview presents a basic overview of EMF and its code generator patterns.
- From the Eclipse Eclipsepedia:
- "Discover the Eclipse Modeling Framework (EMF) and Its Dynamic Capabilities" explains just what the EMF is and takes a look at the basic architecture.
From Eclipse.org:
- The Eclipse Modeling Framework (EMF) Validation Framework Overview discusses constraints.
- "Implementing Model Integrity in EMF with EMFT OCL" (Feb 2007) illustrates how the MDT OCL parser/interpreter technology adds to the value of EMF/JET code generation as a foundation for model-driven development (MDD).
- "From Front End To Code - MDSD in Practice" gives you advice about how to tackle several MDSD challenges, based on a collection of open source tools: Eclipse, Eclipse Modeling Framework (EMF), Graphical Modeling Framework (GMF) as well as openArchitectureWare.
- "Building a Database Schema Diagram Editor with GEF" (Sep 2004) uses a relational database schema diagram editor with a deliberately simplified underlying model, but with enough bells and whistles, to show some of the interesting features of GEF at work.
- "Using GEF with EMF" (Jun 2005) builds upon the shapes example provided by GEF using the Eclipse Modeling Framework (EMF) and provides an introduction using EMF-based models in GEF-based editors.
Companion articles from IBM developerWorks:
- Implement model-driven development to increase the business value of your IT system
- Combine patterns and modeling to implement architecture-driven development
- Explore model-driven development (MDD) and related approaches: A closer look at model-driven development and other industry initiatives
Other related developerWorks articles:
- "Learn Eclipse GMF in 15 minutes" is an excellent introduction to GMF and related Eclipse projects (developerWorks, Sep 2006).
- "Create more -- better -- code in Eclipse with JET" provides an introduction to the code-generation framework, JET, an Eclipse technology project (developerWorks, Aug 2006).
- HP Dev Resource Central's article, "Graphical editor for XML documents" describes an approach to creating a graphical editor for XML documents using a schema.
- "The Pragmatic Code Generator Programmer" (Sep 2006) reimplements an exercise taken from the book The Pragmatic Programmer, by Andy Hunt and Dave Thomas.
- MDA Distilled: Principles of Model-Driven Architecture, by Stephen J. Mellor, Kendall Scott, Axel Uhl, Dirk Weise (Addison Wesley Professional, March 2004) ISBN: 0-201-78891-8
- From IBM Redbooks®: Eclipse Development using the Graphical Editing Framework and the Eclipse Modeling Framework, SG24-6302, examines two frameworks that are developed by the Eclipse Tools Project for use with the Eclipse Platform: the Graphical Editing Framework (GEF), and the Eclipse Modeling Framework (EMF).
- IBM on demand demos to learn about various software products and technologies from IBM.
- Stay current with developerWorks technical events and webcasts.
- Browse the technology bookstore for books on these and other technical topics.
- developerWorks Live! Technical events and briefings
Get products and technologies
- Visit Simultaneous release projects from Eclipse and download free Eclipse project bundles and IBM Rational tools from one convenient location.
- Trial download: Rational Software Architect.
- Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational, Tivoli®, and WebSphere®.
Discuss
- Participate in the discussion forum.
- DSM Forum: The link to publications might be especially useful.
- Check out developerWorks blogs and get involved in the developerWorks community.

Peter Kovari is part of the IBM Software Group Services organization in Hursley, UK. His responsibilities span from specialist areas to various IT architectures. He is often traveling to customer locations to help clients with the adoption of IBM's software portfolio. Formerly, Peter worked for ITSO in Raleigh, NC, as an IT specialist, project leader, and technical author writing IBM Redbooks® with other IBM professionals from all over the world.