A practical approach to code modification and duplication
The difference between configuration-driven development and model-driven development is that the former is not restricted to the model of the code, such as classes, fields, and relationships. Configuration-driven development (CDD) encompasses anything that can be configured within your application. For example, if your architecture dictates that particular business rules must be applied consistently across your application, you can use configuration files to configure and apply those rules.
In this article I'll introduce you to configuration-driven development and explain how it can resolve code duplication and modification problems.
Code duplication and modification
Imagine that you are working on an application that consists of the following components:
- A database
- A middleware server with a Web services application program interface (API)
- A middleware server with a Web-based user interface
- A thick client using the middleware API
Figure 1. A simple parameter
As you can see in Figure 1, a simple parameter, such as the length of a string, will impact all four components. It will also impact the following user documentation and unit test areas:
User documentation:
- The thick client
- The Web-based user interface
- The Web services API
Unit test areas:
- The database
- The Web services API
- The Web-based user interface
- The thick client
Figure 2 illustrates the total impact of the simple parameter shown in Figure 1.
Figure 2. A plethora of dependencies
Suddenly, something as simple as a string length has been duplicated in not just four, but ten different locations. The string length parameter is only an example; many types of business rules can similarly affect a typical application. Some rules are common to almost any application, such as string length and numerical minimum and maximum values. Other rules are specialized to fit the needs of the particular application. Does the application use a check-out, check-in mechanism to prevent concurrent modification? Are there rules about which information is pulled from the client and pushed from the server? All these factors come into play, and could easily make mountains of the simplest code modifications.
Information duplication is not a new problem and many tools and techniques already exist to prevent it. In this section I'll consider some of the most common cures for information duplication.
- Development process
- Some development teams resolve information duplication by making redundant information modification part of a strict development process. This solution can be tedious, since it requires supervision and reviews, but it is effective.
- Well-designed code
- Well-designed code coupled with re-use of constants can reduce code duplication problems. Code-driven solutions work best when all the parts in your application are written in the same language.
- Model-driven tools
- The concept of model-driven development is to read the application model as a configuration. The biggest advantage of model-driven tools is that they automate tedious tasks related to the objects and their relationships. Here's a rundown of popular tools for model-driven development:
- The Eclipse Modeling Framework (EMF): Stores the layout of your classes and fields. It generates Java classes, network interfaces, and even database mappings for your application. By generating objects, EMF automates the tedious part of writing methods such as getters, setters, equals, copy, clone, serialization, and others. EMF uses configuration files that can store many object definitions. In multi-user environments, merging those files can cause a few problems. EMF is restricted to the objects of the application, their relationships, and their methods. It doesn't provide any help for handling custom business rules.
- UML2: Describes and illustrates the application through diagrams of class relationships, fields, and logic. The advantage of UML is that it is language neutral. Once the logical design is done, it is theoretically possible to generate objects in many languages.
- Ruby on Rails: Extracts the configuration of your model from the database schema. It then generates scaffolding to handle business logic, a Web-based user interface, and unit tests. The major productivity gain lies in the ease with which database schema changes are propagated in the code.
How CDD works
I've already described the basics of configuration-driven development. To better understand how it works, consider an example drawn from the real world. In this section I'll describe the configuration-driven development solution my team adopted for developing parts of the Rational Portfolio Manager.
Store the configuration in XML files
In configuration-driven development, developers make their modifications primarily in XML files. All other files related to the application read their configuration from those files, either at runtime or by having selected parts generated at build time. In the case of the Rational Portfolio Manager, we stored the following components and information in configuration files:
- Their relationship
- Their documentation
- Their validation rules
- Their behavior in the check-in check-out mechanism
- Their restrictions in the application security framework
- Their database mapping
- Their place in the visual layouts
- Their unique identifiers
- Their documentation
- The messages they generate at runtime
- The parameters they use
Web services interface definition
- The methods that are exposed
- The documentation
- The parameters they use with their validation rules
- Their restrictions in the application security framework
Figure 3 shows a typical configuration-driven build process.
Figure 3. A sample configuration-driven build process
The following tools are essential to configuration-driven development:
- Tools for editing
- In most cases a simple text editor is sufficient to edit and evolve the XML files. A good XML editor will validate the syntax as the file is being modified and simplify the editing of XML tags.
- Tools for reading
- To get some leverage from the XML files, you need a tool that can pump the XML files straight into your application. For example, you might use the Java library Betwixt to populate your Java objects with the content of your XML configuration.
- Tools to generate artifacts
- Once you read your configuration files, the next logical step is to seed the other parts of your product with that information. You have a few options at this point. The first is to embed the configuration files in the software itself so that they are read at runtime. Another way is to generate code and documentation using a template tool such as the Apache Velocity engine.
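The "tools for reading" step can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the JDK's built-in DOM parser rather than Betwixt, and the `FieldDef` and `ConfigReader` classes are hypothetical names invented for this sketch. It reads every `field` element and keeps the `name`, `type`, and `max` attributes, so that a string length defined once in XML is available to the rest of the application at runtime.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical holder for one field element; not part of any library. */
final class FieldDef {
    final String name;
    final String type;
    final Integer max; // null when the max attribute is absent

    FieldDef(String name, String type, Integer max) {
        this.name = name;
        this.type = type;
        this.max = max;
    }
}

/** Hypothetical reader that turns the XML configuration into FieldDef objects. */
final class ConfigReader {
    /** Parses the XML text and returns one FieldDef per field element. */
    static List<FieldDef> readFields(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("field");
            List<FieldDef> fields = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                String max = e.getAttribute("max"); // empty string when absent
                fields.add(new FieldDef(
                        e.getAttribute("name"),
                        e.getAttribute("type"),
                        max.isEmpty() ? null : Integer.valueOf(max)));
            }
            return fields;
        } catch (Exception e) {
            // Parsing problems are wrapped so callers need no checked exceptions.
            throw new IllegalStateException("Cannot read configuration", e);
        }
    }
}
```

A user interface, a Web service, and a unit test could all call `ConfigReader.readFields` and consult the same `max` value instead of each hard-coding its own copy.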
Rules to live by
Most application developers are familiar with the following rules, although you apply them differently in configuration-driven development than you do in Java development or extreme programming.
1. Keep it simple
The configuration files must be easy to understand and to evolve. While this sounds obvious, experienced XML users commonly use advanced features that don't mesh well with the CDD approach, such as syntax that renders the XML difficult to read and understand.
2. Evolve as required
No predefined XML layout is going to comply with every developer's needs. The solution to this problem is to adapt your XML layout to fit your needs. Depending on the domain or the software architecture, the XML attributes used in classes or field definitions can vary a lot. Keep in mind that this evolution will prevent users from breaking the "keep it simple" rule. In the Rational Portfolio Manager example above, it's easy to see that many of the configuration parameters would be useless in the context of almost any other product. No tool on the market could have simplified their implementation.
3. Validate early and often
Common sense dictates that errors caught earlier in the development process are the cheapest to resolve. Following this principle, it makes sense to validate your configuration as early and as extensively as possible. For example, you might use an XSD or DTD file to validate the XML structure of configuration files. If you need to apply custom validation rules, don't hesitate to implement your own validation tools. While the time spent to write those tools is not time spent working on the final product, it is a good investment.
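As one possible illustration of XSD validation, the sketch below uses the standard `javax.xml.validation` API from the JDK. The schema embedded in the code is a deliberately tiny, hypothetical one written for this example (an `object` element holding `field` elements that must carry `name` and `type`); a real project would keep its own, richer schema in a separate file. The `ConfigValidator` class name is likewise an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical build-time check for configuration files. */
final class ConfigValidator {
    // Hypothetical schema: every field needs a name and a type, and max,
    // when present, must be a non-negative integer.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='object'><xs:complexType><xs:sequence>"
        + "<xs:element name='field' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:attribute name='name' type='xs:string' use='required'/>"
        + "<xs:attribute name='type' type='xs:string' use='required'/>"
        + "<xs:attribute name='max' type='xs:nonNegativeInteger'/>"
        + "</xs:complexType></xs:element>"
        + "</xs:sequence>"
        + "<xs:attribute name='class' type='xs:string' use='required'/>"
        + "</xs:complexType></xs:element></xs:schema>";

    /** Returns null when the XML is valid, otherwise the first error message. */
    static String firstError(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            // With no custom error handler, the first error is thrown as a SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read configuration", e);
        }
    }
}
```

Running a check like this as the first step of the build means that a missing attribute or a non-numeric `max` fails immediately, before any code or documentation is generated from the file.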
Benefits, costs, and limitations
Before you adopt a new approach, it is a good idea to have in mind an overview of its benefits, as well as what it will cost you and what you shouldn't expect from it. This section is a synopsis of all three.
- Duplication reduction
- The first benefit of this technique is evidently the reduction of information duplication, which enhances maintainability and the overall quality of the product.
- No vendor lock-in
- By solely using basic tools to edit XML, you do not tie yourself to any vendor-specific tools. You can find plenty of open source tools available to read and edit XML files.
- Source control
- Some solutions on the market store their outputs in proprietary XML formats that are almost impossible to merge. References between those XML files also cause problems. It is inefficient to have only one team member with the rights to modify a configuration file at any given time. Human-edited XML files have the advantage of working hand-in-hand with source control tools such as CVS, Subversion, or ClearCase in multi-user environments.
- The right tool for the right job
- Most tools on the market address some very common needs, but every project has specific needs. This makes it difficult, or even impossible, to find tools to match all those needs. Custom configuration files have the advantage of containing only the information which is relevant to your project.
- Technology independence
- Some tools out there use configuration files to isolate your application from the underlying technology. For example, Hibernate stores the relationship between the database and objects in configuration files, which in turn isolates the user from the vendor-specific database implementation. While this independence is rarely perfect, the technology abstraction is usually viewed as positive as it can help evolve the application in the future.
- Putting the structure in place
- The initial cost to put the structure in place is not negligible. Even with the right tools, problems are bound to happen. Configuration-driven development is best suited to medium and large projects.
- Build-process complexity
- When you generate parts of your application from configuration files, the build process can become complicated. With proper build automation, this price remains relatively low.
Limitations and trade-offs
- Business rules complexity
- Basic concepts are easily mapped to configuration files, but complex business rules are another matter entirely. If complex rules recur often in your application with small variations, then it is possible to create configuration files that store only those variations. For efficiency's sake, it is best to leave the complex business rules to the code itself and keep repetitive concepts in configuration files.
- Infrastructure cost
- For small projects, the cost of putting the infrastructure in place might be larger than the cost of the project itself. Besides, small projects don't usually suffer from information redundancy.
Sample XML code
Listing 1 shows a sample XML file that represents a resource (or user) structure. Here is a list of a few attributes used in the sample XML code:
- Class: The name of the Java class
- Extends: The name of the parent Java class
- Abstract: Indicates whether the class is abstract or not in Java terms
- TestReady: Indicates whether the code generator should generate a unit test for this class
- Field name: The Java variable name for that field
- Field type: The Java type used for the field
- Field label: Label to use on the user interface for a specific field
- Field min and max: Minimum and maximum length of strings or values for numbers
- Field default: The default value to apply to a field on object creation
- Field composite: Indicates whether the field is a reference or a composite relationship
- Field valid types: Indicates the valid types that can be held in an array of abstract objects
- Field mandatory: Indicates whether the field is mandatory when the object is created
- Field readable: Indicates whether users can read the field
- Inherited field overrides
Listing 1. Sample code that represents a resource (or user) structure
<?xml version="1.0"?>
<object class="sample.Resource" abstract="false" extends="sample.BaseObject"
        testready="false"
        documentation="A resource represents a user of the system.">
  <rule type="create"
        documentation="To create a resource, the user must have the administrator security rights" />
  <rule type="update"
        documentation="A resource has the right to modify itself. Only administrators have the right to modify resources otherwise." />
  <rule type="delete"
        documentation="Administrators have the right to delete a resource." />
  <field name="active" label="Active" type="java.lang.Boolean" default="true" />
  <field name="calendar" label="Calendar" type="sample.WorkCalendar" composite="false" />
  <field name="contactGroupAssignments" type="sample.ContactGroupAssignment" composite="true">
    <valid-type class="sample.ContactGroupAssignment" />
  </field>
  <field name="name" label="Full Name" type="java.lang.String" mandatory="true" max="35" />
  <field name="password" label="Password" type="java.lang.String" min="3" max="16" readable="false" />
  <!-- BaseObject -->
  <inherited name="parent" mandatory="true">
    <valid-type class="sample.ResourceFolder" />
    <invalid-type class="sample.Project" />
  </inherited>
</object>
Using this configuration file, it is possible to generate:
- A database layout
- A Web services interface
- Java model classes
- User documentation
- A simple user interface that uses the labels and embeds documentation for tooltips and help files
- Unit test frameworks for each attribute and rule in the configuration
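To give a feel for the generation step, here is a minimal sketch that renders a Java model class from field definitions. It is an assumption-laden illustration rather than the generator my team used: plain string building stands in for a template engine such as Velocity, the `ModelGenerator` class is a hypothetical name, and the input is simply a map from field name to Java type, as it might be extracted from the `field` elements of a configuration file.

```java
import java.util.Map;

/** Hypothetical generator; string building stands in for a template engine. */
final class ModelGenerator {
    /**
     * Renders a minimal Java class with one private field and one getter
     * per entry. The map goes from field name to fully qualified Java type.
     */
    static String generate(String className, Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(className).append(" {\n");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            String name = f.getKey();
            String type = f.getValue();
            // Getter name follows the usual JavaBeans convention: name -> getName.
            String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            out.append("    private ").append(type).append(' ').append(name).append(";\n");
            out.append("    public ").append(type).append(' ').append(getter)
               .append("() { return ").append(name).append("; }\n");
        }
        out.append("}\n");
        return out.toString();
    }
}
```

The same loop over the field definitions could just as well emit a database column, a form input, or a paragraph of user documentation; only the text being appended changes.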
Ideally you can build entire products using configuration-driven development. The development process consists of two phases:
- Build the abstract tools you require.
- Build your application using configuration files that indicate how the tools link together to form the finished product.
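As a rough illustration of these two phases, the sketch below treats a small generic validator as the "abstract tool" and the `mandatory`, `min`, and `max` values from a configuration file as the settings that drive it. The `FieldRule` class and its messages are hypothetical, and it assumes that `min` and `max` express string lengths, as they do for the `name` and `password` fields in Listing 1.

```java
/** Hypothetical generic tool: one instance per configured string field. */
final class FieldRule {
    final String label;      // label attribute, used in messages
    final boolean mandatory; // mandatory attribute
    final int min;           // minimum length, 0 when not configured
    final int max;           // maximum length

    FieldRule(String label, boolean mandatory, int min, int max) {
        this.label = label;
        this.mandatory = mandatory;
        this.min = min;
        this.max = max;
    }

    /** Returns an error message, or null when the value satisfies the rule. */
    String check(String value) {
        if (value == null || value.isEmpty()) {
            // An empty value is only a problem for mandatory fields.
            return mandatory ? label + " is required" : null;
        }
        if (value.length() < min) {
            return label + " must be at least " + min + " characters";
        }
        if (value.length() > max) {
            return label + " must be at most " + max + " characters";
        }
        return null;
    }
}
```

With the values from Listing 1, `new FieldRule("Password", false, 3, 16).check("ab")` returns "Password must be at least 3 characters". The tool is written once; changing the limit is a one-line edit to the configuration.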
While configuration-driven development is not a radically new idea, getting it to work efficiently in a typically constrained modern work environment can be challenging. In this article I've proposed a simple and efficient way to achieve a functional and successful configuration-driven development process.