Working XML: Take advantage of lessons learned by refactoring XM

Also, create an incremental builder in Eclipse

Benoît Marchal (bmarchal@pineapplesoft.com), Consultant, Pineapplesoft
Benoît Marchal is a Belgian consultant. He is the author of XML by Example, Second Edition and other XML books. You can contact him at bmarchal@pineapplesoft.com or through his personal site at www.marchal.com.

Summary:  In this article, Benoît continues to work on a new version of XM, the simple content management solution that's based on XML and integrated with Eclipse. Benoît discusses issues faced while refactoring code and shows you how to create an incremental builder in Eclipse.

Date:  30 Nov 2004
Level:  Advanced

I can think of few tasks in software development that are less glamorous than refactoring code. Yet it is a crucial activity for the ongoing maintenance of a project.

One of the original premises behind this column was to develop projects that spanned several years to face the challenges typical in real-world projects. Refactoring is one of those challenges.

While preparing this column, I also spent time working on the Eclipse build process to make sure it integrates smoothly with the user interface and, more specifically, with the error reporting.

Refactoring

I hate to throw away working code. With this in mind, I try to refactor the applications I develop (or mentor or project manage) regularly.

Build on the experience

In the early stages of most software projects, the goal is usually to get something running. The "something" may be ugly, but it tends to reassure management, the team, and the users if you have something worth demonstrating. The next step is to stabilize the software by rounding out the feature set and fixing bugs.

Yet the most important (and time-consuming) phase is maintenance. It is throughout the maintenance phase that the software evolves the most. As the original design is stretched in unexpected ways, it is not uncommon for developers to grow frustrated with their original code and initial design. Indeed, I often hear from developers who claim that what they wrote just last year or the year before is worthless and should be thrown away.

However, I don't like to throw away working code. This code embodies a lot of valuable experience on special situations, edge cases, and unorthodox needs that oftentimes have not been recorded anywhere else. In most cases, it is more painful to have to learn those lessons again rather than to work from the existing base.

At the same time, I can understand the complaints against the existing code. Often, the application has evolved to the point where developers spend more time fighting the original design than benefiting from it.

Faced with this situation, it can seem as if fresh code is the only solution. However, I prefer to refactor existing code instead. By refactoring, I mean carefully crafting a new design on those areas of the code base that will benefit most from it. Tool makers increasingly incorporate refactoring support into their products (and, indeed, Eclipse has a refactoring menu full of goodies), but it is still a process that requires manual steps. Here's my approach.

Refactoring XM

As I discussed in my previous column, XM has already had a long life. The project opened the Working XML column in July 2001. It has evolved to support Eclipse and many specific needs in different projects.

As a result, there are no fewer than four code bases that could be considered XM: the original XM code (posted on ananas.org) and three sets of code that offer similar, but slightly different, functionality built on a different design. The goal of this ongoing exercise is to unify the different versions.

In my previous article, I also discussed how I plan to evolve the existing design to abstract XM from the file system.

In practice, refactoring typically involves:

  • Breaking up large units (packages, classes, or interfaces) into smaller ones
  • Abstracting more concepts behind new interfaces
  • Renaming concepts to better express their use

In addition, the process occasionally includes combining concepts that experience shows are pretty similar or, conversely, separating concepts that were lumped together. For example, I realized that Messenger was mixing up two different concepts -- reporting information to the user and localization -- so I designed two new interfaces (MessageListener and ResourceHolder) from which the new Messenger inherits.

What I refrain from doing is writing new code. I prefer to reuse as much code as possible to minimize the number of bugs that get introduced in the process. When I do need to write new code (to support new concepts, such as the new abstractions), I isolate it into its own classes.
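
A minimal Java sketch of that split might look as follows. Only the three type names (Messenger, MessageListener, and ResourceHolder) come from XM; the method names, the ConsoleMessenger class, and the demo are hypothetical and exist only to illustrate the idea.

```java
import java.util.Collections;
import java.util.Map;

// Concept 1: reporting information to the user (method name is hypothetical).
interface MessageListener {
    void report(String message);
}

// Concept 2: localization, that is, looking up translated text (method name is hypothetical).
interface ResourceHolder {
    String getString(String key);
}

// The new Messenger adds nothing of its own; it inherits from both narrower interfaces.
interface Messenger extends MessageListener, ResourceHolder {
}

// Hypothetical implementation that prints messages and keeps a log of them.
class ConsoleMessenger implements Messenger {
    private final Map<String, String> texts;
    private final StringBuilder log = new StringBuilder();

    ConsoleMessenger(Map<String, String> texts) {
        this.texts = texts;
    }

    @Override
    public void report(String message) {
        log.append(message).append('\n');
        System.out.println(message);
    }

    @Override
    public String getString(String key) {
        // Fall back to the key itself when no translation is known.
        return texts.getOrDefault(key, key);
    }

    String log() {
        return log.toString();
    }
}

class MessengerDemo {
    // Code that only reports errors depends on MessageListener alone.
    static void warn(MessageListener listener, String text) {
        listener.report("warning: " + text);
    }

    public static void main(String[] args) {
        ConsoleMessenger messenger =
            new ConsoleMessenger(Collections.singletonMap("greeting", "Bonjour"));
        warn(messenger, messenger.getString("greeting"));
    }
}
```

The benefit shows in warn(): code that only needs to report to the user no longer has to know about localization, yet existing callers that expect a full Messenger keep working.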

I find it more practical to create a brand new project in Eclipse and move existing classes that I have identified as stable during the analysis into the project. I use Eclipse refactoring features (such as the ability to propagate a package change) to rename some of those classes. (See Figure 1.)


Figure 1. Refactoring help in Eclipse

Then I turn my attention to classes that need more change, such as Messenger. I copy the existing file in the new project and then cut and paste code until it is reorganized as desired.

Frequently, when copying code, I unearth unexpected dependencies. The compiler is my best ally since it signals an error until I have resolved those dependencies. Often, that means more breaking up of units, more abstraction, and more renaming.

Practical considerations

It takes time to find the gems in existing code and reorganize around them. Plan accordingly (I didn't do a good job with XM -- apologies to John, the developerWorks editor), and understand that it takes even more time to write fully debugged, stable code from scratch.

Refactoring is a means to an end. In the middle of a tedious session, it helps to remember that the goal is a design evolution that will let you add new features to the application. I find it also helps to keep iTunes loaded with plenty of good music.

Note that refactoring scales from a class to an entire project. In this particular case, the changes are so widespread and the project is small enough that I choose to start with a brand new project. On larger projects, refactor one package at a time. However, the procedure is the same: Move the existing code outside of the project and re-introduce it gradually.

When do you need to refactor? Ideally, as soon as you find that more time is spent working around the existing design rather than with it.


The Eclipse build process

I have already discussed the Eclipse build process in past columns (see "Building a project with Eclipse and XM"), but at the time I ignored Eclipse resource management. This column revisits the build process and implements a functional incremental builder.

Project description

Building a project starts with the build specification in the project description. The project description file (.project) is located in the project folder. (Because its name starts with a dot, the file is hidden by default under UNIX.) Listing 1 shows the relevant excerpt.


Listing 1. Project description excerpt
<buildSpec>
   <buildCommand>
      <name>org.ananas.xm.eclipse.builder</name>
      <arguments></arguments>
   </buildCommand>
</buildSpec>

The <buildSpec> tag contains a list of builders (and optional parameters). The user can edit the project description through the Properties dialog box located under File|Properties.

Most plug-ins also offer a more convenient interface for editing the buildSpec. For example, you might recall the original XM plug-in used a wizard and a custom configuration panel to configure the buildSpec (see my earlier column "Layout, properties, and preferences in Eclipse").

Builder declaration

The name in the buildSpec points to an extension declared in the plugin.xml file. The extension plugs into the org.eclipse.core.resources.builders extension point and associates the builder with a class (org.ananas.xm.eclipse.XMBuilder in the example -- see Listing 2).


Listing 2. Plug-in declaration excerpt
<extension id="builder"
           name="XM Builder"
           point="org.eclipse.core.resources.builders">
   <builder>
      <run class="org.ananas.xm.eclipse.XMBuilder"/>
   </builder>
</extension>

The builder class inherits from org.eclipse.core.resources.IncrementalProjectBuilder and implements the build() method.

Extension points and classes

A word of warning: It is easy to confuse extension points and classes in Eclipse. Although they look similar, they are different concepts.

Extension points appear in a plug-in description (in the file plugin.xml) and specify how the plug-in extends the Eclipse platform. Every extension point has a unique identifier.

Confusion arises because extension point identifiers are built on domain names in reverse order, like a package name. Examples of extension points include org.eclipse.core.resources.builders and org.eclipse.ui.propertyPages.

Consequently, you may think that an extension point is a Java class or a Java interface. It is not. The extension point is a completely abstract concept. (It helps to think of extension points as ports to which plug-ins can connect.)

The plug-in descriptor contains enough information that the platform does not need to load Java classes until the last minute. For example, the platform can add menu entries just by looking at the descriptor. It is only when the user selects the entry that it needs to load a class. This helps to improve performance, since the platform only loads a small amount of code at any one time. Extension points fulfill that need.

Still, there is a relationship between extension points and Java classes: Most extension points are implemented through specific Java classes in the plug-in. The classes must also implement an appropriate interface for the extension point.

Builder implementation

The builder implementation is shown in Listing 3. It inherits from IncrementalProjectBuilder and implements the build() method.

build() takes three arguments:

  • The type of build being requested -- incremental, automatic, or full
  • A map with the build arguments
  • A progress monitor -- essentially a progress bar in the UI

Incremental and auto builds are very similar, the only difference being that an incremental build is requested by the user explicitly, whereas the auto build is launched automatically when the user saves a file. (Whether to use incremental or auto builds is an option in the Project menu.)

Eclipse attempts to track changes to the resources and pass the builder a delta -- a list of resources that have changed since the builder last ran. Note that Eclipse does not guarantee that it will be able to pass a delta. To save memory, Eclipse may discard the delta at any point. If a delta is not available, the builder should perform a full build.

The method returns a list of projects for which the builder would like a delta the next time it runs. For builders working across multiple projects, this may speed up incremental builds.

build() first checks that the project is open and accessible, then it tests the kind of build requested. In practice, it makes no distinction between incremental and automatic builds.
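
The decision logic described in this section can be modeled in plain Java. The sketch below deliberately avoids the Eclipse API so that it runs on its own: BuildKind and BuildPlanner are hypothetical stand-ins rather than Eclipse types, resources are reduced to path strings, and a null delta stands for "Eclipse could not supply a delta."

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the kind of build passed to build().
enum BuildKind { FULL, INCREMENTAL, AUTO }

class BuildPlanner {
    // Returns the paths that need to be rebuilt.
    //   projectAccessible: false if the project is closed or missing
    //   delta:             paths changed since the last build, or null if unavailable
    //   allResources:      every path in the project
    static List<String> plan(boolean projectAccessible, BuildKind kind,
                             List<String> delta, List<String> allResources) {
        if (!projectAccessible) {
            return Collections.emptyList();   // nothing to do on a closed project
        }
        if (kind == BuildKind.FULL || delta == null) {
            return allResources;              // full build, requested or forced
        }
        // Incremental and automatic builds are handled identically.
        return delta;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("index.xml", "about.xml", "news.xml");
        List<String> changed = Arrays.asList("news.xml");
        System.out.println(plan(true, BuildKind.AUTO, changed, all));
        System.out.println(plan(true, BuildKind.INCREMENTAL, null, all));
        System.out.println(plan(false, BuildKind.FULL, null, all));
    }
}
```

The important case is the second call in main(): an incremental build was requested, but without a delta the only safe answer is to rebuild everything.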

Visitors

To process the resources, the builder uses the visitor pattern. The visitor pattern has been documented in the book Design Patterns (see Resources). The visitor pattern implements a walk through a hierarchy in which the visitor specifies an operation to apply to the nodes. However, the visitor does not need to know anything about the node hierarchy.

The visitor pattern is illustrated in Figure 2. The node implements the accept() method, which takes a visitor as a parameter. accept() immediately calls the visit() method on the visitor, passing the node as a parameter. During the call to visit(), the visitor performs the appropriate operation on the node. Finally, the node loops through its descendants and calls their accept() method with the visitor. This ensures that the visitor will visit the entire hierarchy.


Figure 2. The visitor pattern
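
For readers who prefer code to diagrams, here is a minimal, self-contained Java sketch of the pattern as described above. Node, NodeVisitor, and VisitorDemo are hypothetical names chosen for this illustration; they are not Eclipse or XM classes.

```java
import java.util.ArrayList;
import java.util.List;

// The operation to apply to each node.
interface NodeVisitor {
    void visit(Node node);
}

// A node in the hierarchy. The node, not the visitor, knows how to walk the tree.
class Node {
    private final String name;
    private final List<Node> children = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }

    // Adds a child and returns this node so that calls can be chained.
    Node add(Node child) {
        children.add(child);
        return this;
    }

    String name() {
        return name;
    }

    void accept(NodeVisitor visitor) {
        visitor.visit(this);              // first, the visitor operates on this node
        for (Node child : children) {     // then the node passes the visitor to its descendants
            child.accept(visitor);
        }
    }
}

class VisitorDemo {
    // Collects node names in the order in which they are visited.
    static List<String> namesOf(Node root) {
        List<String> names = new ArrayList<>();
        root.accept(node -> names.add(node.name()));
        return names;
    }

    public static void main(String[] args) {
        Node root = new Node("project")
            .add(new Node("src").add(new Node("index.xml")))
            .add(new Node("rules"));
        System.out.println(namesOf(root));   // prints [project, src, index.xml, rules]
    }
}
```

Note that the visitor in namesOf() never touches the list of children. The Eclipse visitor interfaces differ slightly from this sketch: their visit() methods return a boolean that tells the platform whether to descend into the children.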

A builder must implement two visitors, IResourceDeltaVisitor and IResourceVisitor, to implement an incremental and a full build, respectively. In XMBuilder, the visitors are implemented through a single inner class, Visitor.

The implementation of the delta visitor (method visit(IResourceDelta)) filters out deletions. If the call corresponds to a new resource or to a change in an existing resource, it calls visit(IResource).

The visit(IResource) method is where the actual building logic resides. Typically, this method is very simple and calls into a compiler or interpreter. For XM, BatchSupervisor plays the role of the compiler. (If you have been working with the old XM, BatchSupervisor is similar to MoverSupervisor in the old code.) In its current incarnation, BatchSupervisor simply applies a stylesheet to the resource.

Note that before calling the compiler, the visitor tests two flags on the resource: phantom and team private. Phantom resources are temporary files; team private resources are configuration files that should be invisible to builders. For example, version control systems (CVS or Subversion) mark their configuration files as team private. Indeed, you don't want the builder to style versioning data.

While I'm on the topic of flags, the visitor also flags the target folder (where it places output files) as a derived folder. This is an indication to version control that the folder has been generated from other files in the project. Therefore, it can be regenerated, and, to limit the archives, it should not be version controlled.
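
The filtering performed by the two visit() methods can be sketched in plain Java. As before, this is a model and not the XM source: ChangeKind, Resource, Change, and FilteringVisitor are hypothetical stand-ins for the Eclipse types, and the "build" step merely records the path instead of applying a stylesheet. Marking the target folder as derived is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the kind of change recorded in a delta.
enum ChangeKind { ADDED, CHANGED, REMOVED }

// Hypothetical stand-in for a workspace resource and its two flags.
class Resource {
    final String path;
    final boolean phantom;
    final boolean teamPrivate;

    Resource(String path, boolean phantom, boolean teamPrivate) {
        this.path = path;
        this.phantom = phantom;
        this.teamPrivate = teamPrivate;
    }
}

// Hypothetical stand-in for one entry in a delta.
class Change {
    final ChangeKind kind;
    final Resource resource;

    Change(ChangeKind kind, Resource resource) {
        this.kind = kind;
        this.resource = resource;
    }
}

// One class plays both roles, like the single inner class in XMBuilder.
class FilteringVisitor {
    final List<String> built = new ArrayList<>();

    // Delta side (incremental build): ignore deletions, forward everything else.
    void visit(Change change) {
        if (change.kind != ChangeKind.REMOVED) {
            visit(change.resource);
        }
    }

    // Resource side (full build): skip flagged resources, then build.
    void visit(Resource resource) {
        if (resource.phantom || resource.teamPrivate) {
            return;
        }
        built.add(resource.path);   // a real builder would call its compiler here
    }
}

class FilterDemo {
    public static void main(String[] args) {
        FilteringVisitor visitor = new FilteringVisitor();
        visitor.visit(new Change(ChangeKind.CHANGED, new Resource("index.xml", false, false)));
        visitor.visit(new Change(ChangeKind.REMOVED, new Resource("old.xml", false, false)));
        visitor.visit(new Change(ChangeKind.CHANGED, new Resource("CVS/Entries", false, true)));
        System.out.println(visitor.built);
    }
}
```

Because the delta method delegates to the resource method, the flag tests are written once and apply to both incremental and full builds.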

Reporting errors

In the previous installment, I discussed how to report errors in the task list by attaching a marker to a resource. This requires the ability to identify the resource properly. The problem is that the builder needs to call tools, such as compilers or, in this case, XSLT interpreters, that don't use Eclipse resource objects, but system identifiers (URLs) or Java File objects. Listing 4 shows how to reload a resource from a file path.


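Listing 4. Reloading a resource from a file path
import org.eclipse.core.resources.IResource;
import org.eclipse.core.runtime.IPath;
import org.eclipse.core.runtime.Path;

// workspaceRoot is a field initialized elsewhere (presumably the IWorkspaceRoot)
public static Filename createFilenameFromSystemID(String systemID)
{
   IPath path = new Path(systemID);
   // look for a file at that location first...
   IResource resource = workspaceRoot.getFileForLocation(path);
   // ...then fall back to a folder or project
   if(null == resource)
      resource = workspaceRoot.getContainerForLocation(path);
   // null means the location is not in the workspace
   if(null != resource)
      return new EclipseFilename(resource);
   else
      return null;
}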
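
One detail worth checking in your own builder is the form in which the system identifier arrives. If the tool reports a file: URL rather than a plain path, it must be converted before a path is built from it. The following sketch shows one way to do that with the standard Java library only; the SystemIds class and its toPath() method are hypothetical helpers and not part of XM or Eclipse.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

class SystemIds {
    // Converts a system identifier to a file system path.
    // Accepts a file: URL or a plain path; any other URL scheme is rejected.
    static Path toPath(String systemID) {
        if (systemID.startsWith("file:")) {
            // Paths.get(URI) also decodes escapes such as %20.
            return Paths.get(URI.create(systemID));
        }
        if (systemID.contains("://")) {
            throw new IllegalArgumentException("Not a local file: " + systemID);
        }
        return Paths.get(systemID);
    }

    public static void main(String[] args) {
        System.out.println(toPath("file:///tmp/site/index.xml"));
        System.out.println(toPath("/tmp/site/index.xml"));
    }
}
```

The resulting path string could then be handed to a method such as createFilenameFromSystemID(). The example paths assume a UNIX-style file system.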


Conclusion

One of the original premises of the Working XML column was to tackle larger projects than could be easily covered in a standard article in order to address real-life problems, such as refactoring. I hope I have convinced you of the value of your existing code base and that you'll consider refactoring the next time you feel like starting over.


Resources
