Level: Introductory
Wayne Beaton (wbeaton@ca.ibm.com), Senior Software Consultant for AIM Services, IBM Software Group
19 Jul 2001
This article discusses some software development best practices to mitigate the expense of code migration.
Software migration is a process of taking an application developed in one language
or platform and moving it to another language or platform. In the world of enterprise
Java™ application development, software migration is a process of moving an application
between competitive application servers or versions of the same application
server. The standards specified by Java 2 Enterprise Edition (J2EE) go a long
way toward removing subtle distinctions between deployment platforms, making
migration far easier than ever before. In a perfect world, J2EE applications
could easily be moved from one application server to the next without modification.
We don't live in a perfect world.
Migration is a huge topic. This is the first part of a three-part article that
addresses designing for change. This first installment discusses some software
development best practices to mitigate the expense of code migration. The second
installment presents a general plan of attack for effecting migration. The third
installment discusses several common reasons for undertaking migration, along
with advice for making the process as easy as possible.
J2EE, standards and proprietary solutions
The evolution of standards like J2EE and the Java language itself has done
a lot to make migration a relatively easy task. That's one of the primary benefits
of standards. But standards take time to evolve. In the meantime, solutions
are required for a set of problems that grows at an extremely fast rate. In
the early days of server-side Java, for example, a number of different HTML/Java
hybrids were created to solve a problem that was eventually solved by the JavaServer
Pages™ (JSP) standard.
J2EE packages together a large set of specifications that provide a huge set
of services for server-side application development. A J2EE-compliant server
provides almost everything you need to build an enterprise application. As comprehensive
as J2EE is, however, it does not cover everything that you could ever possibly
care about. In the rapidly evolving world of Java development, it is impossible
for any one set of standards to cover everything. Needs change too quickly.
For example, portals and personalization are not currently governed by standards
but are an important part of many applications. Proprietary solutions are developed
to fill gaps in the standards. Sometimes these proprietary solutions evolve
into standards; more often, complementary standards are developed.
You may have chosen your application server, database, middleware, or whatever,
based on some proprietary services that are provided. You may be moving to IBM
WebSphere® to take advantage of some of its proprietary services. These services
are very often what distinguish competitive products from one another. Proprietary
solutions are not inherently bad, but can tie you to a specific vendor and make
future migration and change dependent on that vendor's continued support of
those solutions.
Even services covered by standards are not completely safe. As part of J2EE,
Java Database Connectivity (JDBC) provides a standard mechanism for accessing
relational databases from Java. JDBC encourages the creation of standard Java
code, but does little to stop you from taking advantage of specific database
extensions. The JDBC "standard" is only effective at insulating you
from the specifics of the actual database if you use it with care to avoid proprietary
extensions.
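One way to use JDBC with that kind of care is to keep any vendor-specific SQL in a single place instead of scattering it through the code. The following is only a sketch: the AccountQueries interface and its two implementations are hypothetical names invented for this example and are not part of JDBC. It relies on the fact that Oracle and DB2 use different syntax to limit the number of rows returned.

```java
// Hypothetical sketch: every name here is invented for illustration.
// Vendor-specific SQL lives behind one interface so that the rest of
// the application never sees it.
interface AccountQueries {
    /** SQL that returns at most {@code limit} account ids. */
    String firstAccounts(int limit);
}

// Oracle limits rows with the ROWNUM pseudo-column.
class OracleAccountQueries implements AccountQueries {
    public String firstAccounts(int limit) {
        return "select id from account where rownum <= " + limit;
    }
}

// DB2 limits rows with a FETCH FIRST clause.
class Db2AccountQueries implements AccountQueries {
    public String firstAccounts(int limit) {
        return "select id from account fetch first " + limit + " rows only";
    }
}
```

Where the standard API already covers the need, it is usually the better choice. For row limits, Statement.setMaxRows(int) is part of JDBC itself and requires no vendor-specific SQL at all.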
J2EE is still pretty new. In fact, the number of application servers that fully
support J2EE is relatively small. The number of deployed applications that make
full use of J2EE is smaller still (relatively speaking). A lot of the applications
in production today run on first generation application servers that couldn't
wait for standards to evolve. These first generation application servers had
to make up a lot of their services as they went along.
One first generation application server, ATG Dynamo, implements server-side
applications with JHTML, Droplets, FormHandlers and a proprietary database access
mechanism. While this application server has been upgraded to support some of
the recent standards, many applications built on this server have not kept pace
with the standards. If you're migrating from ATG Dynamo to IBM WebSphere Application
Server, all is not lost; a lot, perhaps most, of your code will likely migrate
without modification. Entire business model classes may migrate without a single
change. Some servlets may require only minor modification. At the very least, individual
methods and blocks of code will likely be reusable. Still, even though a lot
of the code should migrate with only minor modifications, the migration as a whole
is going to take considerable effort.
You need a strategy for taking advantage of proprietary services while reducing
the cost of changing your solution later. Designing for change is a key element
in that strategy.
Preparing for migration
So how do you prepare for migration? The answer is based in classical software
development best practices that transcend Java and even object-oriented programming.
The software development community has known the answer for a long time. Essentially
what it all boils down to is this: reduce coupling within your application.
Coupling is a measure of how much each part of your application code knows about
the other parts. By keeping coupling to a minimum, you isolate the bulk of your
code from the effects of change. The following best practices will help to reduce
the cost of change:
- Conform to standards as much as is possible (in this case, J2EE).
- Avoid proprietary extensions. If you cannot avoid proprietary extensions,
abstract their use to insulate the bulk of the code from future change.
- Structure the application into layers.
- Write and maintain automated tests (these will help during development and
migration).
Best practice: conform to standards
Standards can be tricky because there are so many to choose from. Standards
are only as solid as the community that supports them. J2EE, from all appearances,
is a good standard to put faith in. However, it is important to approach the
use of standards with care. As mentioned previously, even standards like JDBC
can get you into trouble.
Perhaps more important than just conforming to standards is adopting a philosophy
of keeping pace with standards. Standards often go through multiple versions,
which can make conformance more difficult. As J2EE evolves, new features are
added and some older ones are retired. The retirement cycle is usually lengthy,
providing lots of time for application developers to catch up. In Java parlance,
types and methods that are due to retire are marked in javadoc as "deprecated."
Development tools, such as VisualAge® for Java, help you keep up with standards
by identifying the use of deprecated code.
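As a small illustration of what a deprecated method looks like, consider the sketch below. The Account class is hypothetical and invented for this example. The @deprecated javadoc tag is the marker described above; the @Deprecated annotation shown alongside it was only added to the language in Java 5, so it would not have been available in older code.

```java
// Hypothetical class, invented for illustration.
class Account {
    private final long cents;

    Account(long cents) {
        this.cents = cents;
    }

    /**
     * @deprecated replaced by {@link #getBalanceInCents()}, which avoids
     *             floating-point rounding.
     */
    @Deprecated
    double getBalance() {
        return cents / 100.0;
    }

    /** The replacement that callers are expected to move to. */
    long getBalanceInCents() {
        return cents;
    }
}
```

The old method keeps working during the retirement period, and the compiler or development tool flags its callers, which gives you a list of the code that needs updating.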
Make conformance to standards a high priority in your development plan -- not
only for initial development, but as a long-term development plan. As the standards
are updated, work time into your development plan to update your code to maintain
conformance to those standards.
Best practice: abstract, abstract, abstract
Abstractions are a way of putting a single, consistent face on functionality.
Rather than letting every part of your application make direct use of some technology,
make those parts access the technology through an abstract layer. If you later
need to change the technology in question, you only have to change the code
under the abstraction and everything that makes use of the abstraction should
just work.
The amount of effort required to effect a recent migration to IBM WebSphere
Application Server was compounded by the requirement to change from an Oracle
database to DB2®. Further, the competitive application server did not support
JDBC connection pooling using data sources, so the developers built their own
connection pooling solution. Unfortunately, their connection pooling solution
was not consistently used.
In the early days of JDBC, the DriverManager class was used to get a connection
to the database. DriverManager was originally intended for use by applets and
standalone applications that generally required only a single connection. With
server-side Java, things change. As multiple threads concurrently hammer a server,
there may be a need for hundreds or thousands of connections.
Database connections are an expensive resource to build and maintain. A number
of separate efforts were undertaken to provide a pooling mechanism which would
allow multiple threads to share a relatively small number of connections. IBM
WebSphere Application Server, Version 2.0, for example, had a proprietary implementation
of connection pooling. DataSource is a relatively new addition to the JDBC specification
that provides connection pooling.
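The idea behind pooling can be sketched in a few lines. This is not how WebSphere or any DataSource implementation actually works; SimplePool is a hypothetical class invented for this example, and it is generic so that it can run without a database. In practice the pooled type would be java.sql.Connection, and a real pool would also validate connections, time out waiting callers, and so on.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a fixed-size pool of reusable resources that
// many threads can share.
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    /** The pool starts with all resources idle; at least one is required. */
    SimplePool(Collection<T> resources) {
        idle = new ArrayBlockingQueue<>(resources.size(), false, resources);
    }

    /** Waits until a resource is free, then hands it to the caller. */
    T borrow() throws InterruptedException {
        return idle.take();
    }

    /**
     * Returns a borrowed resource so that another thread can use it.
     * Throws IllegalStateException if the pool is already full.
     */
    void release(T resource) {
        idle.add(resource);
    }

    /** Number of resources that are currently idle. */
    int available() {
        return idle.size();
    }
}
```

Because borrow() blocks when nothing is idle, a small number of connections can serve a much larger number of threads, at the cost of some threads waiting their turn.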
At least eighty direct references to DriverManager were spread throughout the
code. Code fragments, similar to the following, were repeated throughout the
application:
DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
String database = DbParameters.getDbName();
String user = DbParameters.getDbUserName();
String password = DbParameters.getDbPassword();
Connection connection =
    DriverManager.getConnection("jdbc:oracle:oci8:@" + database,
        user, password);
PreparedStatement statement =
    connection.prepareStatement("select balance from account where id = ?");
statement.setInt(1, accountId);
...
With the change of database driver, all eighty-odd references had to be changed.
In fact, this code shows the even bigger problem: a custom connection pooling
mechanism is available, but this code has not yet been updated to make use of
it. Imagine instead the impact of a database driver change if the code were
structured something like this:
...
Connection connection = getConnection();
PreparedStatement statement =
    connection.prepareStatement("select balance from account where id = ?");
statement.setInt(1, accountId);
...
Rather than repeat the code in many (or in this case eighty) different places,
the actual code to build the connection is moved to where it can be shared.
The getConnection() method is implemented as
follows:
public Connection getConnection() throws SQLException {
    DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
    String database = DbParameters.getDbName();
    String user = DbParameters.getDbUserName();
    String password = DbParameters.getDbPassword();
    return DriverManager.getConnection("jdbc:oracle:oci8:@" + database,
        user, password);
}
In Martin Fowler's book on Refactoring [Fowler1999], this is referred to as
an "Extract Method" refactoring. This change does not introduce significant
additional code, but adds a lot of value. Not only is the code more easily shared,
it is also more easily read and understood. By reducing the size (and complexity)
of the method, greater semantic meaning is attached (with help from the name
of the new method) which makes this code easier to understand and support. Additional
refactorings will make the getConnection() method
even more useful. By moving the responsibility to implement this method to another
class it can be more easily shared, further reducing code duplication.
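A minimal sketch of that further step might look like the following. The class name ConnectionFactory and its constructor arguments are assumptions made for this example; in the application described here the values would come from DbParameters. Driver registration is left out, on the assumption that the driver has already been registered elsewhere.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical class: the name and constructor arguments are
// assumptions made for this sketch.
class ConnectionFactory {
    private final String database;
    private final String user;
    private final String password;

    ConnectionFactory(String database, String user, String password) {
        this.database = database;
        this.user = user;
        this.password = password;
    }

    /** The only place in the application that knows the driver URL format. */
    String url() {
        return "jdbc:oracle:oci8:@" + database;
    }

    /** Assumes a suitable driver is already registered with DriverManager. */
    Connection getConnection() throws SQLException {
        return DriverManager.getConnection(url(), user, password);
    }
}
```

With one shared object of this kind, a change of driver or database touches a single class rather than every method that needs a connection.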
If the bulk of the code consistently makes use of the getConnection()
method, introducing the custom-built connection pool is easy:
public Connection getConnection() throws SQLException {
    return getConnectionPool().getConnection();
}
The change to use the proprietary connection pool does not require any changes
to the code that uses the getConnection()
method. When the code is moved to a platform like IBM WebSphere Application
Server, standard JDBC connection pooling can easily be employed by updating
the getConnection() method yet again:
public Connection getConnection() throws SQLException {
    return getDataSource().getConnection();
}
Some additional code is required to fully support either connection pooling
mechanism. The point is that the complexity is hidden behind the abstraction
where the bulk of the code need never see it.
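For the DataSource case, the additional code is typically a JNDI lookup. The sketch below shows one way it might be written; the class name PooledConnections is hypothetical, and the JNDI name is whatever your server administrator has configured (the name used in the usage note is only an example). The lookup result is cached so that the naming service is consulted once.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical class, invented for this sketch.
class PooledConnections {
    private final String jndiName;
    private DataSource dataSource; // cached after the first successful lookup

    PooledConnections(String jndiName) {
        this.jndiName = jndiName;
    }

    /** Looks the data source up in JNDI the first time it is needed. */
    synchronized DataSource getDataSource() throws SQLException {
        if (dataSource == null) {
            try {
                dataSource = (DataSource) new InitialContext().lookup(jndiName);
            } catch (NamingException e) {
                // Report lookup failures the same way as other database errors
                throw new SQLException("Cannot find data source " + jndiName, e);
            }
        }
        return dataSource;
    }

    /** Callers see the same method as before; pooling is the server's job. */
    Connection getConnection() throws SQLException {
        return getDataSource().getConnection();
    }
}
```

An instance created with, say, new PooledConnections("java:comp/env/jdbc/BankDS") only works inside a server that binds a data source under that name. Outside such a server the lookup fails and getConnection() throws an SQLException.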
Best practice: structure the application into layers
Layering is a way of grouping an application's code into related subparts.
Each layer contains code that is specific to some function or purpose. A typical
example of layering is the so-called Model/View/Controller (MVC) architecture
which separates model, view and control into layers.
Layers provide many advantages. Most obvious is separation of roles and responsibility.
Content designers are responsible for JSPs, Java developers are responsible
for servlets, JavaBeans™ and EJBs. It goes deeper than that, however: as change
is introduced into an application (migration is just one example of change),
layering helps by isolating entire chunks of code into easier to understand
bits; further, these bits are easy to test. When layering is done well, the
entire business logic of an application can often be migrated with little or
no modification.
What a lot of people fail to understand is what the control layer should do.
It's easy to say that JSPs provide the view, servlets provide control and Java
classes or EJBs provide the model. But how far does "control" go?
Here's an example of an ATG Dynamo FormHandler that blurs the distinction between
control and model:
public boolean handleSubmit(
        DynamoHttpServletRequest request, DynamoHttpServletResponse response)
        throws ServletException, IOException {
    try {
        BankAccount source = findAccount(getSourceId());
        BankAccount destination = findAccount(getDestinationId());
        source.debit(amount);
        destination.credit(amount);
        response.sendLocalRedirect(getSuccessURL(), request);
    } catch (InsufficientFundsException e) {
        response.sendLocalRedirect(getFailureURL(), request);
        return false;
    }
    return true;
}
Please forgive the obvious lack of transaction control and missing exception
handling. The problem is that part of the business process is captured here.
What's described by this method is the process of transferring funds from one
bank account to another. That's a business process, not control. Business process
should really be part of the model layer.
In MVC, "control" is a process of communicating between the view
and the model. The control layer should take care of extracting data, translation
of that data into something the model will understand, invoking behavior on
the model and translating the results of that behavior into something the view
can make use of. The "Facade" pattern [Gamma1995] is a popular way
of encapsulating business process, thereby removing that behavior from the controller.
public boolean handleSubmit(
        DynamoHttpServletRequest request, DynamoHttpServletResponse response)
        throws ServletException, IOException {
    try {
        getTeller().transfer(getSourceId(), getDestinationId(),
            getAmount());
        response.sendLocalRedirect(getSuccessURL(), request);
    } catch (InsufficientFundsException e) {
        response.sendLocalRedirect(getFailureURL(), request);
        return false;
    }
    return true;
}
The "teller" object is responsible for providing behavior in response
to the transfer(...) message. As part of that behavior, it is expected that
an InsufficientFundsException might be thrown (a result that the controller
must translate into something the view understands). The implementation of this
method is another application of the extract method refactoring used previously.
The main point is that the business process behavior is no longer in the
control layer. It is now where it belongs: in the model layer.
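The teller itself is not shown in this article, so the following is only a sketch of what such a Facade might look like. The BankAccount and InsufficientFundsException names come from the example above; the in-memory map, the open(...) and balanceOf(...) methods and the use of double for amounts are assumptions made to keep the sketch self-contained. Transaction control is still omitted.

```java
import java.util.HashMap;
import java.util.Map;

class InsufficientFundsException extends Exception {
    InsufficientFundsException(String message) {
        super(message);
    }
}

class BankAccount {
    private double balance;

    BankAccount(double openingBalance) {
        balance = openingBalance;
    }

    double getBalance() {
        return balance;
    }

    void debit(double amount) throws InsufficientFundsException {
        if (amount > balance) {
            throw new InsufficientFundsException("balance too low");
        }
        balance -= amount;
    }

    void credit(double amount) {
        balance += amount;
    }
}

// Facade: the one object the control layer talks to.
class Teller {
    // Assumption: an in-memory map stands in for real persistence.
    private final Map<Integer, BankAccount> accounts = new HashMap<>();

    void open(int id, double openingBalance) {
        accounts.put(id, new BankAccount(openingBalance));
    }

    double balanceOf(int id) {
        return findAccount(id).getBalance();
    }

    /** The business process that used to live in the form handler. */
    void transfer(int sourceId, int destinationId, double amount)
            throws InsufficientFundsException {
        BankAccount source = findAccount(sourceId);
        BankAccount destination = findAccount(destinationId);
        source.debit(amount);       // fails before any money moves
        destination.credit(amount);
    }

    private BankAccount findAccount(int id) {
        BankAccount account = accounts.get(id);
        if (account == null) {
            throw new IllegalArgumentException("unknown account " + id);
        }
        return account;
    }
}
```

Nothing in this code refers to requests, responses or any application server type, which is what allows it to move between servers unchanged.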
Admittedly, this example is quite simple, but consider that the Facade is now
also responsible for providing transaction boundaries. Using a Facade
has a number of benefits including a reduction of duplicated code and reduced
semantic coupling because the control layer doesn't need to know as much about
the model layer. Perhaps even more interesting is that the use of a Facade
provides a very natural migration path to EJB session beans (see [Brown2000]
and [Sun2001]).
One effect of using a Facade is a reduction of coupling from the controller
to the model. However, it is not the controller that we're trying to protect.
The business model and business process are likely the most easily migrated
parts of an application. By introducing a Facade, the business layer
is protected from the specifics of the controller. An alternate control mechanism
can be introduced with little or no change to the business layer.
Migrating the FormHandler into a servlet for IBM WebSphere Application Server
requires pretty extensive change to the control layer, but no modification to
the model layer.
public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    int sourceId = Integer.parseInt(request.getParameter("sourceId"));
    int destinationId = Integer.parseInt(request.getParameter("destinationId"));
    double amount = Double.parseDouble(request.getParameter("amount"));
    try {
        getTeller().transfer(sourceId, destinationId, amount);
        // ServletContext.getRequestDispatcher requires a path that begins with "/"
        getServletContext().getRequestDispatcher("/success.jsp")
            .forward(request, response);
    } catch (InsufficientFundsException e) {
        getServletContext().getRequestDispatcher("/failure.jsp")
            .forward(request, response);
    }
}
Again, some exception handling is missing. Still, the controller is quite simple,
making the migration process little more than applying some idioms to convert
data and invoke behavior.
Best practice: write and maintain automated tests
Here's an important question: how are you going to know if migration works?
The simple answer is that you have to test. Manual testing can take a long time
and is prone to errors. Manual testing has its place to be sure. Automated tests
are a valuable mechanism for software developers to validate their work on a
very frequent basis.
Automated testing can take many forms. The JUnit testing framework
is a very simple Java-based testing framework that is completely free (under
open-source licensing). "Test-infected" JUnit users develop tests
before developing the code being tested. To the uninitiated, this sounds strange.
But it works. With proper discipline, you are guaranteed to have test code in
place.
If you don't have tests in place now, some effort to provide tests prior to
starting a migration will add a lot of value and save a lot of time. Automated
tests can be used as a mechanism to measure progress. If 90% of your test cases
run on the migrated code, you're very likely close to being 90% complete. More
importantly, tests give you faith that the code runs as expected. Further, tests
can help to put your code in context; if your tests are up-to-date, they document
how the code is intended to be used.
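The idea of using tests to measure progress can be sketched without any framework. The example below deliberately uses plain Java rather than JUnit so that it is self-contained; SimpleAccount and MigrationChecks are hypothetical names invented for this sketch, and a real project would more likely write the same checks as JUnit test cases.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class under test, invented for illustration.
class SimpleAccount {
    private long cents;

    SimpleAccount(long cents) {
        this.cents = cents;
    }

    long balance() {
        return cents;
    }

    void debit(long amount) {
        if (amount > cents) {
            throw new IllegalStateException("insufficient funds");
        }
        cents -= amount;
    }
}

// A tiny runner that reports how many named checks pass.
class MigrationChecks {
    private final Map<String, Runnable> checks = new LinkedHashMap<>();

    MigrationChecks() {
        checks.put("debit reduces balance", () -> {
            SimpleAccount account = new SimpleAccount(500);
            account.debit(200);
            require(account.balance() == 300);
        });
        checks.put("overdraft is rejected", () -> {
            SimpleAccount account = new SimpleAccount(100);
            try {
                account.debit(200);
            } catch (IllegalStateException expected) {
                return; // the rejection is the behavior being checked
            }
            throw new AssertionError("debit should have failed");
        });
    }

    private static void require(boolean condition) {
        if (!condition) {
            throw new AssertionError("check failed");
        }
    }

    int total() {
        return checks.size();
    }

    /** Runs every check and counts the ones that complete without error. */
    int passed() {
        int count = 0;
        for (Runnable check : checks.values()) {
            try {
                check.run();
                count++;
            } catch (RuntimeException | AssertionError e) {
                // counted as a failure
            }
        }
        return count;
    }
}
```

Comparing passed() with total() before and after each migration step gives a rough but honest measure of how much of the application already works on the new platform.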
Conclusion
In this ever-changing world, migration is an important thing to consider. As
we push the envelope on existing standards and delve into proprietary solutions,
we need to consider the impact of those solutions on the future. New standards
and products may arrive to solve these problems. You need to stay in a position
where change has the smallest impact possible on your applications. Code that
is adequately layered survives longer and is more resilient to change.
Layered architectures and abstraction are the best tools you can use to make
inevitable migration easier. It extends beyond this, of course; migration between
competitive application servers is but one form of change that your application
will undergo. Changes to other elements, like databases and middleware, and
to business requirements will occur over the life span of your application.
The amount of effort you spend up front preparing for this inevitable change
will pay off in the long run.
If you think that you don't have time for all this, you're wrong: you don't
have time not to do this. Testing, layering and abstraction bring a lot of the
cost to the front, but over time you save time and money. Easier to understand
code is easier to change, easier to extend, and easier to migrate. All these
things are going to happen, so why not be ready for them?
Resources
About the author
Wayne Beaton is a Senior Software Consultant for AIM Services, IBM Software
Group. Wayne's diverse role involves him in lots of interesting stuff from the
WebSphere Skills Transfer and Migration programs to general consulting. Wayne
likes to spend his free time convincing people that Extreme Programming, Refactoring
and Unit Testing actually work. Wayne can be contacted at wbeaton@ca.ibm.com.