In early 1999, the company I worked for was in the process of forming its system architecture. We needed a distributed management system that would let us communicate between a variety of servers seamlessly. Specifically, we needed a failsafe mechanism for adding and removing remote servers dynamically, without disrupting the rest of the system. And, ideally, we wanted our new management system to act as a thin layer over our existing Remote Method Invocation (RMI) classes.
These days, EJB technology would be the most obvious choice on all counts. But in 1999 the technology was still relatively young. It was buggy and very few people had used it to develop business-class systems. Rather than invest our expertise in learning an untried (and rather complex) programming framework, we designed and built a custom interface, which remains the backbone of our enterprise to this day.
The Distributed Services Management environment, or DSM, is a partial alternative to Enterprise JavaBeans technology. Like an EJB framework, DSM handles such issues as scalability, communication, resource management, and failure management, but with far less programming overhead. While not as far-reaching as EJB technology, it is a far simpler framework to work with. In this article, I'll provide an introduction to DSM. You'll learn about DSM's internal resource management features (such as load balancing and failure management), its architecture, and its programming framework. I'll use code samples to illustrate aspects of programming with DSM, and we'll close with some implementation details from a working DSM environment. You will find the complete DSM class files in the Resources section.
Note that this article assumes you have a good grasp of both distributed application programming and RMI.
The DSM environment provides a simple interface for writing multi-component applications. Application components may reside on different computers within the DSM environment. It should be possible to start and stop each component at any time without affecting the rest of the system.
DSM provides a single access point to all accessible servers in a distributed environment. Constructed as a thin layer over RMI, DSM manages sets of similar remote (RMI) objects. Like an EJB setup, the DSM environment takes care of some tasks for you, allowing you to concentrate more on application development and business logic, and less on the distributed aspects of the code. DSM tracks server activity and usage, and uses the information to transparently handle the following aspects of resource management:
- Dynamic services management
- Availability and redundancy
- Failure management
- Load balancing
We'll talk about each of these functions separately, then move on to the DSM architecture.
Once you have done the initial server setup, the DSM environment transparently manages both local and remote servers for you. When you initiate a new server, the server automatically registers itself with DSM (as shown in Listing 1). Once a server has been added to the server pool, its services can be assigned to client requests. Likewise, once a server is shut down, its services can no longer be assigned by the DSM environment. DSM relies on RMI for all remote method requests made by a DSM client.
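The registration life cycle described above can be pictured with a small in-memory stand-in. This is only a sketch under stated assumptions, not DSM's implementation: the class ServerDirectory and its methods are hypothetical names, and plain strings such as "server-a" stand in for RMI stubs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the bookkeeping a DSM-like pool might do.
// Strings stand in for RMI server stubs.
class ServerDirectory {
    private final Map<String, List<String>> serversByType = new HashMap<>();

    // Called when a server starts: add it to the pool for its service type.
    synchronized void register(String serviceType, String server) {
        List<String> servers =
                serversByType.computeIfAbsent(serviceType, k -> new ArrayList<>());
        if (!servers.contains(server)) {
            servers.add(server);
        }
    }

    // Called when a server shuts down: it can no longer be assigned.
    synchronized void unregister(String serviceType, String server) {
        List<String> servers = serversByType.get(serviceType);
        if (servers != null) {
            servers.remove(server);
        }
    }

    // Snapshot of the servers currently assignable for a service type.
    synchronized List<String> available(String serviceType) {
        return new ArrayList<>(
                serversByType.getOrDefault(serviceType, new ArrayList<>()));
    }
}

class ServerDirectoryDemo {
    public static void main(String[] args) {
        ServerDirectory directory = new ServerDirectory();
        directory.register("EMAIL", "server-a");
        directory.register("EMAIL", "server-b");
        directory.unregister("EMAIL", "server-a");
        System.out.println(directory.available("EMAIL"));
    }
}
```

Keying the pool by service type rather than by server address is what lets servers come and go without clients noticing.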
Because DSM tracks the state and usage of every server in its system and allocates servers based on that information, it offers a high degree of service availability. You can further ensure your services' availability by establishing redundancy in the system. Redundancy means having an idle server that is ready to start work when another one fails.
By default, the DSM environment isn't redundant, because it makes use of all the servers in its pool. If more than one server exists, DSM will balance the servers equally according to the received calls, and if a server fails it simply isn't used. Establishing redundancy would be a matter of creating a custom implementation of the ServiceParameters policy (discussed later). You would simply establish parameters to keep some servers idle until others had failed.
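As a rough illustration of such a policy, the sketch below keeps standby servers idle until no primary server is usable. It does not use the real ServiceParameters class; PooledServer, StandbyPolicy, and the standby flag are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical server record; DSM's own classes are not used here.
class PooledServer {
    final String name;
    final boolean standby;   // true: kept idle until the primaries fail
    boolean failed;

    PooledServer(String name, boolean standby) {
        this.name = name;
        this.standby = standby;
    }
}

class StandbyPolicy {
    // Prefer a healthy primary; fall back to a healthy standby only
    // when no primary is usable.
    static Optional<PooledServer> choose(List<PooledServer> pool) {
        Optional<PooledServer> primary = pool.stream()
                .filter(s -> !s.standby && !s.failed)
                .findFirst();
        if (primary.isPresent()) {
            return primary;
        }
        return pool.stream()
                .filter(s -> s.standby && !s.failed)
                .findFirst();
    }

    public static void main(String[] args) {
        PooledServer primary = new PooledServer("primary", false);
        PooledServer spare = new PooledServer("spare", true);
        List<PooledServer> pool = List.of(primary, spare);

        System.out.println(choose(pool).get().name);
        primary.failed = true;   // simulate a failure report
        System.out.println(choose(pool).get().name);
    }
}
```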
When a failure occurs in a remote server or the server is unreachable for any reason, the client that received the failed server stub reports it to the DSM environment. DSM keeps track of failed servers and checks them periodically to see if they have been reactivated. Failed servers are marked, and are not returned for client requests.
When an application requests a service and the received server fails, it can request the service again in order to accomplish its task. It's up to you to decide how many times a program will re-attempt to access a service type, and under what circumstances it would give up. This feature is possible because DSM's service management architecture is based on service type, rather than specific server URLs. We'll talk more about this later.
When a failed server in the DSM environment is restarted, DSM notes that it has already been used. It then notifies applications that have used that service type in the past that the server is available again.
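The failure bookkeeping described in this section (marking failed servers, withholding them from clients, and re-checking them later) might look roughly like the following. FailureTracker and its methods are hypothetical, and the liveness probe is passed in as a Predicate so the sketch runs without a network; notifying past clients is left out.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical tracker; server names stand in for RMI stubs.
class FailureTracker {
    private final Set<String> active = new LinkedHashSet<>();
    private final Set<String> failed = new LinkedHashSet<>();

    synchronized void register(String server) {
        failed.remove(server);
        active.add(server);
    }

    // A client reports a stub that did not respond.
    synchronized void reportFailure(String server) {
        if (active.remove(server)) {
            failed.add(server);
        }
    }

    // Intended to run periodically: move servers that answer the
    // probe back into the active set.
    synchronized void recheck(Predicate<String> isAlive) {
        failed.removeIf(server -> {
            if (isAlive.test(server)) {
                active.add(server);
                return true;
            }
            return false;
        });
    }

    synchronized boolean isAvailable(String server) {
        return active.contains(server);
    }

    public static void main(String[] args) {
        FailureTracker tracker = new FailureTracker();
        tracker.register("server-a");
        tracker.reportFailure("server-a");
        System.out.println(tracker.isAvailable("server-a"));
        tracker.recheck(server -> true);   // pretend the probe succeeded
        System.out.println(tracker.isAvailable("server-a"));
    }
}
```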
By default, DSM balances calls evenly across different servers that implement similar interfaces. You can also apply unique load balancing parameters to enhance DSM's overall service usage. By defining a ServiceParameters policy, you can change the way the load is balanced among servers in the DSM pool. Parameters such as physical location, machine resources, and resource usage are good criteria for load balancing decisions. When ServiceParameters are used, the DSM API receives the criteria for the requested server in addition to the server type requested.
DSM abstracts groups of similar servers as a single virtual server. When a program needs to make use of a server, it requests a server of a specific type from the DSM environment. DSM returns a single server's stub from the available server pool and the program can then carry out its call.
DSM defines a Service object, which represents a group of one or more RMI objects that implement the same interface. The Service object is a virtual entity until DSM receives a service request. Upon receiving a request, DSM returns one of the RMI objects in its pool. DSM tracks the status and usage of its servers, and uses that information to decide which server object to return.
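One way to picture a virtual service that hands out the least-used healthy server is sketched below. The names VirtualService and TrackedServer are hypothetical, and a simple call counter is assumed as the usage measure; DSM's actual bookkeeping may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical usage record for one server behind a virtual service.
class TrackedServer {
    final String name;
    int calls;        // how many times this server has been handed out
    boolean failed;

    TrackedServer(String name) {
        this.name = name;
    }
}

class VirtualService {
    private final List<TrackedServer> servers = new ArrayList<>();

    synchronized void add(TrackedServer server) {
        servers.add(server);
    }

    // Return the healthy server handed out least often and count the use.
    // Returns null when no healthy server exists.
    synchronized TrackedServer getServer() {
        TrackedServer chosen = servers.stream()
                .filter(s -> !s.failed)
                .min(Comparator.comparingInt((TrackedServer s) -> s.calls))
                .orElse(null);
        if (chosen != null) {
            chosen.calls++;
        }
        return chosen;
    }

    public static void main(String[] args) {
        VirtualService email = new VirtualService();
        email.add(new TrackedServer("a"));
        email.add(new TrackedServer("b"));
        for (int i = 0; i < 4; i++) {
            System.out.println(email.getServer().name);
        }
    }
}
```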
The DSM architecture consists of the following components:
- A client is a program that wants to use a DSM service. A client may also be a server within the system.
- A server implements a remote interface managed by the DSM mechanism through the Service class. Service is a class whose instances are components that are managed by the DSM environment. Each DSM Service is a subclass of the Java dsm.Service class. A server may also be a client within the system.
- The DSM mechanism is a transparent management mechanism that manages all the services in your environment. Like the EJB environment, the DSM mechanism includes resource management features that are transparent to the application programmer.
The DSM mechanism is designed to manage multiple remote servers in a heterogeneous, distributed network. In the sections that follow, you'll get a better idea of how the mechanism actually works, as we delve into some of the client- and server-side aspects of programming with DSM.
For each DSM service, you start by subclassing the basic Java class dsm.Service. In the Service subclass, you override the Service.getServiceTypeName() methods. This enables the DSM environment to distinguish the service type and name from other services. The service name and type must be unique for each type of interface. The remainder of the implementation is like writing an RMI server program, as follows:
- Write a remote interface that extends the rmi.Remote interface. Each method in the interface will throw
- Define a class that implements the interface. This class will extend the Service class instead of the UnicastRemoteObject class in RMI.
- Compile the stub and skeleton with the
Each of the Server class methods is implemented as it would be in RMI, and each of the implemented methods throws RemoteException. See Resources to learn more about writing RMI server programs.
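The server-side steps above can be sketched as follows. This is not the article's listing: the abstract Service class here is a hypothetical stand-in for dsm.Service (it exports nothing, so the example runs without a network), and the sendMail() signature is an assumption. The names SendMail, EmailService, SERVICE_EMAIL, and getServiceTypeName() are taken from the text.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// A remote interface: it extends java.rmi.Remote and every method
// declares RemoteException. The parameter list is an assumption.
interface SendMail extends Remote {
    boolean sendMail(String to, String subject, String body) throws RemoteException;
}

// Hypothetical stand-in for dsm.Service. In the article, Service is
// extended in place of UnicastRemoteObject; this stub only carries
// the type name and does not export a remote object.
abstract class Service {
    abstract String getServiceTypeName();
}

// The implementation extends Service and implements the remote interface.
class EmailService extends Service implements SendMail {
    static final String SERVICE_EMAIL = "SERVICE_EMAIL";

    @Override
    String getServiceTypeName() {
        return SERVICE_EMAIL;
    }

    @Override
    public boolean sendMail(String to, String subject, String body) throws RemoteException {
        if (to == null || !to.contains("@")) {
            throw new IllegalArgumentException("invalid recipient: " + to);
        }
        // A real service would hand the message to a mail transport here.
        return true;
    }
}

class EmailServiceDemo {
    public static void main(String[] args) throws RemoteException {
        EmailService service = new EmailService();
        System.out.println(service.getServiceTypeName());
        System.out.println(service.sendMail("user@example.com", "Hi", "Hello"));
    }
}
```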
ServicePool enables you to call the different services in your server pool. ServicePool holds all the servers for each of the Service types for you. A call to ServicePool.getServer(SERVICE_TYPE) will return an appropriate RMI server stub of the requested type from the DSM environment. You can then call the RMI stub normally. If a failure occurs, you are expected to report the failure using the call ServicePool.reportFailure(stub). You can then call ServicePool.getServer(SERVICE_TYPE) again to get another stub from the DSM environment.
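The request, report, and retry cycle can be sketched with in-memory stand-ins. The real ServicePool is not used here: SketchPool, Mailer, and RetryingClient are hypothetical names, and the bound on attempts is a choice the text leaves to the application.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a remote stub.
interface Mailer {
    void send(String to) throws Exception;
}

// Hypothetical stand-in for the pool of one service type.
class SketchPool {
    private final Deque<Mailer> healthy = new ArrayDeque<>();
    final Set<Mailer> failed = new HashSet<>();

    void add(Mailer mailer) {
        healthy.addLast(mailer);
    }

    // Analogue of ServicePool.getServer(SERVICE_TYPE); null if none is left.
    Mailer getServer() {
        return healthy.peekFirst();
    }

    // Analogue of ServicePool.reportFailure(stub): stop handing it out.
    void reportFailure(Mailer mailer) {
        healthy.remove(mailer);
        failed.add(mailer);
    }
}

class RetryingClient {
    // Try up to maxAttempts servers, reporting each failure before retrying.
    static boolean sendWithRetry(SketchPool pool, String to, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Mailer server = pool.getServer();
            if (server == null) {
                return false;               // pool exhausted
            }
            try {
                server.send(to);
                return true;
            } catch (Exception e) {
                pool.reportFailure(server); // mark it, then ask again
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SketchPool pool = new SketchPool();
        pool.add(to -> { throw new Exception("connection refused"); });
        pool.add(to -> System.out.println("sent to " + to));
        System.out.println(sendWithRetry(pool, "user@example.com", 3));
    }
}
```

Because the client asks for a service type rather than a particular server, the retry needs no knowledge of which machine answers the second request.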
Listing 1 shows a client-server interaction for sending and transmitting an e-mail through an e-mail service remotely. Look at the code first and then we'll talk about the details.
Listing 1. A DSM program for sending an e-mail remotely
On the server side, EmailService implements the SendMail interface, which in turn extends the rmi.Remote interface (not shown). The EmailService class extends the dsm.Service class of the DSM environment. The two implemented methods, getServiceTypeName(), distinguish this type of service from others in the environment. The sendMail() method of the remote interface SendMail is implemented, and RemoteException is thrown. Finally, the server is initiated by calling the registerServer() method of the Service class. The registerServer() method registers the server implementation with the DSM environment (the constructor isn't shown). Once a Server of type EmailService is initiated and registered, it is managed by the DSM environment and can be used by requesting clients.
One such client is shown in the second column of Listing 1. The client uses a processMailRequest() method to request a server of type SERVICE_EMAIL from the DSM environment. Once a server is returned by the DSM environment, the client calls it by invoking the sendMail() method (just like invoking a method of a remote object in RMI). If the call is unsuccessful, the client reports the failure to the DSM environment with the ServicePool.reportFailure() method. The client can then request another
The DSM mechanism tracks server failure, activity, and usage, using the information to manage resources throughout the DSM environment. Among its resource-management features is load balancing. The load balancing mechanism includes both default and customizable behaviors for balancing server usage.
Servers are grouped according to type. Upon receiving a call for service (the ServicePool.getServer() method), the default load balancing method returns the least-used server to the requesting client. You can customize the DSM mechanism's load-balancing behavior by extending the class
Listing 2 shows a customized load balancing solution. Have a look at the code and then we'll talk about how it works.
Listing 2. A customized load-balancing solution
In Listing 2, the load-balancing mechanism has been fine-tuned to prioritize factors of location and volume. Public class ServiceParameters. The set parameters help the DSM mechanism choose among mail servers in different locations and running on networks with different bandwidth volumes.
In this example, e-mail servers will be prioritized for selection based on e-mail suffix and bandwidth volume (simplified to the number of concurrent e-mails to be sent). Upon receiving a request for e-mail service, the DSM mechanism will first select a server that has the same suffix as the e-mail address requested by the client and a current greater-than-zero bandwidth volume. Servers that do not fit these parameters will be labeled POSSIBLE_CRITERIA and the search will continue for a better server.
Upon choosing a server, DSM calls incrementLoad(). If no possible match is found, the method loadReset() is called for all the Servers of that Service type, and a new search is conducted. Listing 3 shows how you might add ServiceParameters to the EmailService program shown in Listing 1.
Listing 3. ServiceParameters for the EmailService program
Note that when the search for ServiceParameters is conducted, parameters are cached locally and not transmitted each time.
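The selection procedure described in this section (prefer a matching suffix with spare volume, fall back to a possible server, call incrementLoad() on the choice, and call loadReset() before searching again) can be sketched as below. This is not the article's Listing 2 or 3 and does not extend the real ServiceParameters class. Only POSSIBLE_CRITERIA, incrementLoad(), and loadReset() are names from the text; the other names, and the assumption that a server with no remaining volume counts as no match, are illustrative.

```java
import java.util.List;

// Only POSSIBLE_CRITERIA is named in the article; the other two
// constants are hypothetical.
enum Match { MATCH_CRITERIA, POSSIBLE_CRITERIA, NO_MATCH }

// Hypothetical parameters for one mail server.
class MailServerParameters {
    final String name;
    final String suffix;     // address suffix this server prefers
    final int capacity;      // concurrent e-mails allowed
    private int remaining;

    MailServerParameters(String name, String suffix, int capacity) {
        this.name = name;
        this.suffix = suffix;
        this.capacity = capacity;
        this.remaining = capacity;
    }

    Match compare(String address) {
        if (remaining <= 0) {
            return Match.NO_MATCH;   // assumed: no volume left
        }
        return address.endsWith(suffix) ? Match.MATCH_CRITERIA : Match.POSSIBLE_CRITERIA;
    }

    void incrementLoad() { remaining--; }

    void loadReset() { remaining = capacity; }
}

class MailServerSelector {
    // Returns the chosen server, or null if none can ever match.
    static MailServerParameters select(List<MailServerParameters> servers, String address) {
        for (int pass = 0; pass < 2; pass++) {
            MailServerParameters fallback = null;
            for (MailServerParameters s : servers) {
                Match m = s.compare(address);
                if (m == Match.MATCH_CRITERIA) {
                    s.incrementLoad();
                    return s;
                }
                if (m == Match.POSSIBLE_CRITERIA && fallback == null) {
                    fallback = s;    // keep looking for a better server
                }
            }
            if (fallback != null) {
                fallback.incrementLoad();
                return fallback;
            }
            // Nothing usable: reset every server's load and search again.
            for (MailServerParameters s : servers) {
                s.loadReset();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<MailServerParameters> servers = List.of(
                new MailServerParameters("com-server", ".com", 1),
                new MailServerParameters("org-server", ".org", 1));
        System.out.println(select(servers, "a@example.com").name);
        System.out.println(select(servers, "b@example.com").name);
        System.out.println(select(servers, "c@example.com").name);
    }
}
```

In the demo, the second request falls back to the .org server because the .com server's single slot is taken, and the third request triggers a reset before the .com server is chosen again.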
DSM was written as the backbone structure for a highly active telemessaging environment. One of the first considerations for implementing the DSM environment is that it is written over RMI. Therefore, as previously mentioned, the DSM Service class extends
The DSM environment is also location transparent, which means servers' URL locations are unknown by the calling client. To support this feature, Server URLs are internally managed in DSM. Server URLs (which are RMI object URLs) are constructed automatically by the DSM environment at startup. The typical URL structure is as follows:
The number parameter is optional; it can be set by DSM in case another object of the same name is detected on the machine. Servers register their type and URL (and optional parameters) using a mechanism that is similar to the
For the purpose of resource management, the DSM environment keeps track of all server usage and server status. DSM stores this information, along with each Server's URL and ServiceParameters, through the DSMRegistry program, which is much like RMI's RMIRegistry program. Although the two programs are similar, DSMRegistry does not replace RMIRegistry. In the DSM environment, every computer running an RMI server or a DSM server requires a running RMIRegistry program. Only one instance of DSMRegistry is required to execute for the entire environment, but running more will yield better resilience and balancing.
At TeleMessage, we first implemented the DSMRegistry as a stand-alone service, holding all of the servers' information in its memory. Eventually, we upgraded the DSMRegistry so that its data (the URL, status, and service parameters of every server in the DSM environment) was stored and retrieved through an Oracle database. Having our data stored in a separate database allows us to run several DSMRegistry services on different machines. This allocation improves overall system performance, and also removes the issue of dependency on one machine.
Note that because not every computer runs DSMRegistry, the name of a DSMRegistry server must be placed in a startup file for the initiation process.
We found it useful to separate the TeleMessage project into different areas of programming responsibility, which we then encapsulated as individual Services for the DSM architecture to manage. For example, the database code of our TeleMessage application is encapsulated in a Service of its own, so it does not mingle with the code in other services. Unlike the EJB approach, there is no database code within the different parts of the TeleMessage system. Instead, database calls are made through a simple interface to a DSM service (or an RMI service, in
This same type of encapsulation applies to Web and JSP servers, as well as telephony servers, text-to-speech engines, SMS servers, email servers, and more. In each area of programming, the code is handled by the group that is responsible for it, and other groups can access it through a simple, service-oriented interface. Separation of responsibility simplified our overall development process considerably. We gained from the fact that our programmers did not need to master EJB or any other complex environment in order to write the services for our operation.
Persistence (that is, failure management and recovery) is an essential aspect of transaction handling in a distributed environment. The current DSM implementation does not resolve issues of persistence, but leaves them to the application programmer. In the TeleMessage environment, we use our Database component for persistence. The TeleMessage Database component, as described previously, operates as a service in the DSM environment. In any case where data persistence may be jeopardized, a single method serves to encapsulate the sequence of calls. A single encapsulated call then either entirely succeeds or entirely fails, and partially modified data is restored.
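The all-or-nothing idea can be illustrated with an in-memory stand-in for the database service. MessageStore, saveMessage(), and the row layout are hypothetical; a real implementation would rely on database transactions rather than a copied map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a database-backed service.
class MessageStore {
    private Map<String, String> rows = new HashMap<>();

    // One method wraps the whole sequence of writes. If any step
    // fails, the snapshot taken at the start is restored.
    synchronized boolean saveMessage(String messageId, String body, List<String> recipients) {
        Map<String, String> snapshot = new HashMap<>(rows);
        try {
            rows.put("message:" + messageId, body);
            for (String recipient : recipients) {
                if (recipient == null || recipient.isEmpty()) {
                    throw new IllegalArgumentException("empty recipient");
                }
                rows.put("recipient:" + messageId + ":" + recipient, "PENDING");
            }
            return true;
        } catch (RuntimeException e) {
            rows = snapshot;   // undo the partial changes
            return false;
        }
    }

    synchronized int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        MessageStore store = new MessageStore();
        System.out.println(store.saveMessage("m1", "hello", List.of("a", "b")));
        System.out.println(store.saveMessage("m2", "hello", List.of("c", "")));
        System.out.println(store.rowCount());
    }
}
```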
The implementation code for the DSM environment includes three folders, which you will find in the Resources section:
- dsm includes the Service, ServicePool, ServiceParameters, and
- dsm.registry includes the
- dsm.impl includes the DSM environment implementation.
The DSM environment provides a simple mechanism to develop and execute multicomponent enterprise applications in a distributed environment. DSM is a simple, lightweight management layer implemented over RMI. While it does not offer all the same functionality as Enterprise JavaBeans technology, the DSM mechanism does transparently handle many aspects of enterprise programming for you. Features of the environment include dynamic addition and removal of components, location independence, load balancing, and system-level failsafe mechanisms.
DSM is a homegrown development environment. It requires some learning overhead, but it's not nearly as demanding as EJB technology in this regard. Because DSM is written over Java and RMI, its programming structure should be relatively familiar to Java programmers. Its simplicity also makes DSM relatively easy to customize and extend as your needs change. It has been my experience that DSM serves exceptionally well as the backbone of the TeleMessage enterprise. If nothing else, DSM shows that alternatives to EJB technology do exist, and that the idea of building your own development environment is not so far-fetched as some might think.
The author would like to thank software engineers Amichai Rothman and Yaniv Kunda of TeleMessage, who have developed some of the key DSM infrastructure implementation.
- Humphrey Sheil's "To EJB, or not to EJB?" (JavaWorld, December 2001) is an even-handed discussion of the pros and cons of programming with Enterprise JavaBeans technology.
- EOB is implemented over Excalibur AltRMI, an interesting RMI alternative from Jakarta.
- Kyle Gabhart's J2EE pathfinder series (developerWorks, 2003) is a good place to learn about how various J2EE technologies compare to EJB technology.
- One of Brett McLaughlin's EJB best practices columns, "EJB best practices: Speed up your RMI transactions with value objects" (developerWorks, September 2002), talks specifically about the problems that come up when you combine RMI and EJB technology -- and also offers a nice workaround.
- To learn more about RMI, start with Sun's Remote Method Invocation homepage.
- Still have questions about DSM? Ask the DSM design and development team!
- You'll find hundreds of articles about every aspect of Java programming in the developerWorks Java technology zone.
Noam Camiel is Chief Technology Officer and cofounder of TeleMessage, a universal messaging services company. From 1996 till 1999, Noam served in the elite technical unit of the Intelligence Corps in the Israel Defense Forces as a senior project manager in the major R&D group. Prior to this, he worked as a computer programmer in the same unit. Noam gained his MSc degree in computer sciences from the Hebrew University in Jerusalem, with his thesis on shared objects in Java. During his degree, Noam was a part of the POPCORN project for distributing computations through the Internet using the Web and Java. The project was exhibited at the WWW6 conference in Santa Clara, CA, in 1997, and at the "Mascots conference" in Amsterdam, Holland, in May 1998. Noam also holds a BSc in Computer Science and Physics from the Hebrew University, Jerusalem.