Normally, whenever an abnormal event occurs in any Java server during a transaction, the server logs the exception or error and aborts the current transaction. The server cannot resolve the problem dynamically and then continue handling new transactions, so human intervention is unavoidable. Until an administrator analyzes and fixes the problem, the system behaves the same way each time the problem recurs. This manual process takes time, which can be very costly for critical businesses, and human intervention always carries the chance of introducing new errors.
The problem in this scenario has a substantial impact on how the application is deployed and run on the server. The extent of the impact depends on the severity of the exceptions and errors that occur in the underlying server. A few instances where the problem might occur are:
- ClassNotFoundException because a referred class in the classpath doesn't exist.
- FileNotFoundException because a particular file at the specified path or directory doesn't exist.
- OutOfMemoryError because of insufficient memory to execute a transaction.
For any of these exceptions or errors, the server throws the exception and logs it. The normal flow of the server when an exception occurs is:
- The server propagates the exception or error to the Java Virtual Machine (JVM).
- It writes the trace information (which might not be complete) to the log file, if configured to do so.
Note that the server follows this same flow every time, even when the same exception recurs because of the same problem (for example, a ClassNotFoundException caused by one particular missing class).
The aim today is to automate the handling of such problems as much as possible.
The methodology suggested in this article turns a Java-based server into a non-stop server by inserting a few corrective steps into the normal flow. These steps are:
- The server throws an exception or error and logs it.
- A self-healing module of the proposed autonomic computing system gathers the exception information by parsing the log file.
- An analyzer module analyzes the cause and decides the necessary action to be taken using the predefined policies.
- The healing module takes the appropriate action.
Figure 1 shows the flow in both a non-autonomic and autonomic computing environment.
Figure 1. Flows in non-autonomic and autonomic computing environment
The proposed Model for Self-Managing Java Server, an autonomic computing model for self-managing Java servers, has the following components:
- Analyzer: Used to analyze the cause of the problem. The analyzer monitors a particular destination (LogFile, Console, and so on) for any changes and analyzes the cause. For example, LogAnalyzer monitors the log file and analyzes the cause, and DynamicAnalyzer analyzes exceptions thrown by the JVM.
- PolicyManager: Reads the rules from the policy file and instructs the HealingManager on which Healer should be delegated a particular operation. Healers also interact with this module to decide on the action to take as part of the healing process.
- HealingManager: Manages the overall function of the model. It coordinates the whole process by interacting with the other components: the PolicyManager, the different Analyzers, and the Healers.
- Healers: Cater to different problems. For example, the ClassNotFoundHealer fixes the ClassNotFoundException, and the FileNotFoundHealer fixes the FileNotFoundException.
- Policies: A predefined set of rules to fix the problem. They can be stored in a file, for example an XML file.
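The way these components could fit together can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the demo's actual code: the class and method names (`HealingDispatchSketch`, `healerFor`, `register`, `handle`) are hypothetical, the policy is read from an in-memory string in Java properties format rather than from a file, and each Healer is reduced to a function that returns a description of its action.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Sketch of policy-driven delegation; every name here is illustrative. */
class HealingDispatchSketch {

    /** A Healer takes corrective action for one kind of problem. */
    interface Healer {
        String heal(String detail);
    }

    /** Looks up, in a properties-style policy, which healer handles an exception type. */
    static class PolicyManager {
        private final Properties rules = new Properties();

        PolicyManager(String policyText) {
            try {
                rules.load(new StringReader(policyText));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        String healerFor(String exceptionType) {
            return rules.getProperty(exceptionType); // null when no rule matches
        }
    }

    /** Coordinates the process: asks the PolicyManager, then delegates to a Healer. */
    static class HealingManager {
        private final PolicyManager policies;
        private final Map<String, Healer> healers = new HashMap<>();

        HealingManager(PolicyManager policies) {
            this.policies = policies;
        }

        void register(String name, Healer healer) {
            healers.put(name, healer);
        }

        String handle(String exceptionType, String detail) {
            String name = policies.healerFor(exceptionType);
            Healer healer = (name == null) ? null : healers.get(name);
            if (healer == null) {
                return "no healer for " + exceptionType;
            }
            return healer.heal(detail);
        }
    }

    public static void main(String[] args) {
        PolicyManager pm = new PolicyManager(
            "java.lang.ClassNotFoundException=ClassNotFoundHealer\n"
          + "java.io.FileNotFoundException=FileNotFoundHealer\n");
        HealingManager hm = new HealingManager(pm);
        hm.register("ClassNotFoundHealer", d -> "locate JAR for " + d);
        hm.register("FileNotFoundHealer", d -> "restore file " + d);
        System.out.println(hm.handle("java.lang.ClassNotFoundException", "com.example.Missing"));
        System.out.println(hm.handle("java.lang.OutOfMemoryError", ""));
    }
}
```

Keeping the exception-to-healer mapping in a policy rather than in code means a new Healer can be added by registering it and adding one rule, without touching the HealingManager.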
As mentioned, an autonomic computing model has four characteristics -- self configuration, self healing, self optimization, and self protection. The Self-Managing Java Server model concentrates on self configuration and self healing. Now, we'll tie the different components of the Self-Managing Java Server to the components of an autonomic computing system.
According to autonomic computing architecture, implementing self-managing attributes involves an intelligent control loop. This loop collects information from the system, makes decisions, then adjusts the system as necessary. The control loops can be organized into two major elements:
- Managed element: What the autonomic manager is controlling.
- Autonomic manager: A component that implements a particular control loop.
Figure 2 shows the main components of autonomic computing.
Figure 2. Components of autonomic computing
Analyzer, PolicyManager, HealingManager, and the Healers together constitute the Autonomic Manager. Policies in a file are the Knowledge base. Any Java server is the Managed Element that is being managed by the Autonomic Manager.
The Analyzer monitors the log file for any exceptions, analyzes each exception using the Policies (Knowledge), then delegates the exception to the HealingManager for planning. The HealingManager gets its input from the PolicyManager and decides which Healer to call to take corrective action for the problem. The delegated Healer then executes the corrective action.
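The control loop itself (collect information, decide, adjust) can be expressed generically. The following is a minimal sketch, assuming the three stages are passed in as plain functions; the class name `ControlLoopSketch`, the method `runOnce`, and the helper `plan` are hypothetical and do not come from the autonomic computing architecture or from the demo.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/** Generic sketch of an autonomic control loop; names are illustrative only. */
class ControlLoopSketch<S, A> {
    private final Supplier<S> monitor;             // collects information from the managed element
    private final Function<S, Optional<A>> decide; // analyzes the symptom and plans an action
    private final Consumer<A> execute;             // adjusts the managed element

    ControlLoopSketch(Supplier<S> monitor,
                      Function<S, Optional<A>> decide,
                      Consumer<A> execute) {
        this.monitor = monitor;
        this.decide = decide;
        this.execute = execute;
    }

    /** Runs one pass of the loop; returns true if an action was executed. */
    boolean runOnce() {
        S observation = monitor.get();
        Optional<A> action = decide.apply(observation);
        action.ifPresent(execute);
        return action.isPresent();
    }

    /** Hypothetical planning rule standing in for the policies (the knowledge). */
    static Optional<String> plan(String symptom) {
        if ("java.lang.ClassNotFoundException".equals(symptom)) {
            return Optional.of("copy missing JAR and restart");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Deque<String> symptoms = new ArrayDeque<>();
        symptoms.add("java.lang.ClassNotFoundException");
        List<String> applied = new ArrayList<>();

        ControlLoopSketch<String, String> loop = new ControlLoopSketch<String, String>(
            symptoms::poll, ControlLoopSketch::plan, applied::add);

        System.out.println(loop.runOnce()); // one symptom is queued, so an action runs
        System.out.println(loop.runOnce()); // the queue is now empty, so nothing runs
        System.out.println(applied);
    }
}
```

In terms of the model, the Analyzer would supply the `monitor` stage, the HealingManager and PolicyManager the `decide` stage, and a Healer the `execute` stage.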
The Self-Managing Java Server model normally works using the following steps:
- The Analyzer is configured to poll a target medium for exceptions or errors. Once an exception or error occurs, it analyzes the cause and reports it to the HealingManager.
- From the information it gets from the Analyzer, the HealingManager uses the PolicyManager to decide which Healer to delegate the problem to.
- Once control is delegated to a particular Healer, that Healer receives its rules from the PolicyManager and fixes the problem.
- The entire process is logged into a file for reference.
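The first of these steps, polling a log file and extracting the exception from it, might look like the sketch below. It rests on several assumptions: the log is UTF-8 text, an exception appears on a single line in the form `type: detail`, and only complete lines are present when the file is read. The class name `LogAnalyzerSketch`, the regular expression, and the `type|detail` result format are all hypothetical, not taken from the demo.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a polling log analyzer; names and formats are illustrative. */
class LogAnalyzerSketch {
    // Matches for example "java.lang.ClassNotFoundException: com.example.Missing"
    private static final Pattern EXCEPTION_LINE =
        Pattern.compile("([\\w.$]+(?:Exception|Error)):\\s*(\\S+)");

    private final Path logFile;
    private long offset = 0; // number of bytes already analyzed

    LogAnalyzerSketch(Path logFile) {
        this.logFile = logFile;
    }

    /** Returns "type|detail" for a line that reports an exception, else null. */
    static String parseLine(String line) {
        Matcher m = EXCEPTION_LINE.matcher(line);
        return m.find() ? m.group(1) + "|" + m.group(2) : null;
    }

    /** Reads whatever was appended since the last poll and returns the findings. */
    List<String> poll() {
        List<String> findings = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(logFile.toFile(), "r")) {
            if (raf.length() < offset) {
                offset = 0; // the log was truncated or rotated: start over
            }
            raf.seek(offset);
            byte[] fresh = new byte[(int) (raf.length() - offset)];
            raf.readFully(fresh);
            offset += fresh.length;
            for (String line : new String(fresh, StandardCharsets.UTF_8).split("\\R")) {
                String finding = parseLine(line);
                if (finding != null) {
                    findings.add(finding);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return findings;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("analyzer-sketch", ".log");
        LogAnalyzerSketch analyzer = new LogAnalyzerSketch(log);
        Files.write(log,
            "INFO server started\njava.lang.ClassNotFoundException: com.example.Missing\n"
                .getBytes(StandardCharsets.UTF_8),
            StandardOpenOption.APPEND);
        System.out.println(analyzer.poll()); // findings from the newly appended lines
        System.out.println(analyzer.poll()); // nothing new since the previous poll
        Files.delete(log);
    }
}
```

A real analyzer would call `poll()` on a timer and pass each finding to the HealingManager. Tracking the byte offset keeps each poll proportional to the newly written text rather than to the size of the whole log.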
The demo tracks ClassNotFoundException and takes corrective action by searching for the required class file in a predefined directory and placing the corresponding JAR file in the CLASSPATH.
- The demo uses the Tomcat server as the Java server that needs to be monitored for exceptions.
- A sample servlet (periServlet) tries to load the class (com.ibm.autonomic.resources.ToBeLoaded) that is not in the lib directory of the Tomcat server, thereby creating a ClassNotFoundException.
- This exception is logged in the SMSDemo.log file.
- The SMSDemo.log file is being monitored by the LogAnalyzer at a predefined interval (default of 3000 milliseconds).
- The LogAnalyzer extracts the exception type (ClassNotFoundException) and the class name (com.ibm.autonomic.resources.ToBeLoaded), and hands them to the HealingManager.
- The HealingManager determines that the current exception should be passed on to the ClassNotFoundHealer by going through the policies in the healing.delegation.policy file. The policies are stored as a plain properties file. A sample policy is shown below.
    # Policy file for HealingManager to decide on delegation to a Healer
    java.lang.ClassNotFoundException=com.ibm.autonomic.healers.ClassNotFoundHealer
    java.lang.NoClassDefFoundError=com.ibm.autonomic.healers.ClassNotFoundHealer
- The ClassNotFoundHealer goes through its policies and searches for the required class file by recursively scanning all directories under the predefined location for a JAR file that contains it. If the JAR file is found, the Healer copies it into the Tomcat lib directory (%tomcat_home%\common\lib) and restarts the Tomcat server.
A sample Healer policy file is shown below.
    # ClassNotFoundHealer specific policies
    # From where to search recursively for the class file missing (within a JAR)
    SEARCH_LOCATION=C:\\SMSHome\\build
    # Place to copy the JAR file so that the server can add it to its classpath
    COPY_LOCATION=C:\\jakarta-tomcat-4.1.27\\common\\lib
- When the periServlet tries to load the class (com.ibm.autonomic.resources.ToBeLoaded) again, no exception is generated because the JAR file is copied into the CLASSPATH.
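The search-and-copy step performed by the ClassNotFoundHealer could be approximated as follows. This is a hedged sketch rather than the demo's implementation: the class `JarLocatorSketch` and its methods are hypothetical, the search and copy locations are passed as arguments instead of being read from the Healer policy file, and restarting the server is left out entirely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Optional;
import java.util.jar.JarFile;
import java.util.stream.Stream;

/** Sketch of locating and copying the JAR that holds a missing class. */
class JarLocatorSketch {

    /** Converts "a.b.C" to the JAR entry name "a/b/C.class". */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Recursively searches searchRoot for the first JAR that contains className. */
    static Optional<Path> findJar(Path searchRoot, String className) {
        String entry = entryName(className);
        try (Stream<Path> paths = Files.walk(searchRoot)) {
            return paths
                .filter(p -> p.toString().endsWith(".jar"))
                .filter(p -> containsEntry(p, entry))
                .findFirst();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean containsEntry(Path jar, String entry) {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entry) != null;
        } catch (IOException e) {
            return false; // unreadable or corrupt archive: skip it
        }
    }

    /** Copies the JAR into copyDir (for example, the server's lib directory). */
    static Path copyInto(Path jar, Path copyDir) {
        try {
            Files.createDirectories(copyDir);
            return Files.copy(jar, copyDir.resolve(jar.getFileName()),
                              StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 3) {
            System.out.println("usage: JarLocatorSketch <searchDir> <copyDir> <className>");
            return;
        }
        Optional<Path> jar = findJar(Paths.get(args[0]), args[2]);
        if (jar.isPresent()) {
            System.out.println("copied to " + copyInto(jar.get(), Paths.get(args[1])));
        } else {
            System.out.println("no JAR under " + args[0] + " contains " + args[2]);
        }
    }
}
```

Checking each archive for the exact entry name, rather than matching on the JAR file name, avoids copying a JAR that merely looks related but does not contain the class.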
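The approach used in the demo is not a complete solution to the ClassNotFoundException problem, because the first transaction still fails before the healing takes effect; only the later transactions succeed. Further research could be aimed at a fully problem-free solution in which even the first transaction succeeds.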
Figure 3 explains the complete message flow:
Figure 3. Message flow
The numbers in the following list correspond to the numbers in Figure 3.
1. The LogAnalyzer keeps monitoring the LogFile for any exceptions.
2. Once an exception occurs, the LogAnalyzer delegates the exception handling to the HealingManager.
3. The HealingManager sends a request to the PolicyManager to determine which Healer the exception has to be routed to.
4. The PolicyManager in turn reads the predefined policies for the exception from the policy file and determines the policy.
5. The PolicyManager forwards the policy result to the HealingManager.
6. From the policy result, the HealingManager determines which Healer the problem has to be delegated to and delegates it.
7. The Healer performs the corrective action. It talks to the PolicyManager and obtains the policies, if any, that it needs to complete the corrective action.
This paper discussed how current Java servers handle errors in ways that require some type of human intervention, and described how autonomic computing methodology can be used to make Java servers self-managing. Finally, it presented a demo that tracks ClassNotFoundException in the Tomcat server and takes corrective action without any human intervention.
- See the working demo of this paper on alphaWorks.
- Read AUTONOMIC COMPUTING: IBM's Perspective on the State of Information Technology for more background on autonomic computing in IBM.