Autonomic computing: a quick recap
In a nutshell, administrators spend too much time doing repetitive tasks that the system should be able to do for itself. For example, when Michelle in accounting installs a new bookkeeping client, she shouldn't have to learn about operating systems, runtime environments, and how to check how much drive space and RAM she has before she can get it installed. The installer should be able to check for her and, preferably, resolve any problems before continuing.
Additionally, if a database's log file is about to fill up its drive, the system should be able to provision a new drive without the administrator having to watch for the problem and fix it manually. When an international consortium of hackers targets your server for attack, the server should be able to defend itself without completely shutting down.
The autonomic computing field is still evolving and developing, of course, but in many of these cases, the tools are out there, ready for you to use to make your applications more autonomic. This article discusses what you can use today to make your applications and systems more self-sufficient.
Whenever a new technology comes on the scene, there is a period of suffering during which a lot of people are thinking about it or writing about it, but nobody quite knows how to actually do it yet. And the more exciting the technology, the more people are talking about it. The autonomic computing environment is pretty exciting, with its promise of systems that can take much of the routine daily load off administrators; therefore, there's a lot of talk. It's also a little intimidating if all you know is the ultimate goal.
The good news is that it doesn't have to be intimidating, either because it sounds too complicated or because it sounds too powerful. Fortunately, it's not about building a system that's going to become sentient and take over the world. Well, not in the near future, anyway.
No, instead I'm talking about taking the first baby steps into a world in which systems perform their functions -- and only their functions -- in concert with business goals. That's a crucial aspect of the autonomic computing environment, so let me restate it: we're not talking about changing your business rules -- just the way your business implements those rules.
The tools within the Autonomic Computing Toolkit (see Resources) are the first steps toward bringing to you an architecture that enables system components to not only think, but also to converse with each other in order to think better.
An autonomic computing architecture consists of several core capabilities:
- Solution installation and deployment technologies
- Integrated Solutions Console
- Problem determination
- Autonomic management
- Provisioning and orchestration
- Complex analysis
- Policy-based management
- Heterogeneous workload management
I'll discuss each of these capabilities, and show you how you can start working towards them.
Solution installation and deployment technologies
When was the last time you installed a major software package? How long did it take you to figure out whether you'd met the hardware requirements? How about prerequisite software? Did you have to download additional packages and install them? Did your installation break software that had already been installed? How long did it take you to clean up the mess?
Wouldn't it be nice if you could just start the installation and trust that the installer would figure out where to put the software, what prerequisites to install beforehand, and how not to break any existing installations?
That's the general idea behind the autonomic computing initiative's solution installation and deployment concept.
In general, it works like this. You decide to install a software package. The software consists of the actual code, or artifact, along with a descriptor file that explains what's in it and what prerequisites need to be fulfilled before it can be installed. This descriptor/artifact package is called an Installable Unit, and these units can be grouped into solution modules, as shown in Figure 1.
Figure 1. Solution modules
In this case, the overall solution module includes a second solution module and an independent Installable Unit.
Before installing the software, the installer reads the descriptor and checks a database of previously installed software and hardware to determine whether all of the prerequisites have been met. If they have, the software gets installed, and its information is added to the database. If not, depending on the level of autonomic sophistication, the installer will either warn the administrator or correct the problem by downloading and installing the required software, provisioning additional hardware, choosing an alternate server, and so on.
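The prerequisite check described above can be sketched in a few lines of Python. This is a minimal illustration, not the toolkit's API: the `InstallableUnit` class, its fields, and the `install` function are hypothetical names, and the "database" of installed software is just a set.

```python
from dataclasses import dataclass, field


@dataclass
class InstallableUnit:
    """Hypothetical descriptor: a name plus the prerequisites it declares."""
    name: str
    requires_software: set = field(default_factory=set)
    min_disk_mb: int = 0


def missing_prerequisites(unit, installed, free_disk_mb):
    """Return readable problems; an empty list means it is safe to install."""
    problems = [f"missing software: {dep}"
                for dep in sorted(unit.requires_software - installed)]
    if free_disk_mb < unit.min_disk_mb:
        problems.append(
            f"need {unit.min_disk_mb} MB of disk, only {free_disk_mb} MB free")
    return problems


def install(unit, installed, free_disk_mb):
    """Install only when every prerequisite is met, then record the unit."""
    problems = missing_prerequisites(unit, installed, free_disk_mb)
    if problems:
        return problems        # a more autonomic installer would try to fix these
    installed.add(unit.name)   # update the registry of installed software
    return []


registry = {"jre"}
client = InstallableUnit("bookkeeping-client", {"jre", "db-driver"}, min_disk_mb=200)
first_attempt = install(client, registry, free_disk_mb=500)
registry.add("db-driver")      # the missing prerequisite gets installed
second_attempt = install(client, registry, free_disk_mb=500)
```

In this sketch the first attempt only reports the missing prerequisite. A more sophisticated installer would resolve it itself before retrying.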
This can be accomplished through the solution installation and deployment technology tools demonstrated in the Autonomic Computing Toolkit. IBM has partnered with two of the leading providers of software installation products, InstallShield and ZeroG, to integrate these autonomic technologies into their products. The Autonomic Computing Toolkit also includes practical scenarios that demonstrate how products from InstallShield and ZeroG use these tools to automate the software installation process.
Integrated Solutions Console
After the software is installed, it must be managed. In an ideal autonomic computing system, all applications will be managed from a single vantage point, giving you two advantages. First, administrators won't have to learn a variety of tools, and second, because all of the information is within a single environment, it can ultimately be managed by a single system. Later, I'll show how this provides you with the ability to perform policy-based management.
The Autonomic Computing Toolkit already has such an environment. The Integrated Solutions Console is a browser-based environment, enabling administrators to handle all of their administration from one place. Integrated Solutions Console development is based on the Portal Toolkit for WebSphere® Studio, so administration functions are handled through portlets, or components, within a single system. When an administrator adds new software, its administration functions -- and help files -- are added to the common administrative system.
Problem determination
Now, the whole idea of an autonomic computing architecture is to create a system that's self-configuring, self-healing, self-optimizing, and self-protecting. In order to do that, the system needs to be able to recognize problems, determine their cause, and take the appropriate action to correct the problem. The most logical way to do that would be through the use of logs.
If you're an application developer, you know what logs are typically used for: tracking back to a problem if the user finds one. Most of the time, they're not meant for general consumption. In fact, in many cases, application logs are meant for just one person -- the application developer. The application developer knows that if the system is shutting down unexpectedly, he or she is looking for an event that says "unexpected termination." Or is it "unexpected quit"? Or "Help! The monkey's running loose with a pointed stick!"
The point is that in order for an automated system to be able to use log files in determining a problem, logs must have a common format.
That format is the Common Base Events format, an XML-based vocabulary. Common Base Event V1.0.1 defines eleven situation categories -- StartSituation, StopSituation, ConnectSituation, ConfigureSituation, RequestSituation, FeatureSituation, DependencySituation, CreateSituation, DestroySituation, ReportSituation, AvailableSituation -- and provides an OtherSituation category to support product-specific requirements. If an application outputs events in this format, an autonomic computing system can use that information to determine when and if there's a problem and what to do about it by correlating events into situations. For example, one possible (and highly simplified) situation could be:
Listing 1. Simplified example
Application server can't connect to the database + Application server can ping database server machine = Database is down
The system can then consult a symptom database to determine that if the database is down, it should attempt to restart it and notify an administrator that a problem has occurred. (The symptom database can also play a part in correlating events to situations.)
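To make the correlation step concrete, here is a minimal Python sketch of the same simplified situation. The event names, the `SYMPTOM_DB` dictionary, and both functions are hypothetical. A real symptom database would be far richer than a lookup table.

```python
# Hypothetical symptom database: situation name -> actions to take.
SYMPTOM_DB = {
    "database_down": ["restart_database", "notify_administrator"],
}


def correlate(events):
    """Map a set of observed event names to a situation, or None."""
    if {"db_connect_failed", "db_host_ping_ok"} <= events:
        return "database_down"
    return None


def actions_for(events):
    """Look up what to do about the situation the events add up to."""
    return SYMPTOM_DB.get(correlate(events), [])


recommended = actions_for({"db_connect_failed", "db_host_ping_ok"})
```

Neither event alone is enough to reach a conclusion. Only the combination identifies the situation, which is why events from different components need a shared format.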
But what if you already have an application and it doesn't output events in Common Base Events format? Does that mean you can't integrate it into an autonomic computing system? In this case, you have two choices: you can either change the application, or you can use the Generic Log Adapter, which converts legacy-based events into the Common Base Events format. The Adapter Configuration Editor tool, part of the Autonomic Computing Toolkit, integrates with WebSphere Studio Application Developer or the Eclipse platform and enables you to create rules the Generic Log Adapter can use to convert your logs to the format that an autonomic computing system can understand.
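The idea behind rule-based log conversion can be sketched as follows. This is not the Generic Log Adapter's rule syntax or the real Common Base Event schema: the legacy line layout, the `RULES` table, and the output dictionary keys are assumptions made for illustration. Only the situation category names come from the list above.

```python
import re

# Assumed legacy log layout: "<date> <time> <LEVEL> <message>"
LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<msg>.*)$")

# Hypothetical conversion rules: message pattern -> situation category.
RULES = [
    (re.compile(r"unexpected (termination|quit)", re.I), "StopSituation"),
    (re.compile(r"\bstarted\b", re.I), "StartSituation"),
]


def to_common_event(line):
    """Convert one legacy log line to a simplified event dict, or None."""
    m = LINE.match(line)
    if m is None:
        return None                      # line does not fit the assumed layout
    category = "OtherSituation"          # fallback for product-specific messages
    for pattern, name in RULES:
        if pattern.search(m["msg"]):
            category = name
            break
    return {
        "creationTime": f"{m['date']}T{m['time']}",
        "severity": m["level"],
        "msg": m["msg"],
        "situation": category,
    }


event = to_common_event("2004-06-01 12:00:05 ERROR Unexpected quit in worker")
```

With rules like these, "unexpected termination" and "unexpected quit" both land in the same category, and anything the rules don't recognize falls back to OtherSituation.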
The Autonomic Computing Toolkit also includes another Eclipse-based tool, the Log and Trace Analyzer. This tool provides a graphical interface that you can use to view events from the logs of different applications. If these events are in the Common Base Events format, you can even see a correlated view of the events and determine the sequence of occurrence of these events -- a welcome improvement for system developers and support staff.
Autonomic management
To discover and control events and situations, an autonomic computing system uses a control loop that constantly monitors the system, looking for events to handle. This control loop is defined by the autonomic computing reference architecture, as shown in Figure 2:
Figure 2. Control loop
The control loop is the system by which events can be detected and dealt with. The process involves four steps:
- Monitor: First, the system looks for the events, detected by the sensor from whatever source -- be it a log file or an in-memory process. The system uses the knowledge base to understand what it's looking at.
- Analyze: When an event occurs, the knowledge base contains information that helps to determine what to do about it.
- Plan: After the event is detected and analyzed, the system needs to determine what to do about it using the knowledge base. The symptom database might have information, or a central policy server might determine the action to take.
- Execute: When the plan has been formulated, it's the effector that actually carries out the action, as specified in the existing knowledge base.
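The four steps above can be sketched as one pass of a loop in Python. Everything here is hypothetical: the `KNOWLEDGE` dictionary stands in for the knowledge base, the readings stand in for sensor output, and the effector is any callable that carries out a named action.

```python
# Hypothetical knowledge base shared by all four steps.
KNOWLEDGE = {
    "thresholds": {"disk_used_pct": 90},
    "plans": {"disk_used_pct": "provision_drive"},
}


def monitor(sensor_readings, knowledge):
    """Keep only the readings the knowledge base knows how to interpret."""
    return {k: v for k, v in sensor_readings.items()
            if k in knowledge["thresholds"]}


def analyze(observations, knowledge):
    """Return the metrics whose values exceed their thresholds."""
    return [k for k, v in observations.items()
            if v > knowledge["thresholds"][k]]


def plan(symptoms, knowledge):
    """Choose an action for each symptom."""
    return [knowledge["plans"][s] for s in symptoms]


def execute(actions, effector):
    """Hand each planned action to the effector."""
    for action in actions:
        effector(action)


def control_loop_once(readings, knowledge, effector):
    observations = monitor(readings, knowledge)
    symptoms = analyze(observations, knowledge)
    execute(plan(symptoms, knowledge), effector)


performed = []
control_loop_once({"disk_used_pct": 95, "fan_rpm": 1200}, KNOWLEDGE, performed.append)
```

Because each step is a separate function that shares only the knowledge base, different products could supply different steps.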
Although the control loop is a single conceptual process, it doesn't have to be carried out by a single product. For example, IBM® Director and Toshiba ClusterPerfect products can share the control loop, with Director carrying out the monitor and analyze processes and ClusterPerfect carrying out the plan and execute steps.
The Autonomic Computing Toolkit includes the Autonomic Management Engine (AME), which can perform all of these steps for you. Of course, AME needs to be able to communicate with your product, which isn't as complicated as it sounds. AME can communicate with any product as long as there is an appropriate Resource Model in place. The Resource Model tells AME what it's looking for, be it a log entry or the status of a particular process.
You can create your own Resource Model using the Autonomic Computing Toolkit's Resource Model Builder Tool, which can be integrated into WebSphere Studio Application Developer or the Eclipse IDE.
Provisioning and orchestration
Even now, while the autonomic computing architecture is still in its early stages of development, it's obvious that one of the areas in which it would be most useful is provisioning -- whether it's a new drive because the old one died or a massive rollout of new machines to employees. A fully autonomic computing system will be able to predict when new capacity needs to be provisioned (or, alternatively, deallocated) and will do so automatically.
In reality, these are two separate problems. The first is the idea of provisioning itself. Right now, in order to add capacity using machines that are already in place, an administrator needs to determine the actions to take, and then physically perform an act such as mapping a new drive, or worse, physically manipulating hardware. Some situations might even require moving servers from one place to another. Even less physically demanding tasks such as installing operating systems and software are rife with opportunity for human error as systems become more complex.
With an automated provisioning tool (such as IBM Tivoli® Provisioning Manager), an administrator can automate this process by coalescing his or her knowledge into a well-defined, repeatable set of actions that the software can take when needed.
But what about automating that even further? What if the administrator could simply define actions to take and trust that the system would take them as needed, without human intervention? In that case, you need a product that provides orchestration. Basically, an orchestration product (such as IBM Tivoli Intelligent ThinkDynamic Orchestrator) monitors the overall system and uses autonomic computing concepts to determine when an action needs to be taken. It then takes appropriate action as defined by the administrator or per business policy.
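The difference between provisioning and orchestration can be illustrated with a small Python sketch. It does not reflect how any Tivoli product works. The workflow steps, the load thresholds, and the function names are all assumptions: `provision_server` is the repeatable workflow, and `orchestrate` decides when to run it.

```python
def provision_server(pool):
    """Repeatable workflow: the ordered steps an administrator defined once."""
    server = {"id": len(pool) + 1, "steps": []}
    for step in ("install_os", "install_software", "add_to_load_balancer"):
        server["steps"].append(step)   # a real workflow would do the work here
    pool.append(server)
    return server


def orchestrate(pool, load_per_server, high=0.8, low=0.3, min_servers=1):
    """Decide, from the observed load, whether to add or remove capacity."""
    if load_per_server > high:
        provision_server(pool)
        return "provisioned"
    if load_per_server < low and len(pool) > min_servers:
        pool.pop()                     # deallocate the most recent server
        return "deallocated"
    return "no_action"


pool = []
provision_server(pool)                 # the administrator's initial server
decisions = [orchestrate(pool, load) for load in (0.9, 0.5, 0.1, 0.1)]
```

The thresholds play the role of the administrator's policy: the administrator states when capacity should change, and the orchestration layer decides the moment.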
Complex analysis
As you might have surmised by now, some of this processing can involve some fairly complex logic. In the case of a Java technology implementation (such as that provided by the Autonomic Computing Toolkit), you might opt to use JavaBeans components that provide artificial intelligence-like capabilities.
Still in its early stages, the Autonomic Computing Toolkit will ultimately include software that provides these capabilities. To get an idea of how this might eventually work, it's helpful to look at an emerging technology tool, the Agent Building and Learning Environment (ABLE) 2.0, currently available on IBM alphaWorks (see Resources). ABLE provides AbleBeans such as Data Beans to read and write data to and from various sources, Learning Beans that implement reasoning such as decision trees and Bayesian reasoning, and Rules Beans such as forward chaining and fuzzy logic beans.
AbleBeans can be combined into AbleAgents using rules. For example, ABLE 2.0 comes with several agents, including a neural classifier that uses back propagation to classify data, and a rule agent that contains a rule set with rule blocks to define its init, process, and timer actions.
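Forward chaining, one of the rule techniques mentioned above, is simple to sketch. The following Python is a generic illustration of the technique, not ABLE's API or rule language. The rule representation (a set of premises plus a conclusion) and the fact names are assumptions.

```python
def forward_chain(facts, rules):
    """Apply (premises, conclusion) rules until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)  # the rule fires and adds a new fact
                changed = True
    return known


# Hypothetical rule set: the second rule builds on the first rule's conclusion.
CHAIN_RULES = [
    (frozenset({"db_connect_failed", "db_host_ping_ok"}), "database_down"),
    (frozenset({"database_down"}), "restart_database"),
]

derived = forward_chain({"db_connect_failed", "db_host_ping_ok"}, CHAIN_RULES)
```

The loop repeats because one rule's conclusion can satisfy another rule's premises. It stops as soon as a full pass adds nothing new.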
Policy-based management
With all of this reasoning going on, it seems like things could easily get out of hand, but just the opposite is true. All decisions are made according to set policies, and in a fully autonomic computing system, those policies are coordinated across the system. For example, you might use Tivoli Access Manager to control access to your resources. If all access decisions are made using this knowledge base, then you can set a business rule from a single location.
What's more, that business rule can be verified in one place. This ease of management is good not just for administrators, but also for stakeholders such as CEOs who need to sign off on policies for shareholders and regulators.
Heterogeneous workload management
All right, you've gotten this far. You've installed your applications into the system using Installable Units, you're managing it from the Integrated Solutions Console, and you're monitoring and resolving problems by using the Autonomic Management Engine. Is that it?
Well, not quite. The ultimate goal of autonomic computing is a system in which everything is tracked, from start to finish, and constantly optimized for better performance above and beyond resolving problems. In short, it's business workload management.
Products such as the Enterprise Workload Manager (EWLM) component of the IBM Virtualization Engine provide heterogeneous workload management capabilities. They enable you to automatically monitor and manage multi-tiered, distributed, heterogeneous or homogeneous workloads across an IT infrastructure to better achieve defined business goals for end-user services. These capabilities allow you to identify work requests based on service class definitions, track performance of those requests across server and subsystem boundaries, and manage the underlying physical and network resources to meet specified performance goals for each service class.
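The idea of service classes with performance goals can be sketched in Python. This is not EWLM or the ARM API. The service class table, the URL-prefix classification, and the mean-response-time goal are assumptions chosen to keep the example small.

```python
from statistics import mean

# Hypothetical service classes: name -> (URL prefix, response-time goal in seconds)
SERVICE_CLASSES = {
    "checkout": ("/checkout", 0.5),
    "browse": ("/catalog", 2.0),
}


def classify(path):
    """Identify a work request by matching it to a service class definition."""
    for name, (prefix, _goal) in SERVICE_CLASSES.items():
        if path.startswith(prefix):
            return name
    return "default"


def classes_missing_goal(samples):
    """samples: (path, seconds) pairs. Return classes whose mean time exceeds the goal."""
    times = {}
    for path, seconds in samples:
        times.setdefault(classify(path), []).append(seconds)
    return sorted(name for name, values in times.items()
                  if name in SERVICE_CLASSES
                  and mean(values) > SERVICE_CLASSES[name][1])


samples = [("/checkout/pay", 0.4), ("/checkout/pay", 0.9),
           ("/catalog/1", 1.0), ("/other", 9.0)]
behind = classes_missing_goal(samples)
```

A workload manager would use a result like `behind` to decide where to shift resources. The sketch stops at detection.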
To take full advantage of workload management capabilities provided by products such as EWLM, applications need to be able to provide performance information in the Application Response Measurement (ARM) standard format. The IBM® Software Development Kit (SDK) for EWLM is a tool to aid in the development and test of ARM 4.0-level instrumentation in applications and middleware, and the development and test of EWLM ARM Adapter library implementations.
The tools and technologies that enable systems to become self-configuring, self-healing, self-optimizing, and self-protecting are ready for you to use. Download the IBM Autonomic Computing Toolkit and start creating applications that will use the autonomic computing core capabilities. Some capabilities, such as solution installation, the Integrated Solutions Console, problem determination, and autonomic management, are part of the Autonomic Computing Toolkit itself. Others, such as complex analysis, policy-based management, and heterogeneous workload management, require additional software.
But, however you look at it, and however you build it, it's not hype anymore.
Resources
- Download the Autonomic Computing Toolkit, which includes the Solution Installation and Deployment Technologies, Integrated Solutions Console, Log/Trace Analyzer, Adapter Rule Builder, Autonomic Management Engine, and Resource Model Builder tools.
- Refer to EWLM documentation available in the online IBM® eServer™ Information Center, under the Virtualization Engine™ product topic. The Information Center provides information about EWLM ARM Adapter API implementation specifications, and a downloadable PDF file containing the IBM® eServer™ ARM 4.0 Application Instrumentation Guide.
- Download the Agent Building and Learning Environment 2.0 for building intelligent agents.
- Download the Emerging Technologies Toolkit for a look at upcoming tools and technologies.
- Check out more emerging tools on alphaWorks.
Nicholas Chase, a Studio B author, has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, and an Oracle instructor. More recently, he was the Chief Technology Officer of an interactive communications company in Clearwater, Florida, and is the author of five books, including XML Primer Plus (Sams). He loves to hear from readers and can be reached at nicholas@nicholaschase.com.