Previous installments of The autonomic computing edge described the adoption of autonomic computing in various areas: Japan, standards bodies, and academia. This column is the first in a series that will examine various aspects of the autonomic computing architecture, including the architectural building blocks (autonomic managers, touchpoints, knowledge, and others) and other architectural considerations. As always, The autonomic computing edge takes you to the edge of what is happening in the realm of autonomic computing with a mix of facts and opinions.
This month, I begin an autonomic computing architecture series by discussing the self-CHOP aspects of the architecture. These attributes offer a reasonable way to describe a self-managing system, but they are not mutually exclusive.
I'll describe the self-CHOP concept, explain why the CHOP traits are not entirely independent, and offer some examples that illustrate integration of the CHOP concepts.
What is "self-CHOP"?
The acronym CHOP is shorthand for configure, heal, optimize, and protect, the fundamental aspects of autonomic computing technology. Autonomic systems are designed to address one or more of these aspects. Figure 1 illustrates these concepts.
Figure 1. Self-CHOP
An architectural blueprint for autonomic computing (see Resources) explains the self-CHOP attributes:
Although [autonomic] control loops consist of the same fundamental parts, their functions can be divided into four broad embedded control loop categories. These categories are considered to be attributes of the system components and are defined as:
- Self-configuring - Can dynamically adapt to changing environments. Self-configuring components adapt dynamically to changes in the environment, using policies provided by the IT professional. Such changes could include the deployment of new components or the removal of existing ones, or dramatic changes in the system characteristics. Dynamic adaptation helps ensure continuous strength and productivity of the IT infrastructure, resulting in business growth and flexibility.
- Self-healing - Can discover, diagnose and react to disruptions. Self-healing components can detect system malfunctions and initiate policy-based corrective action without disrupting the IT environment. Corrective action could involve a product altering its own state or effecting changes in other components in the environment. The IT system as a whole becomes more resilient because day-to-day operations are less likely to fail.
- Self-optimizing - Can monitor and tune resources automatically. Self-optimizing components can tune themselves to meet end-user or business needs. The tuning actions could mean reallocating resources -- such as in response to dynamically changing workloads -- to improve overall utilization, or ensuring that particular business transactions can be completed in a timely fashion. Self-optimization helps provide a high standard of service for both the system's end users and a business's customers.
Without self-optimizing functions, there is no easy way to divert excess server capacity to lower priority work when an application does not fully use its assigned computing resources. In such cases, customers must buy and maintain a separate infrastructure for each application to meet that application's most demanding computing needs.
- Self-protecting - Can anticipate, detect, identify and protect against threats from anywhere. Self-protecting components can detect hostile behaviors as they occur and take corrective actions to make themselves less vulnerable. The hostile behaviors can include unauthorized access and use, virus infection and proliferation, and denial-of-service attacks. Self-protecting capabilities allow businesses to consistently enforce security and privacy policies.
When system components have these attributes, it is possible to automate the tasks that IT professionals must perform today to configure, heal, optimize and protect the IT infrastructure.
It is useful to consider a self-managing autonomic system in terms of these self-CHOP attributes, and autonomic managers can be constructed to perform the various self-CHOP functions. At the same time, though, the self-CHOP functions are not always as distinct as the boundaries in Figure 1 imply, and they may overlap, as described next.
As described earlier, a self-healing autonomic manager can detect disruptions in a system and perform corrective actions to alleviate problems. One form that those corrective actions might take is a set of operations that reconfigure the resource that the autonomic manager is managing. For example, the autonomic manager might alter the resource's maximum stack size to correct a problem that is caused by erroneous memory utilization. In this respect, the self-healing autonomic manager can be seen as performing a self-configuration function: it reconfigures the resource to accomplish the desired corrective action.
Similarly, a self-optimizing autonomic manager, as described earlier, might reallocate resources to adapt to changing workloads to improve resource utilization. These self-optimization changes, too, might be accomplished by reconfiguring the resources that the autonomic manager is managing. For example, the self-optimizing autonomic manager might reconfigure server clusters by adding servers to or removing servers from the cluster.
As you can see, self-healing and self-optimizing management could involve self-configuration functions (so, too, could self-protection). Indeed, actions associated with healing, optimizing, or protecting IT resources are often carried out through configuration operations. Self-configuration itself is a broader topic that includes dynamic adaptation to changing environments, perhaps involving adding or removing system components. Even so, it is also fundamental to realizing many of the other self-CHOP functions.
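The overlap can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an interface from the architectural blueprint: the `ManagedResource` class, its `configure` method, and the setting names `max_stack_kb` and `cluster_size` are all hypothetical. The sketch shows a self-healing action and a self-optimizing action that both reach the resource through the same configuration operation.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedResource:
    """Hypothetical managed resource exposing one configuration operation."""
    settings: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def configure(self, key, value, requested_by):
        # Every manager, whatever its discipline, changes the resource
        # through this same configuration operation.
        self.settings[key] = value
        self.history.append((requested_by, key, value))


def heal_memory_problem(resource, new_stack_kb):
    """Self-healing action: correct a memory problem by changing the stack size."""
    resource.configure("max_stack_kb", new_stack_kb, requested_by="self-healing")


def optimize_cluster(resource, delta):
    """Self-optimizing action: add (delta > 0) or remove (delta < 0) servers."""
    current = resource.settings.get("cluster_size", 1)
    resource.configure("cluster_size", max(1, current + delta),
                       requested_by="self-optimizing")


resource = ManagedResource(settings={"max_stack_kb": 512, "cluster_size": 3})
heal_memory_problem(resource, 1024)
optimize_cluster(resource, +2)
print(resource.settings)
print(resource.history)
```

The history recorded on the resource makes the point of this section visible: two different disciplines requested the changes, but both changes were configuration operations.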
The self-CHOP functions can be interrelated in other ways, too. Consider these examples (and you probably can think of others):
- A self-protecting autonomic manager detects a security exposure and takes actions to eliminate that exposure. Eliminating the exposure corrects a problem in the system, so the action might be viewed as self-healing in addition to self-protecting.
- Overall business resiliency could be improved with a combination of self-healing and self-optimizing management. Blending the operations that correct system disruptions with those that balance workload can result in a more highly available system.
Let's explore some scenarios that further illustrate the integration of the CHOP self-management disciplines.
Integrated self-CHOP scenarios
As described in An architectural blueprint for autonomic computing and illustrated in Figure 2, an autonomic manager incorporates four functions: monitor, analyze, plan, and execute.
Figure 2. Autonomic manager details
Figure 2 illustrates that an autonomic manager might include only some of the four control loop functions. Consider two such partial autonomic managers: a self-healing partial autonomic manager that performs monitor and analyze functions, and a self-configuring partial autonomic manager that performs plan and execute functions, as depicted in Figure 3.
Figure 3. Integrating self-healing and self-configuring autonomic management functions
The first autonomic manager could monitor data from managed resources and correlate that data to produce a symptom; the symptom in turn is analyzed, and the autonomic manager determines that some change to the managed resource is required. This desired change is captured in the form of Change Request knowledge. The change request is passed to the self-configuring partial autonomic manager that performs the plan function to produce a change plan that is then carried out by the execute function. This scenario details the integration of self-healing and self-configuring autonomic management functions that was introduced earlier.
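A minimal Python sketch of this hand-off follows. It assumes a great deal that the blueprint does not specify: the class names, the `Symptom` and `ChangeRequest` fields, the 90 percent memory threshold, and the three-step change plan are all hypothetical, chosen only to show how a change request can connect a monitor/analyze manager to a plan/execute manager.

```python
from dataclasses import dataclass


@dataclass
class Symptom:
    resource_id: str
    description: str


@dataclass
class ChangeRequest:
    """Stand-in for the Change Request knowledge passed between managers."""
    resource_id: str
    desired_change: str


class SelfHealingPartialManager:
    """Implements only the monitor and analyze functions."""

    def monitor(self, readings):
        # Correlate raw data into symptoms; here, any reading above a
        # hypothetical threshold of 90 percent memory use is a symptom.
        return [Symptom(rid, "memory use at %d%%" % pct)
                for rid, pct in readings.items() if pct > 90]

    def analyze(self, symptoms):
        # Decide that each symptom requires a change to the resource.
        return [ChangeRequest(s.resource_id, "increase memory limit")
                for s in symptoms]


class SelfConfiguringPartialManager:
    """Implements only the plan and execute functions."""

    def plan(self, request):
        # Turn the change request into an ordered change plan.
        return ["quiesce " + request.resource_id,
                request.desired_change + " on " + request.resource_id,
                "resume " + request.resource_id]

    def execute(self, change_plan):
        # A real manager would act on the managed resource; this sketch
        # only records each step as carried out.
        return ["done: " + step for step in change_plan]


healer = SelfHealingPartialManager()
configurer = SelfConfiguringPartialManager()

readings = {"db-server": 97, "web-server": 40}
requests = healer.analyze(healer.monitor(readings))
results = [configurer.execute(configurer.plan(r)) for r in requests]
print(results)
```

Because the only thing the two managers share is the change request, the first manager could be swapped for one from a different discipline without changing the second.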
A similar scenario can be devised to demonstrate the integration of self-protection and self-configuration. Suppose that the first autonomic manager in Figure 3 is a self-protecting autonomic manager, rather than a self-healing one. Such an autonomic manager might perform monitor and analyze functions to determine that a security exposure exists and that a patch is available to eliminate the exposure. The self-protecting autonomic manager could then generate a change request that specifies the installation of the security patch. That change request is passed to the self-configuring autonomic manager (exemplified by IBM's Solution Installation for Autonomic Computing technology; see Resources), which, in turn, can generate and carry out the change plan to install the required security patch.
Finally, consider a scenario that illustrates that the particular self-CHOP aspect of a self-managing autonomic system might be "in the eye of the beholder." IBM's Common Base Event format allows situations encountered by managed resources to be expressed in a common format. (The Common Base Event format is the basis for the Web Services Distributed Management (WSDM) Event Format that is incorporated in the OASIS WSDM 1.0 specifications; see Resources.) This situation information can be sent as events to autonomic managers, where the monitor function can receive and correlate the events.
One of the situations that can be represented in a Common Base Event is a connect situation, including its success disposition (connection was successful or unsuccessful). Suppose that a managed resource generated a rapid series of CONNECT_UNSUCCESSFUL events. To a self-healing autonomic manager, this might be symptomatic of a problem in connecting to a resource (say, a database), perhaps indicating that the resource is unavailable and needs to be restarted. To a self-protecting autonomic manager, though, such a series of events might be symptomatic of a barrage of unauthorized access attempts (perhaps the onset of a denial-of-service attack). Of course, the actual circumstances could be either or neither of these cases. Other available information could be correlated with the event situation information (perhaps by an orchestrating autonomic manager that coordinates the self-healing and self-protecting autonomic managers) to determine what is actually occurring and what actions might need to be taken. The point, though, is that the same monitored data can be used for multiple self-CHOP aspects of the self-managing autonomic system.
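The following Python sketch illustrates the idea under loud assumptions. The `Event` class is a simplified stand-in and does not reproduce the Common Base Event schema; the field names, the threshold of five failures, and the resource names are hypothetical. Only the CONNECT_UNSUCCESSFUL situation name comes from the scenario above. Both managers share one monitor step and then read the same failures differently.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Simplified stand-in for an event; not the Common Base Event schema."""
    timestamp: float   # seconds
    source: str        # component that attempted the connection
    target: str        # resource it tried to reach
    situation: str     # for example, "CONNECT_UNSUCCESSFUL"


def failed_connects(events, window_start, window_end):
    """Monitor step shared by both managers: select failures in a time window."""
    return [e for e in events
            if e.situation == "CONNECT_UNSUCCESSFUL"
            and window_start <= e.timestamp <= window_end]


def healing_view(failures, threshold=5):
    """Self-healing reading: many failures suggest the target is unavailable."""
    if len(failures) >= threshold:
        targets = sorted({e.target for e in failures})
        return "possible outage of " + ", ".join(targets)
    return None


def protecting_view(failures, threshold=5):
    """Self-protecting reading: many failures from one source suggest an attack."""
    by_source = {}
    for e in failures:
        by_source[e.source] = by_source.get(e.source, 0) + 1
    suspects = sorted(s for s, n in by_source.items() if n >= threshold)
    if suspects:
        return "possible unauthorized access attempts from " + ", ".join(suspects)
    return None


# Eight failed connections within one second, all from the same source.
events = [Event(t * 0.1, "client-42", "orders-db", "CONNECT_UNSUCCESSFUL")
          for t in range(8)]
failures = failed_connects(events, 0.0, 1.0)
print(healing_view(failures))
print(protecting_view(failures))
```

Neither reading is authoritative on its own. In this sketch both functions report a finding for the same data, which is exactly the ambiguity that an orchestrating autonomic manager would need to resolve with additional information.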
Self-CHOP describes important attributes of a self-managing autonomic system and is a useful way to characterize the aspects of autonomic computing, but the four disciplines should not be considered in isolation. Instead, an integrated approach to self-CHOP, such as the one this article describes, offers a more holistic view of self-managing autonomic systems.
Future articles in this series will explore other facets of the autonomic computing architecture that build on the concepts of the self-CHOP foundation.
Thanks to my IBM colleague Jim Crosskey, who manages the Autonomic Computing Product Planning team. Special thanks to my IBM colleagues Christine Draper, Thomas Studwell and Jim Whitmore for their ideas and assistance in developing this article and, moreover, for their work in developing the architecture that is worth writing about -- for they are the lead architects for self-configuring, self-optimizing, and self-protecting, respectively. My role as self-healing lead architect lets us work together on the CHOP concepts.
Resources
- The paper An architectural blueprint for autonomic computing (IBM Corporation, October 2004) provides an overview of the autonomic computing architecture, including more information about the self-CHOP concepts.
- Read more about autonomic computing in the Volume 46, Number 3, 2007 issue of the IBM Systems Journal.