OASIS has just approved a new standard from the Web Services Distributed Management Technical Committee (WSDM TC) as a first step toward solving the management integration problem. The standard comprises two sets of specifications: Web Services Distributed Management: Management Using Web Services (MUWS) and Web Services Distributed Management: Management of Web Services (MOWS) (see Resources). For a high-level article about WSDM, see the developerWorks article "A little wisdom about WSDM."
The standardization of WSDM 1.0 is an important milestone for autonomic computing technology. To understand why, you need to look at the fundamental goals of autonomic computing. From the beginning, IBM's Autonomic Computing initiative recognized that autonomic computing could not rely on being a proprietary offering. The value of autonomic computing will be fully realized when autonomic managers are able to bring self-managing capabilities to much of the equipment and software in an enterprise IT infrastructure. The paper "An Architectural Blueprint for Autonomic Computing" states, "Autonomic computing systems require autonomic managers to be deployed across the IT infrastructure, managing various resources (including other autonomic managers) from a diverse range of suppliers. Therefore, these systems must be based on open industry standards."
Open standards that address the manageability capabilities of today's IT resources are essential to successfully deploying autonomic computing technologies. The WSDM standard is important in several respects:
- First, every leading system management software supplier participated in this committee, along with many vendors of middleware, operating systems, and hardware, assuring broad industry support. This is critical for any standard to be useful.
- Second, this standard provides a necessary management interface to a technology (Web services) that is vital to today's business. Businesses are realizing the benefit of Web services in business-critical applications, and this standard provides the way to use system management tools with those critical applications.
- Finally, it allows system management platforms to take advantage of the tremendous power offered by the Service-Oriented Architecture (SOA) of Web services. The article "Management Using Web Services -- A proposed architecture and roadmap" explains how SOA technologies will significantly improve the ease with which management technologies, especially from multiple suppliers, can be integrated, just as Web services has proven for business applications. This is a very important benefit for autonomic computing architecture. With management interfaces to IT resources being exposed through Web services, autonomic managers can use standardized descriptors to understand, monitor, and interact with the management functions of those resources, without having to be custom designed to handle specific ranges of resources.
The value of WSDM is further enhanced through contributions, by IBM and others, of new technologies that are cornerstones of autonomic computing. Key technologies such as IBM's Common Base Event, which was used as the basis for the WSDM Event Format, can improve the ability to correlate events from multiple resources, thus permitting improved turnaround and accuracy of problem diagnosis in complex computing environments.
While the WSDM standard provides an initial foundation and substantial progress toward fully autonomic computing environments, additional specifications in support of autonomic environments will emerge over time. Although Web services is not the only technology on which autonomic computing platforms can be built, the advantages of using Web services certainly improve the ease with which autonomic computing can be implemented and integrated. From the Autonomic Computing Blueprint:
This architecture does not prescribe a particular management protocol or instrumentation technology because the architecture needs to work with the various computing technologies and standards that exist in the industry today -- SNMP, Java Management Extensions (JMX), Distributed Management Task Force, Inc. (DMTF) -- as well as future technologies.
Given the diversity of these management technologies that already exist in the IT industry, this architecture endorses Web services techniques for sensors and effectors. These techniques encourage implementers to leverage existing approaches and support multiple binding and marshalling techniques.
The autonomic computing architecture, as described in the Autonomic Computing Blueprint, consists of one or more control loops that dynamically manage various aspects of a computing infrastructure. The acronym MAPE is sometimes used to describe the autonomic control loop because of its four basic elements: Monitor, Analyze, Plan, and Execute. The autonomic control loop, with associated knowledge about the system, its policies, and internal algorithms for managing resources, is defined as an autonomic manager.
The autonomic manager performs a set of self-managing tasks based on goals established by the business. The tasks may cover a very broad or a very narrow set of management capabilities, but all are based on the requirements of the resources being managed by the autonomic manager. For example, there might be autonomic managers that are dedicated to performing self-healing functions. These autonomic managers must monitor the health of the resources they manage, analyze existing conditions, and apply changes to those resources if conditions warrant.
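The self-healing example can be sketched as a single pass through the control loop. This is a minimal illustration only: the class names, the status values, and the `restart` action are assumptions made for the sketch and are not defined by the Blueprint or by WSDM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ManagedResource:
    """Hypothetical stand-in for a resource reached through a touchpoint."""
    name: str
    status: str = "available"

    def restart(self) -> None:
        self.status = "available"


@dataclass
class SelfHealingManager:
    """Toy autonomic manager: one pass of Monitor, Analyze, Plan, Execute."""
    resources: List[ManagedResource]
    # Knowledge: which observed status calls for which corrective action.
    policy: Dict[str, str] = field(default_factory=lambda: {"failed": "restart"})

    def monitor(self) -> Dict[str, str]:
        # Sense the current status of every managed resource.
        return {r.name: r.status for r in self.resources}

    def analyze(self, observed: Dict[str, str]) -> List[str]:
        # Keep only the resources whose status the policy knows how to treat.
        return [name for name, status in observed.items() if status in self.policy]

    def plan(self, symptoms: List[str], observed: Dict[str, str]) -> List[Tuple[str, str]]:
        # Pair each unhealthy resource with the action named by the policy.
        return [(name, self.policy[observed[name]]) for name in symptoms]

    def execute(self, plan: List[Tuple[str, str]]) -> None:
        by_name = {r.name: r for r in self.resources}
        for name, action in plan:
            getattr(by_name[name], action)()

    def run_once(self) -> List[Tuple[str, str]]:
        observed = self.monitor()
        plan = self.plan(self.analyze(observed), observed)
        self.execute(plan)
        return plan


db = ManagedResource("db", status="failed")
web = ManagedResource("web")
manager = SelfHealingManager([db, web])
actions = manager.run_once()
```

After one pass, `actions` lists the single corrective step taken for `db`, and a second pass finds nothing left to heal.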
Figure 1 shows the idealized view of the autonomic manager and its manageability interface to the resources.
Figure 1. Autonomic manager and manageability interface
The implementation of the manageability interfaces on resources managed by the autonomic manager is called a touchpoint. The architecture of a touchpoint is defined by the Autonomic Computing Architecture team and may be implemented in a variety of ways that are appropriate for the managed resource.
The Autonomic Computing Blueprint provides this description of a touchpoint:
A touchpoint is an autonomic computing system building block that implements sensor and effector behavior for one or more of a managed resource's manageability mechanisms. It also provides a standard manageability interface. Deployed manageable resources are accessed and controlled through these manageability interfaces. Manageability interfaces employ mechanisms such as log files, events, commands, application programming interfaces (APIs) and configuration files. These mechanisms provide various ways to gather details about and change the behavior of the managed resources. In the context of this blueprint, the mechanisms used to gather details are aggregated into a sensor for the managed resource and the mechanisms used to change the behavior of the managed resources are aggregated into an effector for the resource.
Owing to the benefits of Web services, the preferred implementation of a touchpoint is to expose it as one or more Web service interfaces. In this case, the touchpoint should be WSDM compliant as a WSDM Manageable Resource.
As you'll see in more detail shortly, WSDM does not normatively define all the capabilities that are required elements of an autonomic computing touchpoint. For example, the IBM Common Base Event requires elements that do not have a corresponding element in WSDM's Event Format. Fortunately, the extensible nature of the WSDM interface permits touchpoint implementers to provide these capabilities, through WSDM-compliant extensions, but the implementation of the extensions may not be standardized. Because the use of the extensions is not normatively defined, interoperability can be achieved only through agreement among stakeholders in the relevant resource/manager relationships. IBM is working closely with other industry leaders to agree on appropriate usage of these WSDM extensions and possible standardization of those extensions so that customers can reap maximum benefit from this important standard.
As these topics develop in the industry, WSDM and the manageability interfaces to enable autonomic computing will evolve. It is understandable that WSDM 1.0 does not define all you need to implement a touchpoint; however, it certainly provides a solid base and extension points on which an autonomic computing-compliant touchpoint can be built. The remainder of this document describes how this can be done today.
One important point regarding touchpoints is that, in today's complex world, resources are rarely simple, single-function entities. A database server, for example, represents an aggregate of multiple related resources such as multiple databases, database table spaces, database tables, and so on. A Web application server would, in turn, "host" a number of Web applications being managed by that server. This view, while very real, is beyond the scope of this article. Therefore, we focus on the simple case for the sake of clarity. As a result, there are additional manageability capabilities that are not discussed here, but are essential to support the model of hosted resources.
The manageability interface is divided into sensors and effectors.
Sensors provide the means for managers to access the state of the manageable resource, either on demand or through notifications, when state changes occur. An example of a sensor is an endpoint that exposes information about the current operational status of a manageable resource and transitions in that operational state.
Effectors provide the means for a manager to affect the state and behavior of the manageable resource. An example of an effector is an endpoint that provides an operation to stop the manageable resource; that is, change its operational status to "stopped".
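The split between sensors and effectors can be sketched with a small in-memory class. This is an illustration under stated assumptions, not a WSDM implementation: the `Touchpoint` class, its method names, and the property name `OperationalStatus` are hypothetical, and no Web service is involved.

```python
from typing import Callable, Dict, List

# A listener receives (property name, old value, new value).
Listener = Callable[[str, str, str], None]


class Touchpoint:
    """Hypothetical touchpoint wrapping a single manageable resource."""

    def __init__(self, resource_id: str) -> None:
        self._properties: Dict[str, str] = {
            "ResourceId": resource_id,
            "OperationalStatus": "available",
        }
        self._listeners: List[Listener] = []

    # Sensor, on demand: the manager asks for the current state.
    def get_property(self, name: str) -> str:
        return self._properties[name]

    # Sensor, by notification: the manager is told when state changes.
    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    # Effector: the manager changes the behavior of the resource.
    def stop(self) -> None:
        self._set("OperationalStatus", "stopped")

    def _set(self, name: str, value: str) -> None:
        old = self._properties[name]
        if old == value:
            return  # no transition, so no notification
        self._properties[name] = value
        for listener in self._listeners:
            listener(name, old, value)


seen = []
tp = Touchpoint("urn:example:resource:1")
tp.subscribe(lambda name, old, new: seen.append((name, old, new)))
tp.stop()
```

Calling the effector `stop()` changes the state that the sensor reports on demand and also triggers one notification describing the transition.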
The sensor and effector interfaces support four unique styles of interaction between the touchpoint and the autonomic manager: Request-response, Send-notification, Perform-operation, and Solicit-response.
The request-response and solicit-response interaction styles are "request-response" flows that return data. For the send-notification and perform-operation interaction styles, the data flow is primarily "one-way." They do not return any data, but can return details about the success or failure of the flow.
WSDM currently does not define a solicit-response interaction style, so this interaction style, if required, must be supported as an extension or viewed as a Web services request-response flow to another service. Touchpoints are not required to support all interaction styles but must support request-response (so that their identity can be retrieved).
Figure 2. Sensors and effectors
Autonomic computing architecture defines a set of manageability capabilities that touchpoints support. From the article "A little wisdom about WSDM":
A manageability capability is a composable set of properties, operations, events, metadata and other semantics that supports a particular management task. They capture concepts common in most resource information models. Manageability capabilities identify a "contract" that a manageable resource asserts it can offer to clients. A manageability capability can be thought of as an abstract interface, similar to "marker" interfaces in Java or other programming languages. WSDM MUWS defines a set of foundation manageability capabilities for basic management concepts. New or domain specific capabilities may extend existing foundational capabilities as appropriate.
Some manageability capabilities are required, some are optional, and touchpoints are likely to define their own unique capabilities essential for the management of their resources. In all cases, however, the manageability capabilities are to be expressed in a form compliant with the WSDM specification.
In general, sensors are realized as having WSDM metrics, identity, and relationship capabilities. These capabilities provide both properties, accessible through the WS-Resource Properties interface, and notifications containing WSDM Event Format-compliant messages. Effectors are realized as WSDM Configuration capabilities, which define properties that can be set through the WS-Resource Properties interfaces, and other operations that change behavior.
Table 1 describes the manageability capabilities that an autonomic computing touchpoint for a simple manageable resource (MR) implements. Note that in the table there are three namespaces used. The namespace prefixes used in the table indicate the source of the specification for the interface as follows:
- wsdm: defined in WSDM V1.0 standard
- wsrp: defined in WS-Resource Framework (WS-RF), WS-Resource Properties specification.
- actp: defined in the autonomic computing architecture as extensions to published or draft standards. The interfaces indicated in this article may change before they are officially published.
Capabilities are listed as Required (must be implemented by the touchpoint), Optional (may be implemented by the touchpoint), or Strongly recommended (should be implemented by the touchpoint). The Req. column indicates these requirements as "R," "O," or "S," respectively.
Table 1. Touchpoint manageability capabilities
|Interface (PortType)||Req.||Description and notes|
|Properties: ResourceId||R||This is a fundamental interface that all MRs must implement. Note that the autonomic computing architecture applies some additional constraints related to the uniqueness and nature of this property.|
|Properties: Version, Caption, Description||R||Description of the resource. Note that the autonomic computing architecture imposes an additional constraint on the mutability of Version: WSDM permits Version to be mutable, while the autonomic computing architecture does not. The Version property is also Required in the autonomic computing architecture.|
|Properties: ManageabilityCapability||R||Defines properties that provide information about the characteristics of a manageability endpoint implementation rather than the resource. ManageabilityCapability defines all the manageability capabilities offered for the touchpoint. Note that this is optional in WSDM but required by the autonomic computing architecture.|
|Properties: CurrentTime||O||This portType must be implemented if the manageable resource exposes any Metrics.|
|Properties: Resource specific||O||This portType must be implemented if the manageable resource offers any configuration properties. A configuration property is any resource property exposing a value that, when changed, changes some operational behavior of the resource. The value of a configuration property may be changed directly by a set operation, or may be changed as a side effect of some other operation.|
|Properties: Resource specific representation of state||R||This portType must be implemented. Note that this is optional in WSDM but required by the autonomic computing architecture.|
|Properties: Relationship; Operations: QueryRelationshipsByType||O||This interface is implemented by manageable resources that offer information about the relationships they have with other manageable resources.|
|Operations: GetResourceProperty||R||This is a required interface for WS-ResourceProperties. All MRs have resource properties, so all touchpoints must implement this.|
|Operations: SetResourceProperty||S||If an MR has any writeable properties, then this interface must be implemented.|
|Operations: QueryResourceProperties||S||MRs should implement this except in special circumstances, such as very small footprint requirements. In particular, it is important to implement this if properties with potentially high cardinality are exposed by the manageable resource, for example the Relationships property in wsdm:Relationships.|
|Operations: GetMultipleResourceProperties||S||MRs should implement this except in special circumstances, such as very small footprint requirements.|
|Autonomic Computing Unique|
|Properties: ResourceType, Name||R||This is a fundamental interface that must be implemented by all manageable resources. See the detailed description below.|
|Properties: ActiveMetric; Operations: ControlMetricCollection||O||Some metrics can be costly for a resource to collect, so some manageable resources allow the collection of some metrics to be started and stopped as required. This capability defines metadata that is used to identify metric properties that can be controlled. If any metrics are identified as controllable, then the manageable resource must implement this interface, which provides the mechanism to control the metrics.|
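The required and conditional rows of Table 1 can be read as a checklist. The sketch below assumes a touchpoint is described simply by the names of the properties and operations it exposes; the `TouchpointDescription` type and the `conformance_problems` function are hypothetical, and the resource-specific state and configuration rows are left out because the table does not name their properties.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Property and operation names taken from the rows marked "R" in Table 1.
REQUIRED_PROPERTIES = {
    "ResourceId", "Version", "ManageabilityCapability", "ResourceType", "Name",
}
REQUIRED_OPERATIONS = {"GetResourceProperty"}


@dataclass
class TouchpointDescription:
    """Hypothetical summary of what a touchpoint exposes."""
    properties: Set[str]
    operations: Set[str]
    metric_properties: Set[str] = field(default_factory=set)
    writable_properties: Set[str] = field(default_factory=set)


def conformance_problems(tp: TouchpointDescription) -> List[str]:
    problems: List[str] = []
    for name in sorted(REQUIRED_PROPERTIES - tp.properties):
        problems.append(f"missing required property {name}")
    for name in sorted(REQUIRED_OPERATIONS - tp.operations):
        problems.append(f"missing required operation {name}")
    # Conditional rows: CurrentTime is needed once any metric is exposed,
    # and a set operation is needed once any property is writable.
    if tp.metric_properties and "CurrentTime" not in tp.properties:
        problems.append("metrics exposed without CurrentTime")
    if tp.writable_properties and "SetResourceProperty" not in tp.operations:
        problems.append("writable properties without SetResourceProperty")
    return problems


candidate = TouchpointDescription(
    properties={"ResourceId", "Version", "ManageabilityCapability", "ResourceType"},
    operations={"GetResourceProperty"},
    metric_properties={"RequestCount"},
)
problems = conformance_problems(candidate)
```

Here `candidate` omits `Name` and exposes a metric without `CurrentTime`, so both gaps are reported.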
actp:ResourceType: ResourceType is used to provide a classification hierarchy for manageable resources. This classification hierarchy allows autonomic managers to provide various degrees of specialized management functions, depending on the sophistication and knowledge of the manager. For example, a Microsoft Windows™ operating system might have a classification hierarchy of "Operating System," "Windows," "Win32 Operating System," and "Windows XP Professional." Using this classification hierarchy, an autonomic manager that knows how to manage only a "Win32 Operating System" could perform some degree of management, whereas another autonomic manager that knows about the unique capabilities of a "Windows XP Professional" operating system might be able to perform a greater degree of control. Both managers, however, could manage the resource to some extent.
This interface has two properties:
- actp:ResourceType is a set of URIs that identifies the resource types for a manageable resource. A manageable resource can have multiple types. Each type is either in the classification hierarchy of the leaf type of the resource, or it consists of an alternate classification of the manageable resource.
- actp:Name contains a string that is a locally known name of the resource. The name is assigned by the resource and may be unique to each instance of the resource (different resource models may use different schemes for a combination of identification properties that uniquely identify the resource). For example, the Name of an operating system or application server resource could be the hostname of the resource's host (which could be unique for each instance of the resource), rather than the product name of the resource (which typically would not be unique for each instance of the resource).
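The classification example can be sketched as a lookup against the set of type URIs. The URIs and handler names below are invented for illustration; the sketch also assumes that each manager keeps its own list of known types ordered from most to least specialized, which the architecture does not prescribe.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical type URIs for the Windows example in the text.
OS = "urn:example:type:OperatingSystem"
WINDOWS = "urn:example:type:Windows"
WIN32 = "urn:example:type:Win32OperatingSystem"
XP_PRO = "urn:example:type:WindowsXPProfessional"

# The value a touchpoint might report for actp:ResourceType.
resource_types: FrozenSet[str] = frozenset({OS, WINDOWS, WIN32, XP_PRO})


def select_handler(
    types: FrozenSet[str],
    known: List[Tuple[str, str]],
) -> Optional[str]:
    """Return the first handler whose type URI the resource declares.

    `known` is the manager's own list of (type URI, handler name) pairs,
    ordered from most to least specialized.
    """
    for type_uri, handler in known:
        if type_uri in types:
            return handler
    return None


generic_manager = [(WIN32, "win32-management")]
specialized_manager = [(XP_PRO, "xp-pro-management"), (WIN32, "win32-management")]

generic_choice = select_handler(resource_types, generic_manager)
specialized_choice = select_handler(resource_types, specialized_manager)
```

Both managers find a match, so both can manage the resource to some extent, but the specialized manager selects the handler for the more specific type.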
Autonomic managers use their Monitor function to sense the condition of the resources they manage and pass this information to the remainder of the control loop (the Analyze, Plan, and Execute functions) so that any changes that are necessary to achieve the autonomic manager's goals can be applied. Although managers can be implemented to poll resource conditions continuously, such implementations tend to be inefficient (owing to polling overhead) or unresponsive (if the polling interval is long). Event-driven response often provides the most effective balance between these two and, therefore, is likely to be the preferred method.
Although virtually every management system uses events for efficient monitoring of conditions, the autonomic computing architecture recognizes that events produced by the wide variety of IT resources can be classified into a relatively small set of event categories, or situations. By doing so, not only can events be collected efficiently through event notification, but they also can be analyzed efficiently, thereby improving response time and reducing manager processing overhead. Furthermore, with the subsequent simplification of reported events from the disparate resources, the autonomic computing team recognized that correlation of events from multiple resources could be significantly improved and automated, yielding a much higher level of sophistication in the scope of problems that could be analyzed automatically.
IBM's autonomic computing architecture developed the Common Base Event and submitted this work to the WSDM TC for incorporation into the WSDM standard. Although not all of the elements of the Common Base Event were incorporated into the WSDM standard, the structure and classification work significantly influenced the WSDM Event Format.
The WSDM Event Format is an extensible XML format that defines a set of fundamental, consistent data elements that allow different types of management event information to be reported in a consistent manner. The WSDM Event Format enables programmatic processing, correlation, and interpretation of events from different products, platforms, and management technologies.
The WSDM Event Format is organized into three categories for management event data:
- The identifier of the event reporter
- The identifier of the event source
- Situation data
Each category defines a few standard properties that are found in most management events and may be extended to add event-specific and situation-specific data. The situation data includes situation time, situation category, situation disposition, priority, severity, message, and substitutable message elements. WSDM also defines a standard set of priorities, severities, and situation categories, such as StartSituation, StopSituation, and CreateSituation, to facilitate a common understanding of events received from a variety of resources, resource instrumentation, and management infrastructure. These standard categories allow much more robust problem detection, analysis, correlation, and response.
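To see why a shared set of situation categories helps correlation, consider a simplified event model with the three parts described above. The field names are assumptions chosen for readability and are not the element names of the WSDM Event Format schema; the severity values are arbitrary.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Situation:
    category: str   # for example "StartSituation" or "StopSituation"
    time: str       # ISO 8601 timestamp, kept as text in this sketch
    severity: int   # arbitrary scale for illustration
    message: str


@dataclass(frozen=True)
class ManagementEvent:
    reporter_id: str   # who reported the event
    source_id: str     # which resource the event is about
    situation: Situation


def sources_by_category(events: Iterable[ManagementEvent]) -> Dict[str, List[str]]:
    """Group event sources by situation category, whatever resource sent them."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for event in events:
        grouped[event.situation.category].append(event.source_id)
    return dict(grouped)


events = [
    ManagementEvent("agent-1", "db", Situation("StopSituation", "2005-03-01T10:00:00Z", 4, "database stopped")),
    ManagementEvent("agent-2", "web", Situation("StopSituation", "2005-03-01T10:00:02Z", 4, "server stopped")),
    ManagementEvent("agent-1", "db", Situation("StartSituation", "2005-03-01T10:05:00Z", 1, "database started")),
]
grouped = sources_by_category(events)
```

Because both resources report the same category, a manager can see that `db` and `web` stopped within seconds of each other without parsing product-specific message text.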
WSDM supports notifications using WS-Notification and WSDM Event Format messages. WS-Notification provides the publish-subscribe services for Web services architectures. Various filtering methods can be employed to allow managers to subscribe based on categories of events for a resource, such as all metric changes or all configuration changes, rather than subscribing to each independent property change event.
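The idea of subscribing to a category of changes rather than to each property can be sketched with an in-memory broker. This is not the WS-Notification API: the `NotificationBroker` class and the topic names `metrics` and `configuration` are hypothetical stand-ins for the filtering described above.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A callback receives (property name, new value).
Callback = Callable[[str, object], None]


class NotificationBroker:
    """Hypothetical in-memory broker that filters notifications by topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, property_name: str, value: object) -> int:
        """Deliver to every subscriber of the topic; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(property_name, value)
        return len(callbacks)


received = []
broker = NotificationBroker()
# One subscription covers every metric change for the resource.
broker.subscribe("metrics", lambda name, value: received.append((name, value)))

broker.publish("metrics", "RequestCount", 10)
broker.publish("configuration", "MaxThreads", 50)
broker.publish("metrics", "ResponseTime", 0.2)
```

The manager receives both metric changes through its single subscription and is not disturbed by the configuration change.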
The WSDM Event Format is extensible so that an implementer can map all required fields from the Common Base Event into the WSDM Event.
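One way to picture such a mapping is to translate the fields that have a direct counterpart and carry everything else in an extension area. The sketch below uses plain dictionaries; the mapping table, the target field names, and the `extensions` key are assumptions for illustration and do not reproduce either schema.

```python
from typing import Any, Dict

# Hypothetical mapping from Common Base Event-style field names to the
# fields of a simplified WSDM-style event.
FIELD_MAP = {
    "reporterComponentId": "reporter_id",
    "sourceComponentId": "source_id",
    "creationTime": "situation_time",
    "severity": "severity",
    "msg": "message",
}


def to_wsdm_event(cbe: Dict[str, Any]) -> Dict[str, Any]:
    """Map known fields directly; keep unmapped fields as extension data."""
    event: Dict[str, Any] = {"extensions": {}}
    for key, value in cbe.items():
        target = FIELD_MAP.get(key)
        if target is None:
            event["extensions"][key] = value  # no counterpart, so carry as-is
        else:
            event[target] = value
    return event


cbe_event = {
    "reporterComponentId": "agent-1",
    "sourceComponentId": "db",
    "creationTime": "2005-03-01T10:00:00Z",
    "severity": 40,
    "msg": "disk full",
    "sequenceNumber": 7,
}
wsdm_event = to_wsdm_event(cbe_event)
```

No information is lost in the translation: the field without a counterpart ends up under `extensions`, where a manager that understands it can still read it.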
The WSDM 1.0 standard forms a solid foundation on which self-managing autonomic computing systems can be built and takes a major step forward in improving the manageability and robustness of today's IT infrastructures.
We have shown that WSDM, and the underlying WS-ResourceProperties specification, provide a significant portion of the autonomic computing touchpoint definition, the essential manageability interface required for autonomic computing. WSDM was developed with the future in mind and is extensible so that autonomic computing platforms can use this standard today.
Through WSDM, Web services can now be managed and, equally important, the advantages of Web services technologies can be applied to solving the difficult management integration problem in today's complex, heterogeneous computing environments. Furthermore, with the improved sophistication of management interfaces, technologies such as those resulting from IBM's autonomic computing initiative can help customers enjoy the benefits of self-managing systems using this new standard.
- The IBM whitepaper, "An Architectural Blueprint for Autonomic Computing" provides additional background on concepts addressed in this article.
- Download the "Systems management in a Web services world: Management Using Web Services -- A proposed architecture and roadmap" whitepaper, which shows you how Web services, leveraging standards, are the key to tying together multiple vendor management system solutions across heterogeneous systems and to beginning to simplify the resulting integration challenges.
- Key technologies such as IBM's Common Base Event, which was used as the basis for the WSDM Event Format, can improve the ability to correlate events from multiple resources.
- Web Services Distributed Management: Management using Web Services (MUWS 1.0), Part 1 is a useful OASIS standard. Part 2 is also available from OASIS.
- Heather Kreger has written an excellent introductory article on WSDM titled "A little wisdom about WSDM." (developerWorks, March 2005)
- Web Services Resource Properties 1.2 (WS-ResourceProperties) is a working draft from OASIS.
- An article on developerWorks that provides a solid foundation for understanding SOA is "Service-Oriented Architecture expands the vision of Web services." (developerWorks, April 2004)
- The OASIS technical committees for WSRF, WSDM, and WSN are good resources for developers interested in Web services development.
Heather Kreger is a Lead Architect for Web Services and Management in the Standards and Emerging Technologies area. She is currently co-lead of the OASIS Web Services Distributed Management Technical Committee and member of several related DMTF Work Groups. Heather was the IBM representative to the W3C Web Services Architecture Working Group, as well as co-lead of JSR109 that specifies Web services deployment in J2EE environments and a contributor to the Java Management Extensions (JMX) specification. Heather is also the author of numerous articles on Web services and management in the IBM Systems Journal, Communications of ACM, Web Services Journal, and other public technical work including the “Web Services Conceptual Architecture,” “WS-Manageability,” and her book “Java and JMX, Building Manageable Systems.”
Thomas Studwell is a Senior Technical Staff Member in the IBM Autonomic Computing Architecture organization. Tom is responsible for promoting IBM's Autonomic computing technologies in open industry standards. Tom is a contributing participant in the OASIS Web Services Distributed Management Technical Committee and was responsible for submitting IBM's Common Base Event specification to the WSDM TC. Tom is a member of the IEEE and has a number of patents and publications in computing technologies.