Symptoms deep dive, Part 3: Classify your symptoms

Introducing a standard taxonomy of autonomic computing symptoms to help identify situation categories

Marcelo Perazolo (mperazol@us.ibm.com), Autonomic Computing Architecture, IBM, Software Group
Marcelo Perazolo is a member of the IBM Autonomic Computing Architecture team, where he serves as an architect for symptoms and other knowledge formats and defines Management Integration Taxonomies related to autonomic computing. He has worked for IBM since 1990, with various assignments in network and systems management. Marcelo received an M.S. degree in Electrical Engineering in 1994. His interests include problem determination and prediction, process optimization techniques, security, correlation technologies, and knowledge representation.

Summary:  A standard symptom taxonomy is an excellent starting point for identifying symptoms, although it is not the only tool you need for the task, because it provides a common framework with which symptom authors can expand and promote the reuse of their individual symptoms in a more standardized way. This article introduces a standard taxonomy of autonomic computing symptoms used to categorize the types of situations that symptoms describe. It also presents the methodology used to identify these categories, which applies equally when new symptoms are discovered and when new categories need to be created or assigned to symptoms. I'll also discuss some best practices for deciding whether a taxonomy needs to be extended.


Date:  02 May 2006
Level:  Intermediate


A standard symptom taxonomy is a good starting tool for identifying and categorizing symptoms. It offers symptom authors a common framework with which they can expand and promote the reuse of their individual symptoms in a more standardized way. In this third article in the series, I revisit the autonomic computing symptoms architecture and examine in more detail the parts of that architecture that support the classification of symptoms. I also introduce a method for identifying symptom categories, and I present a set of standard categories that emerged when I applied this method to sample problem determination data from multiple symptom sources. (For references on symptoms and their format and content, please see Resources.)

A closer look at symptom elements

In the overall autonomic computing architecture, symptoms are a form of knowledge (see Resources); as such, they convey the information necessary for the analysis, identification, and resolution of situations handled by an autonomic manager. Figure 1 gives an overview of the autonomic computing symptoms reference architecture.


Figure 1. The autonomic computing symptoms reference architecture

This architecture is composed of the following main elements:

  • Symptom metadata (the one I'll concentrate on in this article)
  • Symptom schema
  • Symptom rule
  • Symptom effect
  • Symptom definition
  • Symptom instance
  • Correlation engine
  • Symptom catalog

Before I move on to the topic of this article, let's look at each element in a little more detail.

Symptom metadata
Symptom metadata is common information present in all kinds of management knowledge; because a symptom is one of the types of knowledge supported by an autonomic manager, it must contain knowledge metadata. This includes things like a type, a category, an identifier, and so on.
Symptom schema
Symptom schema is information specific to the symptom only, not present in other knowledge types. For example, a symptom defines a hierarchy and must support that in its schema with attributes like a root cause parent symptom, as well as child symptoms. Other data includes things like a symptom probability, a priority, a description, and so on.
Symptom rule
Symptom rule defines how a symptom is recognized. Its form depends on the correlation technology used to process the data and events that give rise to the symptom, but usually it can be represented by normalized patterns or rules.
Symptom effect
The symptom effect defines the reaction to be performed by the autonomic manager after a symptom is recognized. An effect could be something immediate like "restart a router" or a further analysis of application dependency to be performed by the analysis layer. It can also be a textual recommendation intended for human consumption.
Symptom definition
The symptom definition is a collection of the four symptom elements -- metadata, schema, rule, and effect -- used just for grouping and cataloging purposes.
Symptom instance
After a symptom is identified, an instance of the information defined by the symptom schema is created. The symptom instance links with the symptom definition that was used for its creation to access metadata, rules, and effect elements.
Correlation engine
The correlation engine is the logical entity responsible for processing symptom definitions, extracting rules, and creating symptom instances. Symptom instances are the result of the correlation engine processing symptom rules.
Symptom catalog
The catalog is the distributed component used to store, consume, and reuse symptom definitions. This is a multivendor, multisolution repository of information and contains downloadable symptoms that can be consumed by management tools.
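
To make the relationships among these elements concrete, here is a minimal sketch in Python. It is not the symptoms reference architecture's actual format: the class names, field names, the rule string, and the identifiers "sym-001" and "event-42" are all assumptions chosen for illustration. It only shows how a definition groups the four elements and how an instance links back to the definition that produced it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SymptomMetadata:
    """Knowledge metadata shared by all knowledge types (hypothetical fields)."""
    identifier: str
    knowledge_type: str  # for example "symptom"
    category: str        # for example "availability:network"


@dataclass(frozen=True)
class SymptomSchema:
    """Symptom-specific data, including the hierarchy links."""
    description: str
    priority: int
    probability: float
    root_cause_parent: Optional[str] = None  # identifier of the parent symptom
    children: tuple = ()                     # identifiers of child symptoms


@dataclass(frozen=True)
class SymptomDefinition:
    """Groups the four elements: metadata, schema, rule, and effect."""
    metadata: SymptomMetadata
    schema: SymptomSchema
    rule: str    # opaque here; real syntax depends on the correlation engine
    effect: str  # an action name or a textual recommendation


@dataclass
class SymptomInstance:
    """Created when a rule matches; links back to its definition."""
    definition: SymptomDefinition
    correlation_trail: list = field(default_factory=list)  # pointers to source data


# Hypothetical example values, not taken from a real symptom catalog.
node_down = SymptomDefinition(
    metadata=SymptomMetadata("sym-001", "symptom", "availability:network"),
    schema=SymptomSchema("Node down", priority=1, probability=0.9),
    rule="event.type == 'NODE_DOWN'",
    effect="restart a router",
)
instance = SymptomInstance(node_down, correlation_trail=["event-42"])
```

Because the instance holds a reference to its definition, code that processes the instance can reach the metadata (and therefore the category) without duplicating it.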

For the purposes of this article, I'll concentrate on the metadata element; in particular, on the attribute used for symptom classification, which is part of the metadata. This attribute is important for efficient run-time processing of symptom instances, and symptom authors should always treat it as a priority. A solid classification generally leads to smoother processing when symptoms are identified in an autonomic manager. It also leads to better composition of symptoms when further analysis is necessary for the creation of incidents, problems, and impact records.

The main symptom categories

Identifying canonical categories that may be applied to symptoms involves multiple considerations. There are many ways of categorizing management information, and a symptom is a form of management information.

Symptoms are positioned as being composite events; in other words, special events that are derived by the composition of other forms of management information emitted by manageable resources. Such forms of information may be (but are not restricted to):

  • events (normalized or not)
  • log and trace records
  • static application data
  • metric records

As such, one valid way to categorize symptoms would be with respect to the sources of composed information that are part of the symptom, but because the symptom instance already carries the correlation trail of a symptom (pointers to the components of the symptom), this categorization alternative is unnecessary.

Other valid ways to categorize symptoms would be to take into account where a resource is sensed, what a resource will affect, or even which resources produced the information used to compose the symptoms in the first place. This would be an example of a scope-centric categorization, but because this information is also already present in the symptom metadata, it does not add much value. Applications can infer the types of scope-centric categorization by looking at the symptom scope.

On the other hand, functional categorization is a useful form of categorization that can also be applied to symptoms. A functional categorization indicates to applications which of the various IT processes and services a particular symptom applies to. For this reason, the standard form of categorization adopted by the symptoms reference architecture is a functional categorization. The following list shows the predefined main categories for symptoms:

  • Security
  • Operation
  • Availability
  • QoS (Quality of Service)

As you can see, it's a short list, but there are many secondary categories you can derive from these main categories. Don't forget that this is a starting set and as such, can and should be expanded.
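
As a rough illustration, the four main categories can be modeled as an enumeration. The sketch below assumes category values written as "main:secondary", with the lowercase prefixes that appear in this article's schema listings; the names MainCategory and main_category_of are hypothetical.

```python
from enum import Enum


class MainCategory(Enum):
    """The four predefined main categories (a starting set, meant to grow)."""
    SECURITY = "security"
    OPERATION = "operation"
    AVAILABILITY = "availability"
    QOS = "qos"


def main_category_of(category: str) -> MainCategory:
    """Return the main category of a 'main:secondary' value.

    Raises ValueError when the prefix is not one of the predefined
    main categories, which signals that the taxonomy may need extending.
    """
    prefix, _, _ = category.partition(":")
    return MainCategory(prefix)
```

An application could use such a lookup to route a symptom to the IT process that handles it, and treat the ValueError as a cue that a new main category is in play.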

Now let's explore each of these main categories in more detail.

Security symptoms

Symptoms in this category describe security problems; Table 1 shows the existing security subcategories.


Table 1. Security symptoms sub-categories
Category | Description
Prevention | Problems related to the prevention of security problems:
  • Virus definition update failed
  • Virus detected; quarantine failed
  • Virus detected; cleanup failed
  • Spam detected from existing metrics
Authentication | Problems related to authentication of users and messages in a system:
  • Wrong password
  • User substitution failed
  • User password changed
  • User profile changed
Authorization | Problems related to the authorization of actions or user access:
  • Network connection denied
  • Untrusted resource (file, program, and so on)
  • Licensing replication error
  • No license available
  • Access denied

Listing 1 demonstrates the XML schema for the security category.


Listing 1. XML schema for the security category
 <simpleType name="Security">
  <restriction base="QName">
   <enumeration value="security:prevention"/>
   <enumeration value="security:authentication"/>
   <enumeration value="security:authorization"/>
  </restriction>
 </simpleType>
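
One way to put such a listing to work is to read the enumeration values from it and use them to validate category values. The following sketch uses Python's standard xml.etree.ElementTree and embeds the fragment from Listing 1 as a string. It assumes the fragment is used exactly as printed, with unqualified element names; in a complete schema the elements would normally sit in the XML Schema namespace and the lookups would need to account for that. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The fragment from Listing 1, reproduced as a string.
LISTING_1 = """
<simpleType name="Security">
 <restriction base="QName">
  <enumeration value="security:prevention"/>
  <enumeration value="security:authentication"/>
  <enumeration value="security:authorization"/>
 </restriction>
</simpleType>
"""


def allowed_values(xsd_fragment: str) -> set:
    """Collect the 'value' attribute of every enumeration element."""
    root = ET.fromstring(xsd_fragment.strip())
    return {element.get("value") for element in root.iter("enumeration")}


SECURITY_VALUES = allowed_values(LISTING_1)


def is_valid_security_category(value: str) -> bool:
    """True when the value is one of the enumerated security sub-categories."""
    return value in SECURITY_VALUES
```

This is only a membership check on the enumerated strings; it does not perform full XML Schema validation.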

For good examples of canonical symptoms in this category (security symptoms), please see the second article in the Symptoms deep dive series (see Resources for a link to the article):

  • Authentication Failure
  • Authorization Failure
  • Prevention Deployment Failure

Operation symptoms

Symptoms in this category describe operation problems; Table 2 highlights existing operations sub-categories:


Table 2. Operations symptoms sub-categories
Category | Description
Execution | Problems related to the operation of the system:
  • Job execution failure
  • Recoverable job failure
  • Scheduled job aborted
  • Scheduled job failed
Logic | Problems related to the business logic of a system:
  • Business logic alarm
  • Repeated business logic alarm
Configuration | Problems related to the system configuration:
  • Configuration resource unavailable
  • Configuration resource is invalid
  • Installation problem
  • Wrong version

Listing 2 demonstrates the XML schema for the operations category.


Listing 2. XML schema for the operations category
 <simpleType name="Operation">
  <restriction base="QName">
   <enumeration value="operation:execution"/>
   <enumeration value="operation:logic"/>
   <enumeration value="operation:configuration"/>
  </restriction>
 </simpleType>

For good examples of canonical symptoms in this category (service support symptoms), please see the second article in the series:

  • Configuration Unavailable
  • Configuration Invalid
  • Dependency Unavailable
  • Dependency Mismatch

Availability symptoms

Symptoms in this category describe availability problems; Table 3 defines existing availability sub-categories:


Table 3. Availability symptoms sub-categories
Category | Description
Storage | Problems related to the availability of storage resources:
  • Storage device degraded
  • Storage space exhausted
  • Storage allocation problem
I/O | Problems related to the availability of I/O resources:
  • Repeated I/O access problem
  • Repeated I/O device problem
Network | Problems related to the availability of network resources:
  • Node down
  • Node unreachable
  • Interface down
  • Network marginal
  • Network critical
Communication | Problems related to the availability of communication resources:
  • Connection broken
  • Time out of synchronism
  • Event queue failure
Hardware | Problems related to the availability of hardware resources:
  • Power failure
  • Memory allocation failure
  • Processing problem
  • Temperature threshold exceeded
Software | Problems related to the availability of software resources:
  • Server down
  • Server restart failed
Data | Problems related to the availability of data:
  • Database replication error
  • Database transaction dropped
  • Database rollover
  • Database consistency failure
  • Database batch processing error

Listing 3 shows the XML schema for the availability category.


Listing 3. XML schema for the availability category
 <simpleType name="Availability">
  <restriction base="QName">
   <enumeration value="availability:storage"/>
   <enumeration value="availability:io"/>
   <enumeration value="availability:network"/>
   <enumeration value="availability:communication"/>
   <enumeration value="availability:hardware"/>
   <enumeration value="availability:software"/>
   <enumeration value="availability:data"/>
  </restriction>
 </simpleType>

For good examples of canonical symptoms in this category (service availability symptoms), please see the second article in this series:

  • Resource Capacity Met
  • Resource Unavailable
  • Resource Degraded
  • Resource Unreachable
  • Repeated Availability Problem

QoS symptoms

Symptoms in this category describe quality of service problems; Table 4 shows these existing sub-categories:


Table 4. QoS symptoms sub-categories
Category | Description
Metrics | Problems detected by the analysis of existing metrics associated with a system:
  • Number of queued messages threshold exceeded
  • Storage quota threshold exceeded
  • Processing threshold exceeded
  • Utilization threshold exceeded
Performance | Problems detected by the performance analysis of a system:
  • Server overload
  • Cache memory exhausted
  • Message delay

Listing 4 gives you the XML schema for the QoS category.


Listing 4. XML schema for the QoS category
 <simpleType name="QoS">
  <restriction base="QName">
   <enumeration value="qos:metrics"/>
   <enumeration value="qos:performance"/>
  </restriction>
 </simpleType>

Good examples of QoS symptoms can be derived from the analysis of QoS agreements and monitoring data in networks and applications. Typically, these are threshold-oriented symptoms in which a threshold that is met or surpassed usually means that a QoS parameter was violated. Symptoms reflect these QoS parameter violations, and the symptom processors then carry out the associated resolutions.
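
A threshold-oriented check of this kind can be sketched in a few lines of Python. The function name, the dictionary keys, and the sample metric values are assumptions made for illustration; only the "qos:metrics" category value comes from Listing 4.

```python
from typing import Optional


def check_threshold(metric: str, value: float, threshold: float) -> Optional[dict]:
    """Return a symptom-like record when the threshold is met or surpassed.

    Returns None when the observed value stays below the threshold,
    meaning no QoS parameter violation was detected.
    """
    if value >= threshold:
        return {
            "category": "qos:metrics",
            "description": f"{metric} threshold exceeded",
            "observed": value,
            "threshold": threshold,
        }
    return None


# Hypothetical monitoring sample: utilization at 95% against a 90% limit.
symptom = check_threshold("Utilization", 0.95, 0.90)
```

A real rule would usually be evaluated by the correlation engine rather than by application code, and might also consider how long or how often the threshold is crossed.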

Extending the symptom taxonomy

Symptoms are generally stored in symptom catalogs, which provide a common medium for the distribution of symptom information. Symptom catalogs may also choose to publish their defined symptom categories for reuse.

It is a best practice for a symptom author to consult and reuse these categories whenever possible. The same applies to whole symptom definitions: when possible, reuse should be encouraged.

The following method generally applies both to classifying symptoms in the standard taxonomy and to expanding the taxonomy:

  1. Look at the existing main symptom categories. If the symptom fits in any of the existing main categories, then proceed. Otherwise, a new main category should be created.
  2. When you create a new main category, it is a best practice to align it with standard ITIL-oriented functional processes. For example, a symptom that signals a continuity of service problem would belong to a new category of Continuity of Service Symptoms.
  3. Search the symptom secondary categories for a fit. If the symptom fits the description of any of the existing secondary categories, then proceed. Otherwise, a new secondary category should be created.
  4. Secondary categories are also functional, but they may denote organizational aspects of the process. One such example would be the types of QoS metrics that may be evaluated (for example, performance metrics, availability metrics, and so on). When a metric type does not exist in the QoS main category, you can create a new one.
  5. If the symptom fits in an existing secondary category, look for similar symptoms. If many similar symptoms exist in that secondary category, the symptom author may optionally create tertiary (or even deeper) categories and group the new symptom, along with the existing similar symptoms, in those derived categories. In this case, a reorganization of existing symptoms may be necessary. This article does not analyze the difficulties inherent in reorganizing symptom categories; treat it as an important task, not to be undertaken lightly.
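
The first four steps of this method can be sketched as a small Python function. Deciding whether a symptom "fits" a category is a human judgment that code cannot make, so this sketch assumes the author has already chosen the main and secondary category names and only checks whether they exist, extending the taxonomy when they do not. The TAXONOMY structure and the classify function are hypothetical, the "continuity" name follows the Continuity of Service example above, and "backup" is an invented secondary category. Step 5 (tertiary categories) is not covered.

```python
# Starting taxonomy, taken from the enumeration values in Listings 1 to 4.
TAXONOMY = {
    "security": {"prevention", "authentication", "authorization"},
    "operation": {"execution", "logic", "configuration"},
    "availability": {"storage", "io", "network", "communication",
                     "hardware", "software", "data"},
    "qos": {"metrics", "performance"},
}


def classify(taxonomy: dict, main: str, secondary: str) -> tuple:
    """Place a symptom in the taxonomy, extending it when nothing fits.

    Returns the 'main:secondary' category value and the list of
    categories that had to be created (empty when everything was reused).
    The taxonomy dictionary is modified in place.
    """
    created = []
    # Steps 1 and 2: reuse an existing main category or create a new one.
    if main not in taxonomy:
        taxonomy[main] = set()
        created.append(main)
    # Steps 3 and 4: reuse an existing secondary category or create a new one.
    if secondary not in taxonomy[main]:
        taxonomy[main].add(secondary)
        created.append(f"{main}:{secondary}")
    return f"{main}:{secondary}", created


# Work on a copy so the starting set stays untouched.
working = {name: set(values) for name, values in TAXONOMY.items()}
reused = classify(working, "qos", "metrics")
extended = classify(working, "continuity", "backup")
```

Returning the list of created categories makes extensions visible, which gives the author a chance to review them before publishing the symptom to a catalog.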

After symptoms are created and correctly classified, they may be imported into symptom catalogs and become part of the analysis, detection, and resolution process that makes use of symptom definitions in an autonomic manager.

Conclusion

Autonomic computing symptoms are valuable for identifying and resolving situations in an autonomic computing environment. To be more effective, however, symptoms should be correctly classified so that they can be applied to the specific context they relate to in the overall analysis of IT processes. There are many ways to classify symptoms; in this article, I've laid out a methodology and associated best practices based on functional decomposition of symptoms and the resources they affect.

Whenever possible, reuse of canonical symptoms and their respective taxonomy should be encouraged. A standard starting set of symptom categories exists along with a methodology for their expansion. New categories may and will be added as more symptoms are authored in production or pre-production environments. It is important to follow the philosophy behind the taxonomy and the subsequent classification of symptoms, because doing so supports the smooth and efficient processing needed to realize the power of symptoms in an autonomic manager.


Resources

Learn

Get products and technologies

  • IBM trial software: Build your next development project with trial software, available for download directly from developerWorks.

Discuss

