Using Language Models to Optimize IT Operations Management in Watson AIOps

As IT complexity grows and the use of AI technologies expands, enterprises are looking to bring in the power of AI to transform how they develop, deploy and operate their IT.

Our past work on Sentiment Analysis and Entity Recognition has shown that artificial intelligence (AI) models customized with domain-specific data on top of Language Models outperform those trained on general-purpose data alone. We were curious whether we could replicate these results for problems such as anomaly prediction in the IT Operations Management domain, so we conducted experiments to test this hypothesis. In this article, we share our experimental results, which show that anomaly prediction models whose features come from advanced Language Models trained on IT data outperform those built with general-purpose data.


Language Models are critical components in Natural Language Processing (NLP). They learn to predict the probability of a sequence of words. A 1-gram language model predicts the probability of a single missing word in a sentence. For example, in the sentence “Ana _ to get a book to read,” an English-language-trained Language Model might predict the word ‘went’ to fill in the blank with a probability of 99%.

A 2-gram language model predicts the probability of a sequence of two missing words at a time. For example, in the sentence “Ana _ _ get a book to read,” a trained Language Model might rank candidate word sequences such as ‘went to’ and ‘had gone’ by probability, assigning the most likely one, ‘went to,’ a probability of 95%. This can be extrapolated to n-grams.
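As a rough illustration of how such probabilities can be estimated, the sketch below counts word pairs in a tiny, made-up corpus and turns the counts into the probability of a word given the word before it. The corpus, the function names, and the choice to condition only on the preceding word are all assumptions made for this example; production Language Models are trained on far larger data and use much richer context.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for this illustration.
corpus = [
    "ana went to get a book to read",
    "ana went to get a coffee",
    "ana wanted to get a book",
    "tom went to get a book to read",
]


def train_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Maximum-likelihood estimate of P(word | prev) from the counts."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}


counts = train_bigram_counts(corpus)

# Candidates for the blank in "ana _ to get a book", given the word before it.
probs = next_word_probabilities(counts, "ana")
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.2f}")
```

In this toy corpus, ‘went’ follows ‘ana’ in two of three cases, so it receives the higher estimate. The probabilities over all candidates for one blank sum to 1.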

To perform this task, Language Models internally convert words to real-valued vector representations, because mathematical models operate more easily on numbers. These representations are called Word Embeddings or Word Vectors, and they are widely used in NLP tasks.

To create Word Embeddings, words or phrases from the vocabulary of a language are mapped to vectors of real numbers, so that each word or phrase is associated with a feature vector of a fixed dimension. Typically, Embeddings are pre-trained on large text corpora such as Wikipedia, tweets, and news articles, and are tested on Language Modeling tasks, which assign a probability distribution over sequences of words.
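To make the idea of a fixed-dimension feature vector concrete, here is a minimal sketch that stores a few hand-written three-dimensional vectors and compares them with cosine similarity. The vectors are invented for illustration and are not taken from any trained model. Real embeddings are learned from data and typically have hundreds of dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-written for this example.
embeddings = {
    "error":     [0.9, 0.1, 0.0],
    "exception": [0.8, 0.2, 0.1],
    "book":      [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


sim_related = cosine_similarity(embeddings["error"], embeddings["exception"])
sim_unrelated = cosine_similarity(embeddings["error"], embeddings["book"])

print(f"error vs exception: {sim_related:.3f}")
print(f"error vs book:      {sim_unrelated:.3f}")
```

With these made-up vectors, ‘error’ is much closer to ‘exception’ than to ‘book’, which is the kind of relationship a downstream model can use as a feature.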

An IT operations environment generates many kinds of data, including metrics, alerts, events, logs, tickets, application and infrastructure topology, deployment configurations, and chat conversations. Our goal in this experiment is to pre-train Language Models with the IT domain vocabulary that occurs in logs, tickets, metrics, alerts, events, and chats: for example, errors, exceptions, messages, service names, server names, pods, container ids, node ids, incidents, tickets, root cause, causal factor, and topology. Word Embeddings derived from such IT domain-specific Language Models could serve as richer features for the machine-learning-based AI models in our system.

Applying Language Models to Log Anomaly Prediction in IBM Watson AIOps

In IBM Watson AIOps, there are many AI pipelines for processing different types of data and generating insights from them. For example, application and infrastructure logs and metrics are parsed and processed to predict anomalies early in the process. These are handled by Log Anomaly and Metric Anomaly Prediction models, respectively.

Anomalies that are raised, along with other events and alerts that may be generated via rules, are then grouped into their corresponding incident buckets by Event Grouping AI models, which reduce event noise using techniques such as entity linking and spatial, temporal, and topological algorithms. Faults are diagnosed and localized by Fault Localization AI models. The set of impacted components is identified by Blast Radius AI models. Similar incidents from past incident records are identified, and next-best-actions are derived, by Incident Similarity AI models.

Each of these is an AI model that employs different algorithms: some are deep-learning algorithms, and some are unsupervised machine-learning algorithms. The features used in all these models could benefit from a deeper understanding of the IT domain. Figure 1 shows our approach to using language models for different IT operations management prediction tasks:

Figure 1: An illustration of language models for different IT Operations management prediction tasks.

Anomaly detection from logs is a fundamental IT Operations management task that aims to detect anomalous system behaviors and find signals that provide clues to the reasons for a system's failure. In our experiment, we tested whether anomaly detection models built with features derived from the Word Embeddings of Language Models trained on IT data outperform those built with embeddings trained on general-purpose data.

To pre-train language models in the IT Operations domain, we first process the input IT data into a normalized format using pre-defined rules, extracting the most informative texts, such as log messages and ticket descriptions. We also remove duplicate texts, which the system may auto-generate multiple times for the same event. Next, we randomly sample data from each data source and use the data samples to learn the vocabulary of the IT Operations domain. After that, we pre-train the Language Model using the sampled data and tune the parameters based on model evaluation. An overview of the pre-training pipeline is shown in Figure 2:

Figure 2: The pipeline of pre-training language models using IT Operations domain data.
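The data preparation steps of this pipeline (normalize, deduplicate, sample, learn the vocabulary) can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the sample records, the two normalization rules, the sample size, and the function names are hypothetical and do not describe the rules used in Watson AIOps. The pre-training step itself is omitted.

```python
import random
import re
from collections import Counter

# Hypothetical raw records per data source, invented for this sketch.
raw_data = {
    "logs": [
        "2021-03-01 10:00:01 ERROR pod-7f9c Connection refused to db-service",
        "2021-03-01 10:00:02 ERROR pod-7f9c Connection refused to db-service",
        "2021-03-01 10:05:00 INFO pod-1a2b Health check passed",
    ],
    "tickets": [
        "Ticket 1042: db-service unreachable, root cause under investigation",
    ],
}

# Assumed normalization rules: drop leading timestamps and ticket prefixes.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+")
TICKET_PREFIX = re.compile(r"^Ticket \d+:\s*")


def normalize(text):
    """Keep only the informative message text, lowercased."""
    text = TIMESTAMP.sub("", text)
    text = TICKET_PREFIX.sub("", text)
    return text.lower().strip()


def deduplicate(texts):
    """Remove repeated texts while preserving first-seen order."""
    return list(dict.fromkeys(texts))


def sample(texts, k, seed=0):
    """Randomly sample up to k texts; the seed keeps runs repeatable."""
    rng = random.Random(seed)
    return rng.sample(texts, min(k, len(texts)))


def build_vocabulary(texts):
    """Count tokens, keeping hyphenated names such as 'db-service' whole."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z0-9][a-z0-9\-]*", text))
    return counts


sampled = []
for source, records in raw_data.items():
    cleaned = deduplicate([normalize(record) for record in records])
    sampled.extend(sample(cleaned, k=2))

vocabulary = build_vocabulary(sampled)
print(vocabulary.most_common(5))
```

The two log lines that differ only in their timestamp collapse into one text after normalization, which is why deduplication runs after the rules are applied rather than before.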

We trained a number of anomaly detection models using different pre-trained features. In Table 1, we report the accuracy of anomaly prediction on two benchmark datasets for two models. The first is a machine-learning model trained with fastText Word Embeddings that are trained on general-purpose data (e.g., Wikipedia and news articles). The second is a machine-learning model built using embeddings trained with diverse IT Operations domain data as features. Our experimental results indicate that the fastText model customized with IT domain logs outperforms the model built with domain-independent, general-purpose data on both datasets:

Table 1: Accuracy results on log anomaly detection with pre-trained features from embeddings. The fastText model trained with IT domain data outperforms the domain-independent fastText model.
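To show how Word Embeddings can act as features for log anomaly detection, the sketch below averages the word vectors of each log line and scores a line by its cosine distance from the centroid of known-normal lines. This is a deliberately simple stand-in, not the machine-learning model evaluated in Table 1: the two-dimensional vectors, the example log lines, and the 0.5 threshold are all invented for illustration.

```python
import math

# Hypothetical 2-dimensional word vectors; a real system would load
# embeddings pre-trained on IT Operations data instead.
word_vectors = {
    "health": [0.9, 0.1],
    "check": [0.8, 0.2],
    "passed": [0.9, 0.0],
    "connection": [0.2, 0.8],
    "refused": [0.0, 1.0],
    "timeout": [0.1, 0.9],
}
DIM = 2


def embed_line(line):
    """Average the vectors of known words; zero vector if none are known."""
    vectors = [word_vectors[w] for w in line.lower().split() if w in word_vectors]
    if not vectors:
        return [0.0] * DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def cosine_distance(u, v):
    """1 - cosine similarity; treated as 1.0 when either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm


# Centroid of feature vectors for lines assumed to be normal.
normal_lines = ["health check passed", "check passed"]
normal_features = [embed_line(line) for line in normal_lines]
centroid = [
    sum(f[i] for f in normal_features) / len(normal_features) for i in range(DIM)
]


def is_anomalous(line, threshold=0.5):
    """Flag a line whose features lie far from the normal centroid."""
    return cosine_distance(embed_line(line), centroid) > threshold


for line in ["health check passed", "connection refused"]:
    print(line, "->", "anomalous" if is_anomalous(line) else "normal")
```

In a setup like this, the quality of the embeddings largely determines the quality of the features: words that never appear in the embedding vocabulary contribute nothing, which is one intuition for why domain-trained embeddings can help on IT logs.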


As IT complexity grows and the use of AI technologies expands, enterprises are looking to bring in the power of AI to transform how they develop, deploy and operate their IT. IBM Watson AIOps adopts a new approach to leverage advanced Language Models for IT Operations tasks, such as log anomaly prediction. With the power of Watson AIOps, we can accelerate the development of text-based AI models for optimizing IT Operations management tasks at a large scale.
