Virtual image library fingerprints data

Vasanth Bala, Manager of
Scalable Datacenter Analytics

Editor’s note: This article is by Vasanth Bala, a staff scientist at IBM’s Thomas J. Watson Research Center.

It’s inevitable. Servers crash. Applications misbehave. Even if you troubleshoot and figure out the problem, the process of problem diagnosis will likely involve numerous investigative actions to examine the configurations of one or more systems—all of which would be difficult to describe in any meaningful way. And every time you encounter a similar problem, you could end up repeating the same complex process of problem diagnosis and remediation.
I deal with just such scenarios in my role as manager of the Scalable Datacenter Analytics Department at IBM Research. My team and I realized we needed a way to “fingerprint” known bad configuration states of systems. This way, we could reduce problem diagnosis time by relying on fingerprint recognition techniques to narrow the search space.
Olive’s role
CMU and other academic organizations now manage the virtual image library. For their work, CMU’s Mahadev Satyanarayanan and Gloriana St. Clair received a two-year grant from the Sloan Foundation “to support the technical development of a platform for archiving executable content and the environment in which it runs, as well as a plan for the institutionalization and ongoing sustainability of work for such an archive.”
This follows on the heels of a comparable grant from the Institute of Museum and Library Services, for CMU to develop Olive’s archiving system for executable content.
Project Origami was thus born from this desire to develop an easier-to-use problem diagnosis system to troubleshoot misconfiguration problems in the data center. Origami, today a collaboration between IBM Open Collaborative Research, Carnegie Mellon University, the University of Toronto, and the University of California at San Diego, is a collection of tools for fingerprinting, discovering, and mining configuration information on a data center-wide scale. It uses Olive, a public-domain virtual image library that grew out of an idea created under the same Open Collaborative Research program a few years ago.
It even provides an ad-hoc interface to the users, as there is no rule language for them to learn. Instead, users give Origami an example of what they deem to be a bad configuration, which Origami fingerprints and adds to its knowledge base. Origami then continuously crawls systems in the data center, monitoring the environment for configuration patterns that match known bad fingerprints in its knowledge base. A match triggers deeper analytics that then examine those systems for problematic configuration settings.
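To make that loop concrete, here is a minimal sketch in Python. It is not Origami's code: the names `fingerprint_example` and `scan`, the host names, and the idea of representing a fingerprint as a plain set of key/value settings are all assumptions made for illustration. The sketch only shows the flow described above: turn an example into a fingerprint, store it, then check each crawled system against the knowledge base.

```python
from typing import Dict, FrozenSet, List, Tuple

Config = Dict[str, str]                    # hypothetical: setting name -> value
Fingerprint = FrozenSet[Tuple[str, str]]   # simplification: a set of settings


def fingerprint_example(bad_example: Config) -> Fingerprint:
    """Turn a user-supplied bad example into a (very simplified) fingerprint."""
    return frozenset(bad_example.items())


def scan(systems: Dict[str, Config],
         knowledge_base: Dict[str, Fingerprint]) -> List[Tuple[str, str]]:
    """Return (host, label) pairs for every system that contains all the
    settings of a known bad fingerprint."""
    hits = []
    for host in sorted(systems):
        observed = set(systems[host].items())
        for label in sorted(knowledge_base):
            if knowledge_base[label] <= observed:  # subset test
                hits.append((host, label))
    return hits


# The user supplies an example instead of writing a rule.
knowledge_base = {"telnet-enabled": fingerprint_example({"port/23": "open"})}

# Hypothetical crawl results for two virtual servers.
systems = {
    "vm-a": {"port/22": "open", "port/23": "open"},
    "vm-b": {"port/22": "open"},
}

hits = scan(systems, knowledge_base)
print(hits)
```

In this toy version a hit is simply reported. In the flow described above, a hit would instead trigger the deeper analytics on the matching system.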
How Origami works
Together with Carnegie Mellon University and the University of Toronto, we developed agent-less system crawlers that continuously scan the configuration state of virtual servers without requiring any scanning agents to be installed inside them. Think of these crawlers as analogous to web crawlers that silently and non-intrusively scan the contents of web documents to build a central index that can then be searched or mined for insight.
This crawling approach improves usability and security because: there is no scanning agent to install and maintain on tens of thousands of systems; and there is no agent for malware present within these systems to attack. We are now developing advanced fingerprinting technologies that use a concept called “search by example,” where the user provides an example of a problematic configuration, rather than using a complex rule language to declaratively define the details of the problem.
An example for “search by example” can also be created in four steps: crawl a system; make a change to it that represents a configuration adjustment; re-crawl the system; and finally ask Origami to compute the difference between the two crawled states. This technique allows users to provide arbitrary system changes as examples.
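The crawl, change, re-crawl, and diff sequence can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Origami's implementation: a crawled state is modeled here as a flat dictionary of hypothetical setting names, and `diff_snapshots` is a made-up helper name.

```python
from typing import Dict, Optional, Tuple

Snapshot = Dict[str, str]  # hypothetical: setting name -> value at crawl time
Change = Tuple[Optional[str], Optional[str]]  # (value before, value after)


def diff_snapshots(before: Snapshot, after: Snapshot) -> Dict[str, Change]:
    """Collect every setting that was added, removed, or modified
    between two crawls of the same system."""
    changes: Dict[str, Change] = {}
    for key in before.keys() | after.keys():
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)  # None marks "absent in that crawl"
    return changes


# First crawl, then a configuration adjustment, then a second crawl.
before = {"port/22": "open", "file/etc/app.conf": "timeout=30"}
after = {"port/22": "open", "port/8080": "open",
         "file/etc/app.conf": "timeout=5"}

example = diff_snapshots(before, after)
print(example)
```

Only the settings that differ end up in the example, so unchanged parts of the system (here, `port/22`) do not become part of what gets fingerprinted.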
What’s happening inside Origami during all of these processes? It internally computes a fingerprint of the example and stores it in a fingerprint knowledge base. A fingerprint is a collection of hashes that summarize different dimensions of the configuration data for very fast recognition. Various heuristics then adjust the relative weights of the different features that make up the fingerprint, so that important features (e.g. a network port being opened) are distinguished from less important ones (e.g. a log file being modified). These heuristics reduce false alarms, so that bad configuration patterns can be told apart from very similar patterns that are actually benign.
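One way to picture a weighted fingerprint is the Python sketch below. Everything specific in it is an assumption for illustration: the weight values, the `kind/name` key convention, the use of SHA-256, and the scoring function are not taken from Origami, whose actual hashes and heuristics are not described here.

```python
import hashlib
from typing import Dict

# Hypothetical weights: an opened port matters more than a touched log file.
WEIGHTS = {"port": 5.0, "file": 2.0, "log": 0.5}


def fingerprint(features: Dict[str, str]) -> Dict[str, float]:
    """Hash each feature and attach the weight of its kind."""
    fp: Dict[str, float] = {}
    for key, value in features.items():
        kind = key.split("/", 1)[0]
        digest = hashlib.sha256(f"{key}={value}".encode()).hexdigest()
        fp[digest] = WEIGHTS.get(kind, 1.0)
    return fp


def similarity(known: Dict[str, float], observed: Dict[str, float]) -> float:
    """Fraction of the known fingerprint's total weight found in observed."""
    total = sum(known.values())
    if total == 0:
        return 0.0
    matched = sum(w for h, w in known.items() if h in observed)
    return matched / total


bad = fingerprint({"port/8080": "open", "log/app.log": "modified"})
seen = fingerprint({"port/8080": "open", "log/other.log": "modified"})

score = similarity(bad, seen)
print(round(score, 3))
```

Because the port feature carries most of the weight, the observed system still scores high even though the log file differs. A system that matched only on the log file would score low, which is the kind of separation the weighting is meant to provide.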
What’s next for Origami?
The overriding question for us is how Olive and Origami together can lead to the production of commercially viable technologies. The above-mentioned problem diagnosis of misconfiguration-related outages is one clear use for the search-by-example technology that we have developed with our OCR partners.
Another technology using Origami, under development with the University of California at San Diego, would mine many identically configured systems in the data center to automatically learn to distinguish configuration patterns that tend to produce problems from those that tend to operate well.