IBM Cloud Pak® for Watson AIOps has partnered with IBM Research and customers to help make AI more explainable.

IBM Cloud Pak® for Watson AIOps includes artificial intelligence (AI) that helps clients and users manage their applications, IT infrastructure and services. The AI uses log, metric, topology, event and ticket data, along with chat history, to learn normal behaviour. That helps customers avoid issues, resolve them faster when they do occur and automate resolutions. But how do you trust that the artificial intelligence is doing what you want it to do?

Establishing a foundation of trust

For a start, we work closely with our colleagues in IBM Research, one of the largest industrial research organizations in the world. We embrace “inner source”: sharing ideas and technology, and developing them together for the common good of our customers.

From a data science perspective, many tests can be performed to determine the accuracy of AI. Those tests often rely on data sets that have an associated “ground truth,” which records the expected behaviour; the tests then measure how well the AI can replicate it. The data is typically provided by clients who work closely with us to help meet the desired use cases. This way, when other clients use our AI, they are using something that has been tested and validated with real-world data and honest feedback.
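As a rough illustration of testing against a ground truth, the sketch below scores a set of flagged items against a labelled set using precision and recall. These two metrics, the function name and the sample identifiers are assumptions chosen for illustration; the article does not say which measures are used for the product.

```python
def precision_recall(predicted, ground_truth):
    """Compare the items the AI flagged with the items labelled in the ground truth.

    precision: share of flagged items that were expected
    recall:    share of expected items that were flagged
    """
    predicted, ground_truth = set(predicted), set(ground_truth)
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical event IDs: what the AI flagged vs. what the client labelled.
flagged = {"evt-1", "evt-2", "evt-3", "evt-4"}
labelled = {"evt-2", "evt-3", "evt-5"}

precision, recall = precision_recall(flagged, labelled)
print(round(precision, 2), round(recall, 2))  # 0.5 0.67
```

A high score on client-provided data is evidence of accuracy, but it does not by itself explain any single decision, which is where the drill-down views come in.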

Insights, understanding and decisions

Ultimately, however, the best way to ensure that our users trust AI is by making it easily explainable. We present insights and let users see how each one was reached.

Example 1: Temporal correlation

In the following example, we have presented a group of events that tend to co-occur, using our Temporal Correlation algorithm. To help build trust, a user can drill down to a view that shows why the events were grouped together. Every green line represents an occurrence of the event, and the user can immediately see that most of the time, these events occur together.

The strength of the algorithm is that it does not need 100% overlap of events to determine that they tend to occur together. You can see this in the chart, where the events sometimes occur separately, yet the relationship is still discovered.
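The idea of grouping events without requiring full overlap can be sketched as follows. This is not the product's Temporal Correlation algorithm, whose internals the article does not describe; it is a minimal stand-in that buckets timestamps into fixed windows and compares the buckets with a Jaccard score. The window size, the threshold and all function names are assumptions.

```python
def window_ids(timestamps, window_seconds=60):
    """Map each timestamp (in seconds) to the index of its fixed-size window."""
    return {int(t // window_seconds) for t in timestamps}


def co_occurrence_score(ts_a, ts_b, window_seconds=60):
    """Jaccard similarity: shared windows divided by all windows with either event."""
    a = window_ids(ts_a, window_seconds)
    b = window_ids(ts_b, window_seconds)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def tend_to_co_occur(ts_a, ts_b, threshold=0.7, window_seconds=60):
    """Group two events when the score reaches the (assumed) threshold."""
    return co_occurrence_score(ts_a, ts_b, window_seconds) >= threshold


# Hypothetical occurrence times in seconds. Event B fires once on its own (t=700).
event_a = [10, 130, 250, 370, 490]
event_b = [20, 140, 260, 380, 500, 700]

print(round(co_occurrence_score(event_a, event_b), 2))  # 0.83
print(tend_to_co_occur(event_a, event_b))               # True
```

Because the threshold is below 1.0, one unmatched occurrence does not break the grouping. Fixed windows are a simplification: two occurrences a few seconds apart can land in adjacent windows, so a real implementation would likely need a more tolerant notion of "together".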

Example 2: Metric anomaly

In this insight, we highlight that an anomaly has occurred because a metric, Number of Active Connections, “is now a flat line, where before it was varying.” The user can drill down to view the history of the metric over time, together with a baseline and a red zone indicating precisely where the anomaly is occurring. The user can see that the metric has occasionally had a value of zero in the past, but now it has stayed at zero for much longer than normal. This is a good indication that the service has been interrupted or stopped.
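The reasoning "zero has happened before, but never for this long" can be expressed as a simple run-length comparison. The sketch below is an assumption about how such a check might look, not the product's anomaly detector: it compares the current trailing run of zeros with the longest run seen in the history, using an arbitrary multiplier.

```python
def longest_run(values, target=0):
    """Length of the longest consecutive run of `target` in `values`."""
    best = current = 0
    for v in values:
        current = current + 1 if v == target else 0
        best = max(best, current)
    return best


def trailing_run(values, target=0):
    """Number of consecutive `target` values at the end of `values`."""
    count = 0
    for v in reversed(values):
        if v != target:
            break
        count += 1
    return count


def is_flatline_anomaly(history, recent, target=0, factor=3):
    """Flag when the current flat run is `factor` times longer than any past run.

    `factor` is an assumed tuning knob, not a documented product setting.
    """
    baseline = max(longest_run(history, target), 1)
    return trailing_run(recent, target) > baseline * factor


# Hypothetical samples of Number of Active Connections.
history = [5, 0, 7, 3, 0, 0, 4, 6, 0, 5]   # brief zeros are normal here
recent = [4, 0, 0, 0, 0, 0, 0, 0]          # now stuck at zero

print(is_flatline_anomaly(history, recent))  # True
```

Comparing against the metric's own history is what keeps the occasional zero from raising a false alarm.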

Example 3: Seasonal events

For our final example, we use AI to highlight when events are occurring at non-random times. Knowing that an event occurs with a certain regular frequency is a good indication that you might be fixing the same problem over and over again. Such a problem should be automated away, or its underlying cause addressed once and for all. Alternatively, the pattern might show that the event is just noise that experienced operators know to ignore, so it would be good to filter it out altogether. To build trust, the user can drill down to a view where we present simple, concise statements and easy-to-understand visualisations.

Knowing that this event seems to always occur on Fridays between 2pm and 3pm is good information. The user can also see that it is not occurring at any other time. Through explainable AI, the user can build trust that the AI is doing what is expected when it enriches other events in the same way.
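A statement such as "always occurs on Fridays between 2pm and 3pm" can be derived by counting occurrences per weekday and hour. The following is a minimal sketch of that counting step only; the function name and sample dates are hypothetical, and a real seasonality detector would also need to test whether the concentration is statistically unlikely to be random.

```python
from collections import Counter
from datetime import datetime


def dominant_slot(timestamps):
    """Return the busiest (weekday, hour) slot and its share of all occurrences.

    weekday follows datetime.weekday(): Monday is 0, Friday is 4.
    """
    if not timestamps:
        return None, 0.0
    counts = Counter((t.weekday(), t.hour) for t in timestamps)
    slot, hits = counts.most_common(1)[0]
    return slot, hits / len(timestamps)


# Hypothetical occurrences: four Fridays around 2pm, plus one stray Tuesday.
occurrences = [
    datetime(2024, 1, 5, 14, 10),
    datetime(2024, 1, 12, 14, 30),
    datetime(2024, 1, 19, 14, 45),
    datetime(2024, 1, 26, 14, 5),
    datetime(2024, 1, 9, 9, 0),
]

slot, share = dominant_slot(occurrences)
print(slot, share)  # (4, 14) 0.8
```

Reporting both the slot and its share mirrors the drill-down view: the user sees when the event occurs and how rarely it occurs at any other time.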

Summary

Why is trust so important? The primary goal of AI is to help make our lives better and more efficient. If you trust the AI, you will be more likely to put it to use. When you understand the AI and are confident that it is doing what you expect, you know your time is well spent taking action, investigating, triaging and automating the resolution: avoiding incidents, resolving them faster and resolving them automatically the next time they occur.

Let us help you build trust in IBM Cloud Pak® for Watson AIOps.
