In my last blog, I talked about some of the growing challenges facing operations teams looking to maximize the availability of services and applications while minimizing the cost of doing so.
Enter analytics and machine learning. Why would operations teams care about them?
Picture an incident first responder in an operations team – let’s call her Annette. She needs the most relevant events presented to her in meaningful context with as little noise as possible. She needs to be able to see the woods for the trees, so that she can resolve a problem indicated by an event as quickly as possible.
Now imagine Brock, a site reliability engineer with deep knowledge of an app, service or supporting technology. Brock may not have the time to author event reduction and correlation rules for Annette’s benefit. But the difference between getting, say, a single incident SMS per day and a handful of notifications could matter a great deal to him: it could decide whether he spends the day grepping log files and examining irrelevant events or successfully preparing for his team’s next app rollout. Brock will also want to know whether there are chronic problems in the managed environments that he is missing because he doesn’t have time to sift through the management data.
Neither Brock nor Annette is a data scientist. They don’t need to care about machine learning itself, but they do care about what such technologies can do for them. Netcool Operations Insight (NOI), built on Netcool/OMNIbus, introduces machine learning technology to give Brock and Annette a data scientist in a box that aims to help solve operations management problems.
How NOI delivers insights
With NOI’s related event analytics, Brock can enable correlations that group statistically related events into a single incident. For Annette, this means she can forgo dealing with dozens of apparently unrelated events. Instead, she sees just a handful of true incidents, each containing relevant context for identifying the underlying cause. It can also mean that Brock gets a single SMS for an incident, rather than many alerts throughout the day or in the middle of the night.
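To make the idea of "statistically related events" a little more concrete, here is a minimal Python sketch of one simple way to find events that tend to occur together. It is only an illustration and not NOI's actual algorithm: the function name `related_pairs`, the fixed time windows and the thresholds are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def related_pairs(events, window_s=300, min_support=3, min_confidence=0.8):
    """Find pairs of event keys that usually show up in the same time window.

    events: iterable of (timestamp_in_seconds, event_key) tuples.
    All names and thresholds here are hypothetical, chosen for illustration.
    """
    # Bucket events into fixed-size windows; each bucket holds a set of keys.
    buckets = {}
    for ts, key in events:
        buckets.setdefault(int(ts // window_s), set()).add(key)

    key_counts = Counter()   # number of windows in which each key appears
    pair_counts = Counter()  # number of windows in which both keys appear
    for keys in buckets.values():
        key_counts.update(keys)
        pair_counts.update(combinations(sorted(keys), 2))

    related = []
    for (a, b), together in pair_counts.items():
        # Co-occurrences divided by the window count of the more frequent key,
        # so a pair only qualifies if both events nearly always appear together.
        confidence = together / max(key_counts[a], key_counts[b])
        if together >= min_support and confidence >= min_confidence:
            related.append((a, b, together, round(confidence, 2)))
    return sorted(related)
```

Each returned tuple carries the number of windows in which the pair was seen together, so an operator could check the historical instances behind a proposed grouping rather than take it on faith.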
NOI’s seasonality analysis shows Brock events that occur in a predictable pattern in time, helping him identify and remedy persistent problems in the managed infrastructure. When they go unidentified, chronic problems can lead to significant hidden costs from replugging the same hole time and time again.
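As a rough illustration of what "a predictable pattern in time" can look like in data, the Python sketch below checks whether most occurrences of an event fall in the same hour of the day. It is a simplified stand-in, not NOI's seasonality analysis: the function name `hourly_seasonality`, the hour-of-day granularity and both thresholds are assumptions for this example.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_seasonality(timestamps, min_occurrences=5, min_share=0.7):
    """Return (peak_hour, peak_count, total) if one hour of day dominates.

    timestamps: Unix timestamps (seconds) of one recurring event.
    Returns None when there is too little history or no dominant hour.
    Names and thresholds are hypothetical, chosen for illustration.
    """
    total = len(timestamps)
    if total < min_occurrences:
        return None  # not enough history to call anything a pattern

    # Count how many occurrences fall into each hour of the day (UTC).
    hours = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    peak_hour, peak_count = hours.most_common(1)[0]

    # Flag the event only if a large share of occurrences land in that hour.
    if peak_count / total >= min_share:
        return peak_hour, peak_count, total
    return None
```

For example, an event that fired at 02:15 UTC on nine nights out of ten recorded occurrences would be returned as `(2, 9, 10)`, which points Brock at whatever runs at that hour, such as a nightly batch job.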
NOI’s event and log search analysis provides Annette and Brock with critical contextual data from informational events and log files. For example, NOI can help identify an out-of-band configuration change as the root cause of a cascade of symptomatic events, all from the Netcool event console. Brock can look for hotspots for his application or service—say, an unreliable software module or a consistently faulty hardware model.
So why should Annette and Brock trust the output of these analytics? Neither is a data scientist. In developing these capabilities, IBM development teams have ensured that, when an analytically derived insight is produced, the software can produce supporting evidence that the insight is valid. How do I know these events are correctly grouped? How do I know this event is chronic? Because the software can show me the event instances in history that support this conclusion. Annette and Brock don’t need a PhD in artificial intelligence to develop trust in the system and see that the generated insights are valid.
Adding cognitive and machine learning capabilities to Netcool helps the IT operations organization effectively deal with dramatically increasing numbers of events from highly complex hybrid environments. Along with integrated offerings such as Predictive Insights, Agile Service Manager and Runbook Automation, NOI helps companies to move into the area of cognitive automation. This means transforming IT operations from a people-led and technology-assisted approach to one that is technology-led and people-assisted.
To learn more, register for our webinar on the value predictive insight brings to IT operations. Check out the earlier posts in our IBM Operations Analytics blog series. And stay tuned for additional key learnings from our colleagues in coming weeks.