AIOps – reducing the cost of downtime

For too long, data has been held captive within our systems of record: isolated by the rigidity of platform, application and workload choices, and segregated by business line, business function, data type or initial usage.

The result is splintered views of segmented data that are difficult to access as a whole and almost impossible to draw true analytical insight from.

And that only describes today’s snapshot and current models. The challenges are compounded as businesses look to change, grow, iterate on practices, innovate, or disrupt markets. Attempts at data science, machine learning, and deep learning are made moot by the fact that insights are only as good as the access to supporting data, which again is too fragmented to provide full value.

To change this paradigm, a hybrid data management strategy should include the following elements:

  • Access to all data regardless of source or type
  • The flexibility to support changing workloads and consumption cases
  • Intelligent analytics, such as machine learning, at the data source
  • Access to insights across the business and its functions, for all users, to support better decision making
  • An evolution of IT Service Management in the shape of AIOps

So what is AIOps? Many organisations offer AIOps solutions and services, so the answer to the “what is” question isn’t as definitive as you might think. For me, this best describes AIOps:

  1. The ability to bring together structured and unstructured data from a multitude of different repositories, applications and services to find the hidden gems that can add value to both the business and IT Operations.
  2. The ability to train a machine learning solution so that, in future, issues are remediated automatically before they happen, because the tell-tale signs of something starting to fail have already been identified.

There are many areas where AIOps will evolve today’s IT Operations, but I wanted to look at one area that will make the business sit up and listen to the IT department – downtime!

Planned or unplanned, downtime = money. That means lost revenue, plus knock-on effects such as customer dissatisfaction, lost market share and possibly brand damage.

The following is an all too common occurrence – IT Operations get a notification that there is a problem. In some cases it can take them almost 5 hours and 17 separate steps across 4 different tools to diagnose the issue, with approximately 10 people involved in solving the incident. According to Aberdeen’s industry report “The (rising) cost of downtime”, an average incident can cost $260k per hour, and other incidents, well reported in the press, have cost much, much more than this.

AIOps looks at all these different siloed data channels in real time, looking for important signals across structured and unstructured data types.

It groups events together based on spatial and temporal reasoning as well as similarity to past situations and synthesizes a holistic incident report.
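
To make the grouping idea concrete, here is a minimal sketch in Python. It is not how any particular AIOps product works: the `Event` fields, the `related` dependency map and the five-minute window are all assumptions made for illustration. It covers only the temporal part (events close together in time) and a very simple stand-in for the spatial part (events from the same service, or from services that depend on each other). Similarity to past situations is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    timestamp: datetime
    service: str   # hypothetical field: the service that raised the event
    message: str


def group_events(events, related, window=timedelta(minutes=5)):
    """Group events that are close in time and come from related services.

    `related` maps a service name to the set of services it depends on.
    It is a hypothetical stand-in for a real topology model.
    """
    groups = []
    for event in sorted(events, key=lambda e: e.timestamp):
        for group in groups:
            # Temporal check: compare with the most recent event in the group.
            close_in_time = event.timestamp - group[-1].timestamp <= window
            # "Spatial" check: same service, or linked in either direction.
            close_in_space = any(
                event.service == other.service
                or other.service in related.get(event.service, set())
                or event.service in related.get(other.service, set())
                for other in group
            )
            if close_in_time and close_in_space:
                group.append(event)
                break
        else:
            # No existing group matched, so this event starts a new one.
            groups.append([event])
    return groups
```

Each resulting group is a candidate incident: a database alert followed a minute later by timeouts in a service that depends on that database would land in one group, while an unrelated alert in the same window would not.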

That report is surfaced automatically in ChatOps (Slack, Zoom etc) to give IT Operations the key information as soon as it’s available:

  1. there’s a problem that IT Operations need to pay attention to.
  2. a pointer to where the problem is and other services that might be affected.
  3. evidence and advice to diagnose and resolve the situation.

With AIOps, this same workflow can take less than 15 minutes and almost all the work happens within ChatOps – IT Operations don’t have to jump from tool to tool losing time with context switching and don’t need to be in the same incident room!
The result – costs are radically reduced and instead of 10 people working on this for hours, one or two IT staff can confidently do their job with the information they need, delivered by the AIOps algorithms.
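
As a rough back-of-the-envelope check on those numbers, the short Python sketch below multiplies the figures quoted in this post. It assumes, purely for illustration, that the service is down for the whole time it takes to resolve the incident and that cost grows linearly with time. Since the post says “almost 5 hours” and “less than 15 minutes”, both results are upper bounds.

```python
COST_PER_HOUR = 260_000  # USD, the average figure quoted in this post


def downtime_cost(minutes, cost_per_hour=COST_PER_HOUR):
    """Cost of an outage, assuming cost scales linearly with duration."""
    return cost_per_hour * minutes / 60


manual = downtime_cost(5 * 60)  # "almost 5 hours" across 4 tools
with_aiops = downtime_cost(15)  # "less than 15 minutes" in ChatOps

print(f"Manual workflow: ${manual:,.0f}")               # $1,300,000
print(f"AIOps workflow:  ${with_aiops:,.0f}")           # $65,000
print(f"Difference:      ${manual - with_aiops:,.0f}")  # $1,235,000
```

The people cost (10 staff for hours versus one or two for minutes) is not included here, so the real gap would be wider still.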

In summary – AIOps correlates disparate data across your environments and applications to derive hidden insights and help you identify incident root causes faster. Insights and recommendations are fed directly into your existing workflows, eliminating the need for multiple dashboards, so you can rapidly resolve IT incidents.

If you want to learn more about IBM AIOps, join us at London Tech Week, where Alex Signoret will be running a session on how AI for IT improves business outcomes, increases revenue and lowers both cost and risk for organisations.


Red Hat Synergy Team (AM&I Cloud Paks)
