AIOps – reducing the cost of downtime

For too long, data has been held captive within our systems of record: isolated by the rigidity of platform, application and workload choices, and segregated by business line, business function, data type or initial usage.

The result is splintered views of segmented data that are difficult to access as a whole and impossible to draw true analytical insight from.

Even this only describes today’s snapshot and current models. The challenges are compounded as businesses look to change, grow, iterate practices, innovate, or disrupt markets. Attempts at data science, machine learning, and deep learning are made moot by the fact that insights are only as good as the access to supporting data, which again is too fragmented to provide full value.

To change this paradigm, a hybrid data management strategy should contain the following elements:

  • Access to all data regardless of source or type
  • The flexibility to support changing workloads and consumption cases
  • Intelligent analytics, such as machine learning, at the data source
  • Access to insights across the business and its functions, for all users, to support better decision making
  • An evolution of IT Service Management in the shape of AIOps

So what is AIOps? There are many organisations offering AIOps solutions and services, so the answer to the “what is” question isn’t as definitive as you may think. For me, this best describes AIOps:

  1. The ability to bring together structured and unstructured data from a multitude of different repositories, applications and services to find the hidden gems that can add value to both the business and IT Operations.
  2. The ability to train a machine learning solution so that, in future, issues are remediated automatically before they happen, because the tell-tale signs of something starting to fail have already been identified.
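
To make the idea of spotting tell-tale signs a little more concrete, here is a minimal sketch. It is not a trained machine learning model and not how any particular AIOps product works: it swaps in a simple rolling z-score check as a stand-in, and the function name, the window size and the threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev


def early_warnings(samples, baseline_size=5, threshold=3.0):
    """Return the indexes of samples that deviate strongly from recent history.

    Hypothetical helper: each sample is compared with the `baseline_size`
    samples before it, and flagged when it sits at least `threshold`
    standard deviations away from their mean.
    """
    flagged = []
    for i in range(baseline_size, len(samples)):
        baseline = samples[i - baseline_size:i]
        spread = stdev(baseline)
        if spread == 0:
            # A perfectly flat baseline gives no scale to measure against.
            continue
        z_score = (samples[i] - mean(baseline)) / spread
        if abs(z_score) >= threshold:
            flagged.append(i)
    return flagged


# Made-up response times in milliseconds, with one obvious spike.
response_times_ms = [100, 102, 99, 101, 100, 101, 180, 100]
warnings = early_warnings(response_times_ms)
```

In a real deployment the flagged index would be the trigger for an automated remediation step; what that step is depends entirely on the environment.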

There are many areas where AIOps will evolve today’s IT Operations, but I wanted to look at one area that will make the business sit up and listen to the IT department: downtime!

Planned or unplanned, downtime = money. That means lost revenue, along with knock-on effects such as customer dissatisfaction, lost market share and possibly brand damage.

The following is an all too common occurrence: IT Operations get a notification that there is a problem. In some cases it can take them almost 5 hours and 17 separate steps across 4 different tools to diagnose the issue, with approximately 10 people involved in solving the incident. According to the Aberdeen industry report “The (rising) cost of downtime”, an average incident can cost $260k per hour, and others that have been well reported in the press have cost much, much more than this.

AIOps looks at all these different siloed data channels in real time, looking for important signals across structured and unstructured data types.

It groups events together based on spatial and temporal reasoning as well as similarity to past situations and synthesizes a holistic incident report.
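
As a rough illustration of grouping by temporal and spatial reasoning, the sketch below puts events into the same candidate incident when they are close in time and come from related services. Everything in it is assumed for the example: the `Event` fields, the `neighbours` map standing in for service topology, and the five-minute window. It leaves out the similarity-to-past-situations part altogether.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds, any consistent clock
    service: str
    message: str


def group_events(events, neighbours, window_seconds=300.0):
    """Group events into candidate incidents (hypothetical helper).

    An event joins an existing group when it arrives within
    `window_seconds` of that group's latest event (temporal) and its
    service matches, or is a declared neighbour of, a service already
    in the group (spatial). Otherwise it starts a new group.
    """
    groups = []
    for event in sorted(events, key=lambda e: e.timestamp):
        for group in groups:
            close_in_time = event.timestamp - group[-1].timestamp <= window_seconds
            related = any(
                event.service == other.service
                or event.service in neighbours.get(other.service, set())
                or other.service in neighbours.get(event.service, set())
                for other in group
            )
            if close_in_time and related:
                group.append(event)
                break
        else:
            # No existing group matched, so this event opens a new one.
            groups.append([event])
    return groups


# Made-up events and topology: the API depends on the database.
events = [
    Event(0, "db", "connection pool exhausted"),
    Event(60, "api", "timeouts calling db"),
    Event(90, "mail", "queue length high"),
    Event(1000, "db", "slow query"),
]
incidents = group_events(events, neighbours={"api": {"db"}})
```

Here the first two events end up in one candidate incident, while the mail alert and the much later database event each stand alone.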

That report is surfaced automatically in ChatOps (Slack, Zoom etc) to give IT Operations key information as soon as it’s available:

  1. there’s a problem that IT Operations need to pay attention to.
  2. a pointer to where the problem is and other services that might be affected.
  3. evidence and advice to diagnose and resolve the situation.
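
The three parts of that report could be laid out as a single chat message along the lines of the sketch below. It only builds the text; it does not call Slack, Zoom or any other ChatOps API, and the function name, parameters and example values are all hypothetical.

```python
def build_incident_message(summary, location, affected_services, evidence, advice):
    """Format the three parts of an incident report as plain chat text."""
    lines = [
        f"1. Problem: {summary}",
        f"2. Location: {location}",
        "   Possibly affected: " + (", ".join(affected_services) or "none identified"),
        "3. Evidence and advice:",
    ]
    lines += [f"   - evidence: {item}" for item in evidence]
    lines += [f"   - advice: {item}" for item in advice]
    return "\n".join(lines)


# Made-up incident, purely for illustration.
message = build_incident_message(
    summary="Checkout requests are failing",
    location="payments database",
    affected_services=["checkout", "order history"],
    evidence=["connection pool exhausted", "timeouts calling db"],
    advice=["restart the connection pool", "check recent config changes"],
)
```

Posting the resulting text into a channel would then be a job for whichever chat tool the team already uses.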

With AIOps, this same workflow can take less than 15 minutes and almost all the work happens within ChatOps – IT Operations don’t have to jump from tool to tool losing time with context switching and don’t need to be in the same incident room!
The result – costs are radically reduced and instead of 10 people working on this for hours, one or two IT staff can confidently do their job with the information they need, delivered by the AIOps algorithms.
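
To put rough numbers on that, the sketch below reuses the figures quoted above: $260k per hour, almost 5 hours and about 10 people for the manual workflow, and less than 15 minutes and two people with AIOps. It assumes cost scales linearly with duration, which is a simplification, and it treats the quoted durations as upper bounds.

```python
COST_PER_HOUR_USD = 260_000  # average figure quoted from the Aberdeen report


def downtime_cost(hours, cost_per_hour=COST_PER_HOUR_USD):
    """Estimated cost of an incident, assuming cost grows linearly with time."""
    return hours * cost_per_hour


def person_hours(people, hours):
    """Total staff time spent on an incident."""
    return people * hours


manual_cost = downtime_cost(5)        # almost 5 hours, treated as 5
assisted_cost = downtime_cost(0.25)   # less than 15 minutes, treated as 15
saving = manual_cost - assisted_cost

manual_effort = person_hours(10, 5)       # roughly 10 people for 5 hours
assisted_effort = person_hours(2, 0.25)   # two people for 15 minutes

print(f"Manual:   ${manual_cost:,.0f}, {manual_effort} person-hours")
print(f"Assisted: ${assisted_cost:,.0f}, {assisted_effort} person-hours")
print(f"Saving:   ${saving:,.0f}")
```

On these assumptions a single incident drops from around $1.3M to around $65k, and from 50 person-hours to half a person-hour.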

In summary, AIOps correlates disparate data across your environments and applications to derive hidden insights and help you identify incident root causes faster. Insights and recommendations are fed directly into your existing workflows, eliminating the need for multiple dashboards, so you can rapidly resolve IT incidents.

If you want to learn more about IBM AIOps then join us at London Tech Week where Alex Signoret will be running a session on how AI for IT improves business outcomes, leads to increased revenue and lowers both cost and risk for organisations.


Red Hat Synergy Team (AM&I Cloud Paks)
