Accelerating research and innovation
How NCHC uses AIOps to improve public network services and proactively prevent outages

The speed of research matters. During the COVID-19 pandemic, it’s been the difference between life and death for millions.

In Taiwan, where the pandemic response has been exceptionally effective at limiting outbreaks and death, the National Center for High-performance Computing (NCHC) helps accelerate research and innovation nationwide by providing access to supercomputers and analytics and by facilitating nationwide networks for data sharing and collaboration.

Although NCHC supports research in all disciplines, the urgency of the pandemic inspired it to launch successive “Tech v Virus” programs, which call for universities, research organizations, enterprises and startups to find new ways to fight the spread of the SARS-CoV-2 coronavirus. One high-profile breakthrough so far is a stethoscope that visualizes a patient’s breathing, which helps doctors and nurses reduce close contact with potentially infected patients and so lowers the risk of transmission. Another is a map of the virus’s genetic evolution, which helps predict routes of spread.

To support efforts like these, and hundreds of others in all fields, NCHC wants to ensure that research moves as fast as it can. That’s why it continues evolving its Taiwania series of supercomputers, which includes one of the 50 most powerful computers in the world. That’s why it provides AI services—including tools based on IBM Cloud Pak® for Data. And that’s why NCHC recently worked with the IBM Garage™ to implement the IBM Cloud Pak® for Watson AIOps solution, applying AI-based automation to maximize resilience and performance.

  • Reduced MTTD: mean time to detect (MTTD) reduced by 55% for service-impacting issues
  • Predictability: identifies potential outages 25 hours earlier than before

Cutting through IT Ops complexity

Taiwan has several major public computing networks that crisscross the country and allow researchers to share information and collaborate. Some of the networks are specialized for academia, some for government and some for industry. But increasingly—especially in response to the COVID-19 pandemic—research initiatives have demanded cross-discipline efforts and cross-network collaboration. Fast information sharing between the public networks is crucial.

So NCHC began a new initiative: building a central network exchange. But bringing the networks together presented a new layer of challenges. The different networks were equipped with a disparate array of monitoring tools, log sources and log formats. This complexity made management harder and kept NCHC from quickly filtering alarms to detect significant issues and prevent outages. Outages, in turn, would impede data sharing and collaboration across the networks.

To fulfill the purpose of the central exchange—accelerating nationwide research collaboration—NCHC needed a way to cut through the complexity of IT operations management. It turned to AIOps.

Predictive maintenance with AIOps

As part of its search for a solution, NCHC worked with the IBM Garage to run a proof of concept (POC) based on IBM Cloud Pak for Watson AIOps software.

The goal of the POC was to gauge the real-world impact of the potential solution. NCHC provided operations data and networking log data from real-life scenarios, for example cases in which networking equipment was breaking down and would cause outages.

The NCHC and IBM teams then used IBM Cloud Pak for Watson AIOps as a central integrator of the network exchange’s diverse array of IT operations tools, producing a holistic view of the entire infrastructure. And by feeding structured and unstructured data into the solution’s AI Manager component, NCHC and the IBM Garage team were able to train AI models to automatically, and proactively, manage problems and incidents.
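One way to picture the "central integrator" role is as a normalization step that maps differently formatted logs onto one shared record shape. The sketch below is an assumption for illustration only: the two input formats, the field names and the helper functions are hypothetical, and it does not describe how IBM Cloud Pak for Watson AIOps ingests data.

```python
import json
import re
from typing import Dict

# Hypothetical plain-text format: "<timestamp> <host> <SEVERITY>: <message>"
TEXT_LINE = re.compile(r"^(?P<ts>\S+) (?P<host>\S+) (?P<severity>[A-Z]+): (?P<msg>.*)$")


def from_json_line(line: str) -> Dict[str, str]:
    """Map a hypothetical JSON log record onto the shared schema."""
    raw = json.loads(line)
    return {
        "timestamp": raw["time"],
        "device": raw["node"],
        "severity": raw["level"].upper(),
        "message": raw["text"],
    }


def from_text_line(line: str) -> Dict[str, str]:
    """Map a hypothetical plain-text log line onto the shared schema."""
    match = TEXT_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    return {
        "timestamp": match["ts"],
        "device": match["host"],
        "severity": match["severity"],
        "message": match["msg"],
    }


# Made-up sample lines from two imaginary monitoring tools.
record_a = from_json_line(
    '{"time": "2022-01-01T00:00:00Z", "node": "switch-1", "level": "warn", "text": "link flap"}'
)
record_b = from_text_line("2022-01-01T00:00:05Z router-2 ERROR: fan failure")
```

Once both sources share the same keys, downstream filtering and model training can treat them uniformly, which is the practical point of a single holistic view.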

The results were excellent. The teams cut the mean time to detect (MTTD) service-affecting issues by 55%.
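For readers unfamiliar with the metric, MTTD is the average gap between the moment an issue starts and the moment it is detected. The sketch below shows how a percentage reduction of this kind could be computed. The `Incident` type, the function names and the sample timestamps are all hypothetical and are not NCHC data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Sequence


@dataclass(frozen=True)
class Incident:
    started_at: datetime   # when the fault actually began
    detected_at: datetime  # when operations staff were alerted


def mean_time_to_detect(incidents: Sequence[Incident]) -> timedelta:
    """Average of (detected_at - started_at) over all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((i.detected_at - i.started_at for i in incidents), timedelta())
    return total / len(incidents)


def percent_reduction(before: timedelta, after: timedelta) -> float:
    """Relative drop from `before` to `after`, expressed in percent."""
    return (before - after) / before * 100.0


# Made-up incidents: detection delays of 40 and 60 minutes before,
# and 20 and 30 minutes after.
t0 = datetime(2022, 1, 1)
baseline = [
    Incident(t0, t0 + timedelta(minutes=40)),
    Incident(t0, t0 + timedelta(minutes=60)),
]
improved = [
    Incident(t0, t0 + timedelta(minutes=20)),
    Incident(t0, t0 + timedelta(minutes=30)),
]

mttd_before = mean_time_to_detect(baseline)
mttd_after = mean_time_to_detect(improved)
reduction = percent_reduction(mttd_before, mttd_after)
```

With these made-up numbers the baseline MTTD is 50 minutes and the improved MTTD is 25 minutes, so the reduction is 50%.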

Based on the success of the POC, NCHC and the IBM® Customer Success Manager team deployed IBM Cloud Pak for Watson AIOps into the exchange center production environment. NCHC now uses the following components of IBM Cloud Pak for Watson AIOps:

  • AI Manager: to ingest structured and unstructured data and train AI models to proactively manage problems and incidents. All alerts generated by AI Manager are published as a story in a ChatOps interface that NCHC staff use as the single source of truth for monitoring the exchange center.
  • Event Manager: to import all network device logs via a pre-defined batch program and to reduce event noise through event grouping, which is expected to reduce operational costs significantly.
  • Metric Manager: to ingest all network device metric data, such as CPU, memory and disk usage, and provide a holistic view of device statuses.
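Event grouping of the kind mentioned for Event Manager can be pictured with a simple time-window heuristic: events from the same device that arrive close together are collapsed into one group, so operators review a few groups instead of many raw events. The sketch below is an assumption for illustration, not the algorithm the product uses. The `Event` type, the `group_events` function and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    device: str
    timestamp: datetime
    message: str


def group_events(events: List[Event], window: timedelta) -> List[List[Event]]:
    """Group events per device. An event joins its device's latest group
    when it arrives within `window` of that group's most recent event;
    otherwise it starts a new group."""
    groups: List[List[Event]] = []
    latest_group: Dict[str, List[Event]] = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        current = latest_group.get(event.device)
        if current is not None and event.timestamp - current[-1].timestamp <= window:
            current.append(event)
        else:
            current = [event]
            groups.append(current)
            latest_group[event.device] = current
    return groups


# Made-up events: a burst on router-a, one event on router-b,
# and a much later event on router-a.
t0 = datetime(2022, 1, 1)
sample = [
    Event("router-a", t0, "link down"),
    Event("router-b", t0 + timedelta(seconds=10), "high CPU"),
    Event("router-a", t0 + timedelta(seconds=30), "link down"),
    Event("router-a", t0 + timedelta(seconds=70), "link up"),
    Event("router-a", t0 + timedelta(minutes=10), "link down"),
]
groups = group_events(sample, window=timedelta(seconds=60))
```

Here five raw events collapse into three groups. A production system would typically also correlate across devices and topology, which this sketch deliberately leaves out.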

Driving ongoing discovery and innovation

The MTTD reduction means that NCHC can detect potential outages 25 hours earlier than it could before, which helps NCHC see and resolve the underlying problems before an outage occurs.

So far, these impressive results have come in response to common, known problems. NCHC knows that unique, unexpected issues will arise and provide new tests for the solution, but the organization expects similar results. Ultimately, NCHC expects that its adoption of AIOps will help keep information channels open so that research projects across Taiwan have the critical data they need to keep making progress toward discovery and innovation.

About National Center for High-performance Computing (NCHC)

With the mission of promoting scientific discovery and technological innovation, Taiwan’s NCHC provides the country’s government agencies, higher education institutions and industries with supercomputing services, high-quality networking, high-efficiency storage, big data analysis and scientific engineering simulations. NCHC is headquartered in Hsinchu City.

Take the next step

To learn more about the IBM solutions featured in this story, please contact your IBM representative or IBM Business Partner.

Legal

© Copyright IBM Corporation 2022. IBM Corporation, New Orchard Road, Armonk, NY 10504

Produced in the United States of America, March 2022.

IBM, the IBM logo, ibm.com, IBM Cloud Pak, and IBM Garage are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at ibm.com/legal/copyright-trademark.

This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates.

The performance data and client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions. THE INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.