IBM Systems Lab Services

Building a culture of high availability


What customer doesn’t expect the applications they use to work all the time?

What business doesn’t want high availability from its IT systems?

In today’s world, both customers and businesses have high expectations. Customers want to bank and shop when it’s convenient for them, and business leaders want the systems providing these services to be always available. But delivering high availability — even with the latest advancements in technology — can remain elusive, because technology alone can’t provide it.

Yes, the mean time between failures on some IT components can now be measured in decades, but we should never forget the adage that “eventually all hardware will break, and eventually all software will work.” Businesses invest in reliable technology, but some stop there. Their technology reliability expectation becomes their availability plan. We come to depend on those services, expecting them to always be there, so what happens when something does break? Is your business ready for that?
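To see why reliable components alone are not an availability plan, it can help to run the numbers. The sketch below uses the standard steady-state formula, availability = MTBF / (MTBF + MTTR), and assumes a service that needs every one of its components to be up. The 20-year MTBF, the 8-hour repair time and the count of 50 components are hypothetical figures chosen for illustration, not measurements.

```python
HOURS_PER_YEAR = 24 * 365


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one component: uptime / (uptime + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_availability(components: list[float]) -> float:
    """Availability of a service that is up only when every component is up."""
    result = 1.0
    for a in components:
        result *= a
    return result


# Hypothetical component: fails once every 20 years on average, 8 hours to restore.
single = availability(mtbf_hours=20 * HOURS_PER_YEAR, mttr_hours=8)

# Hypothetical service that depends on 50 such components with no redundancy.
service = series_availability([single] * 50)

print(f"one component: {single:.6f}")
print(f"50 in series:  {service:.6f}")
print(f"expected downtime per year: {(1 - service) * HOURS_PER_YEAR:.1f} hours")
```

Under these assumptions, each component looks nearly perfect on its own, yet the service as a whole can expect roughly 20 hours of downtime a year. Redundancy and faster recovery, not just better parts, are what move that number.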

If simply buying better technology isn’t the answer, what is? It’s simple, really: assume everything can fail, and some of it will. Create an IT culture in which everyone keeps asking (and answering) these questions:

  • What can I do to minimize failures?
  • What will happen if something does fail?
  • How do I minimize the impact when it fails?

High availability requires the right recipe of technology, people and processes built around a culture that not only supports high availability but strives for it. Without this, IT organizations will forever be putting Band-Aids on their outages.

How to create a culture of high availability

A culture of high availability has to start at the top of the business, which requires organizational objectives that support the goal of achieving near-zero downtime. It means everyone in IT is driven toward that goal, across architecture design, application development, system administration, operations and so on. An application team that’s driven only to roll out new features won’t be focused on exploiting the resiliency in their technology. A system administration team that’s not given maintenance windows won’t be able to keep firmware current. An operations team that only monitors components may not find broken services until the customer calls.

This doesn’t mean everyone in IT owns service availability. There should be a role within the IT organization for that. That owner needs to be proactively focused on building, maintaining and delivering highly available services.

Organizations must make the proper investment to achieve near-zero downtime. Not all applications are created equal, and they don’t necessarily require the same investment, but the business and IT must have a clear understanding of what value each service brings and how to invest in each to deliver what’s expected. Does the business side of your company understand the cost of downtime for the services they deem critical? If not, how can they be sure they’ve made the right investment? The services that the business expects to be “always on” require the right investment.
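One way to make the cost-of-downtime conversation concrete is to translate each service’s availability target into hours of allowed downtime per year and then into money. The sketch below does that arithmetic. The service names, availability targets and hourly cost figures are hypothetical placeholders; the real values have to come from the business.

```python
HOURS_PER_YEAR = 24 * 365


def downtime_budget_hours(availability_target: float) -> float:
    """Hours per year a service may be down and still meet its target."""
    return (1.0 - availability_target) * HOURS_PER_YEAR


def annual_downtime_cost(availability_target: float, cost_per_hour: float) -> float:
    """Cost per year if the service uses its whole downtime budget."""
    return downtime_budget_hours(availability_target) * cost_per_hour


# Hypothetical services: (availability target, cost of one hour of downtime).
services = {
    "online banking": (0.9999, 100_000.0),
    "internal reporting": (0.99, 2_000.0),
}

for name, (target, cost_per_hour) in services.items():
    hours = downtime_budget_hours(target)
    cost = annual_downtime_cost(target, cost_per_hour)
    print(f"{name}: {hours:.2f} h/year allowed, about {cost:,.0f} per year at risk")
```

A 99.9 percent target, for example, allows about 8.76 hours of downtime a year. Putting a cost next to that number lets the business judge whether the investment in each service matches what it expects from it.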

Continuous improvement

A strong service management framework can help set the stage for continuous improvement; however, truly achieving it will depend on the culture of the organization. Does your organization have objectives that lead to continuous improvement? Do those targets change over time? Are all failures (even those that don’t trigger a service outage) inspected to see what can be improved? Does your business have the right metrics in place to provide a warning when things are headed in the wrong direction before an outage occurs?

Experience has shown that many IT outages can be traced back to process errors. In some cases, process errors account for as much as 50 percent of outages. Ignoring, or failing to fix, the process errors or gaps you have experienced all but guarantees they will recur.

Although there are several service management frameworks, a common one seen in IT today is the Information Technology Infrastructure Library (ITIL). While ITIL may not provide all the answers, adopting a particular service management framework with proper education and strong management support can help speed the creation of the right culture for achieving high availability.

Where to start

One way you can assess where your business stands is to use an independent team to review and assess your technology landscape and service management framework. This can help you identify gaps and single points of failure and determine what actions will close those gaps.

The High Availability Center of Competency (HACoC) in IBM Systems Lab Services is a team built exactly for this purpose. To contact us, please send us an email.
