Do you have a self-healing cloud?

Recently, a business analyst shared her organization’s experience replacing its Human Resources (HR) application. The company is global, with approximately 6,000 employees in offices in North America, Europe, and Asia Pacific. For many years, the HR application and its servers had been supported internally by the IT department. Within the past year, however, the company decided to move to a cloud-based HR application consumed as software as a service (SaaS). When I asked her how it was going, she responded with three valuable lessons.

The first lesson is to ensure that “end-to-end” service levels are understood and can be met. She evaded the question when I asked, “Who was the ultimate decision maker to move to the cloud – HR or IT?” I concluded that the decision had not been made collaboratively and, judging by her gestures, that IT was not totally in favor of it. In any event, she went on to tell me that the availability of the cloud-based HR application was awful, with frequent long outages. From the way she described it, it sounded like HR was blaming IT.

She went on to tell me that the “Cloud HR application was self-healing.”

I was amused by her statement and asked her to explain it. As it turns out, “self-healing” was considered to be very bad. Outages regularly lasted longer than 60 minutes. When users were unable to access the HR application, they were instructed to call the help desk. The help desk service is very basic – open a ticket and follow the script.

The first step in the script was to call the infrastructure team to make sure there were no internal network problems. The second step was to call the application support team. The third step was for the application support team to call the cloud HR application provider. The incident resolution process was so convoluted that before the cause could be identified, the failing component had already fixed itself – without any known human intervention! When moving to cloud-based SaaS, make sure that you understand how the new service will work “end-to-end” and set realistic service level agreements (SLAs).
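To make the “end-to-end” point concrete, here is a minimal sketch of how availability compounds when a user’s request has to pass through several components in series. The component names and availability figures are illustrative assumptions, not numbers from this organization, and the calculation assumes the components fail independently.

```python
from math import prod

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def end_to_end_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes failures are independent, so the availabilities multiply.
    """
    return prod(component_availabilities)


def monthly_downtime_minutes(availability):
    """Expected minutes of downtime in a 30-day month."""
    return (1 - availability) * MINUTES_PER_30_DAY_MONTH


# Hypothetical figures, chosen only for illustration.
chain = {
    "internal network": 0.999,
    "internet link": 0.995,
    "cloud HR application": 0.995,
}

overall = end_to_end_availability(chain.values())
print(f"End-to-end availability: {overall:.4%}")
print(f"Expected downtime per 30-day month: {monthly_downtime_minutes(overall):.0f} minutes")
```

Under these assumptions, the end-to-end figure is lower than that of any single component, which is why the provider’s SLA alone does not describe what users will actually experience.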

The second lesson is to ensure that the organization is prepared for changes in the way people work. In the good old days, IT provided user application training whenever a new application was being rolled out. IT set and met the rollout schedule. As part of the move to the cloud, she was working on training manuals for the new HR application. At this point, she mentioned that the cloud-based HR application was architected so that a number of customers share the same application code. That was the root of the problem. The SaaS provider set the code update schedule for all of its customers, and new code changed the way the application worked. These changes were frequent and were implemented before she had a chance to update the training manuals and train the users. The result – an incredibly frustrated management team, from first-level managers all the way up to the CEO. I don’t know what to say about the situation other than this: make sure you understand how people work today and how they will work tomorrow, and make sure you have a detailed transition plan that smooths over the bumps.

The third lesson is to have clearly documented roles and responsibilities. One of HR’s responsibilities is to advertise vacant positions. In the “web world,” that often means posting the position externally on a website that matches job seekers with hiring organizations. She was almost in tears when she explained that job postings did not always appear. And when they didn’t, nobody was accountable for fixing it. How crazy is that? I’d heard enough and thought it was time to change the subject…

I hope you will learn from this organization’s experience and plan for your “self-healing cloud” – with one difference: may yours operate flawlessly and delight your users.

Distinguished Engineer and Senior Cloud Advisor
