The drawback of easy management interfaces - or why the cloud is inevitable
seb_
In the last couple of years, besides the buzzwords "cloud", "big data" and "VAAI", another topic has played a big role in every discussion about storage products: "easy management". In most cases it means an intuitive and catchy graphical user interface that would allow even children to manage a storage array - if you believe marketing. Along with that goes the integration of storage management tasks into the GUI of the servers themselves and, of course, the automation of these tasks. If the highly skilled server and storage administrators no longer have to invest their time in disproportionately laborious routine tasks, they can focus on more advanced projects.
But many companies are still fighting the impact of the financial crisis. As a result, vacant posts get dropped and teams get consolidated and cut down. The CIOs want to see the synergy effects in numbers and decreasing headcounts. Formerly specialized experts have to cope with more and more different systems. Less time, more work, less education, more stress, less productivity, more trouble - a downward spiral. On top of that, classical admin work is offshored or outtasked to operating and monitoring teams with no more than broad, general skills.
In technical support I see the effect of that "evolution" in the problem descriptions of current cases: "We see SCSI messages in the host." Or even just "We see messages. Could be the SAN." Administrators with an ITIL Foundation certificate but no clue about what a Read(10) is are suddenly confronted with a host running amok and only a few obscure, terse messages about its storage in the logs. To ensure a quick resolution of the problem, priority 1 would be to know what these messages actually mean. Often they are just forwarded from the device driver and there is no good documentation available that explains them properly. Or there is just something like "blabla ...then go to your service provider", without even mentioning which one that would be out of the broad bouquet that someone with a heterogeneous infrastructure might have. If the admin lacks a fundamental understanding of storage concepts and protocols, he will not be able to get any meaningful information out of that, and has to "randomly" pick a support organization for one of the involved machines.
The result: Long & critical outages.
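For anyone who has never looked inside one of those messages: a Read(10) is simply a 10-byte SCSI command that asks a device for a number of blocks starting at a given logical block address. The following is a minimal sketch of how such a command descriptor block could be decoded. The function name parse_read10 and the returned dictionary keys are my own choices for illustration, not part of any driver or vendor tool.

```python
import struct


def parse_read10(cdb: bytes) -> dict:
    """Decode a SCSI READ(10) command descriptor block (hypothetical helper)."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Big-endian layout: opcode (1), flags (1), logical block address (4),
    # group number (1), transfer length in blocks (2), control (1).
    _opcode, flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {
        "lba": lba,                  # first block to read
        "blocks": blocks,            # number of blocks requested
        "fua": bool(flags & 0x08),   # Force Unit Access bit
        "dpo": bool(flags & 0x10),   # Disable Page Out bit
    }


if __name__ == "__main__":
    # Example: read 8 blocks starting at block 4096.
    example = bytes([0x28, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x08, 0x00])
    print(parse_read10(example))
```

Nothing deep, but knowing that a log line refers to a plain read of a few blocks that failed or timed out already tells you which layer to look at next.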
So the colorful dynamic easy-to-use management interfaces protected us from the ugly technical abyss in the lower layers for the longest time. But now that there is a problem, we only get some strange sense data and don't know who could help us further. It's the same with managing changes in the infrastructure. A lot of the problems opened with SAN support are in fact misconfigurations, user mistakes or unrealistic expectations born out of conceptual misunderstandings. "We need this 300km synchronous mirror connection to run with 3ms latency max. We bought your enterprise SAN gear. Why is it not fast enough?" The same goes for slow drain devices. If a SAN admin (who also wears the server admin's and the storage admin's hat) has no idea about the traffic flow in a SAN and buffer-to-buffer credits, how could he understand the impact of a slow drain device in his environment?
That's why clouds and Storage aaS, IaaS or even SaaS are so important today. Not because of the elastic and dynamic deployment or the transparency of the costs. But because there are fewer and fewer people with deep technical background knowledge about storage and SANs available in the companies. They seemed to be superfluous as long as everything was running fine and an unskilled person was enough to make the few clicks in the GUI. So the only escape and the logical next step is to move to the cloud concept.
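To see why the 300km request cannot work, a back-of-the-envelope calculation is enough. The sketch below assumes roughly 5 microseconds of one-way delay per km of fibre and ignores switches, arrays and protocol overhead entirely. It also assumes two round trips per synchronous write, which is typical for a plain SCSI write over Fibre Channel but depends on the protocol and on any write acceleration in use. The credit estimate assumes full-size frames of 2148 bytes and a nominal throughput given in MB/s. All function names are made up for this illustration.

```python
import math

US_PER_KM = 5.0  # assumption: one-way propagation delay in fibre, in microseconds per km


def round_trip_ms(distance_km):
    """Pure propagation delay for one round trip, in milliseconds."""
    return 2 * distance_km * US_PER_KM / 1000.0


def sync_write_ms(distance_km, round_trips=2):
    """Lower bound for the latency a synchronous write gains from distance alone."""
    return round_trips * round_trip_ms(distance_km)


def bb_credits_needed(distance_km, mb_per_s, frame_bytes=2148):
    """Rough number of buffer-to-buffer credits needed to keep a long link busy.

    A credit is only returned after the frame has reached the far end and the
    acknowledgement has come back, so the sender needs enough credits to cover
    one round trip worth of frames.
    """
    frame_time_us = frame_bytes / mb_per_s          # bytes / (MB/s) gives microseconds
    round_trip_us = 2 * distance_km * US_PER_KM
    return math.ceil(round_trip_us / frame_time_us)


if __name__ == "__main__":
    print("one round trip over 300 km:", round_trip_ms(300), "ms")
    print("synchronous write over 300 km:", sync_write_ms(300), "ms")
    print("credits for 300 km at 800 MB/s:", bb_credits_needed(300, 800))
```

Under these assumptions a single round trip over 300 km already uses up the whole 3 ms budget, and a write that needs two round trips cannot finish in less than 6 ms, no matter how expensive the SAN gear is. The same link at a nominal 800 MB/s would need more than a thousand credits to stay filled with full-size frames, which is exactly the kind of detail a GUI will not explain.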
Am I a cloud fanboy?
I wouldn't call myself a fanboy. I'm a support guy and I like to troubleshoot as effectively as possible to solve a problem as quickly as possible. To be able to do this, I need a skilled local counterpart who can collect the data and execute the action plans, who can also direct problems to the proper support provider and who proactively monitors the environment. So if there is a classical data center with a team of skilled administrators, I'm quite happy. But if not, this "vacuum" should be filled to minimize the risk of major outages. The provider of a public cloud would have such a team.
And in private clouds?
In a well-defined and highly automated private cloud, the remaining (most probably much smaller) team of skilled admins doesn't have to take care of provisioning LUNs and other standard tasks anymore. They would have more time for digging deeper into the stuff. You might argue now that this just repeats the story of the easy management above. Right! But once you have entered this path, and as long as the external constraints don't change, this is the only way to go. And for some of the companies out there a private cloud might just not be the best choice, and other options like outsourcing would come into play.
The most important thing is to face the truth and to make an honest review of the skills available. Your data is your most precious asset and availability is crucial. If that path leads to the cloud, there is no reason to stop now. Don't wait for the next outage!