People Always Ask Me...
trossman
People always ask about failures. It's great that the cloud software can survive failures, but what about my user workloads? Of course, the simplest yet all too unsatisfying answer is that your application should be designed to tolerate failures, and since the cloud is resilient, you can always get more cloud resources. Unfortunately, most people aren't satisfied with this answer. Many enterprise IT folks are used to running expensive servers with very expensive Fibre Channel-attached SAN storage. But what happens with commodity storage exposed over commodity networks and servers?
SCP 1.2 has three kinds of storage: 1) gold master images, 2) block storage (volumes), and 3) ephemeral storage. Master images are replicated across a cluster of Linux servers. When an instance is created from a master image, the guest OS sees a single disk; however, all writes go to ephemeral storage, which is attached to the hypervisor. Although some people do recover the ephemeral storage after failures, it is designed to be discarded whenever instances are terminated, intentionally or otherwise. The master images are replicated for resiliency and scale-out performance. For resiliency, we generally establish two redundant iSCSI sessions to two separate storage nodes. This setup can survive network, disk, and storage node failures without affecting the guest workload.
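To make the read and write paths above a little more concrete, here is a toy Python model. It is only a sketch of the idea, not SCP code: the class names `StorageNode` and `InstanceDisk`, the block-numbered dictionaries, and the `alive` flag are all hypothetical, and a real deployment would rely on iSCSI multipathing in the hypervisor rather than anything like this.

```python
class StorageNode:
    """Hypothetical stand-in for a storage node holding a master image replica."""

    def __init__(self, name, blocks):
        self.name = name
        self.blocks = dict(blocks)  # block number -> bytes
        self.alive = True           # flip to False to simulate a node failure

    def read(self, block_no):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.blocks[block_no]


class InstanceDisk:
    """The single disk the guest sees: replicated master image plus ephemeral writes."""

    def __init__(self, sessions):
        self.sessions = list(sessions)  # redundant paths to separate storage nodes
        self.ephemeral = {}             # all guest writes land here

    def write(self, block_no, data):
        # Writes never touch the master image.
        self.ephemeral[block_no] = data

    def read(self, block_no):
        # Blocks the guest has written are served from ephemeral storage.
        if block_no in self.ephemeral:
            return self.ephemeral[block_no]
        # Otherwise try each session in turn until one path answers.
        for node in self.sessions:
            try:
                return node.read(block_no)
            except ConnectionError:
                continue
        raise IOError("all paths to the master image failed")

    def terminate(self):
        # Ephemeral storage is discarded when the instance goes away.
        self.ephemeral.clear()
```

In this model, marking one `StorageNode` as not alive leaves reads working through the second session, and calling `terminate()` throws away every write, which mirrors the behavior described above.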
Block storage, on the other hand, is a bit trickier. We purposely chose not to force redundancy, which turned out to be the cause of
While this is an entirely workable solution that is both scalable and low cost, it is still not enough for some use cases. In particular, this solution will not work for "persistent instances". Of course, you should avoid persistent instances, but sometimes it's just a heck of a lot easier: you don't have to be smart about configuring your Windows or Linux guest OS. For this scenario, we do have some customers combining SCP 1.2 with GPFS, an extremely powerful cluster file system that has been used in some of the world's largest supercomputer HPC clusters. With GPFS as the backing store for the SCP storage nodes, it is quite simple to automatically fail over volumes onto another storage node. In fact, IBM Research has internal prototypes that go even further, avoiding any downtime whatsoever as a result of a failed storage node. But I can't tell you about that ;-).
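As a rough illustration of why a shared cluster file system makes volume failover simple, consider the sketch below. It assumes that every storage node can see the same volume data through the shared backing store, so failing over is just a matter of choosing a new node to export each affected volume. The function `fail_over`, the volume-to-node mapping, and the least-loaded placement rule are all hypothetical and are not how SCP or GPFS actually decide placement.

```python
def fail_over(assignments, failed_node, healthy_nodes):
    """Return a new volume -> node mapping with the failed node's volumes moved.

    assignments:   dict mapping volume name to the node currently exporting it
    failed_node:   name of the node that went down
    healthy_nodes: list of nodes that can still reach the shared backing store
    """
    if not healthy_nodes:
        raise RuntimeError("no healthy storage node left to take over")

    # Count how many volumes each healthy node already exports.
    load = {node: 0 for node in healthy_nodes}
    for node in assignments.values():
        if node in load:
            load[node] += 1

    result = dict(assignments)
    for volume in sorted(assignments):
        if assignments[volume] == failed_node:
            # No data is copied: the target already sees the volume
            # through the shared file system, so we only re-point the export.
            target = min(healthy_nodes, key=lambda n: load[n])
            result[volume] = target
            load[target] += 1
    return result
```

Without a shared backing store, the same step would require the volume's data to exist somewhere other than the failed node, which is exactly what is missing when redundancy is not forced.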
I hope you've found this helpful. I hope you'll agree that there are some pretty good solutions available even if we cannot offer perfection, yet ;-)