People always ask about failures. It's great that the cloud software can survive failures, but what about user workloads? Of course, the simplest, yet all too unsatisfying, answer is that your application should be designed to tolerate failures, and since the cloud is resilient you can always get more cloud resources. Unfortunately, most people aren't satisfied with this answer. Many enterprise IT folks are used to running expensive servers with very expensive Fibre Channel-attached SAN storage. But what happens with commodity storage exposed over commodity networks and servers?
SCP 1.2 has three kinds of storage: 1) gold master images, 2) block storage (volumes), and 3) ephemeral storage. Master images are replicated across a cluster of Linux servers. When an instance is created from a master image, the guest OS sees a single disk; however, all writes go to ephemeral storage attached to the hypervisor. Although some people do recover ephemeral storage after a failure, it is designed to be discarded whenever an instance is terminated, intentionally or otherwise. The master images are replicated for resiliency and scale-out performance. For resiliency, we generally establish two redundant iSCSI sessions to two separate storage nodes. This lets us survive network, disk, and storage node failures without affecting the guest workload.
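For the curious, here is roughly what those redundant sessions look like from the hypervisor's side. This is only a minimal sketch using the standard open-iscsi command line tools; the portal addresses and target name are made-up examples for illustration, not the actual SCP internals.

```python
# Minimal sketch: open sessions to the same replicated image through two
# separate storage nodes, so either path can fail without losing access.
# The target IQN and portal addresses below are hypothetical examples.
import subprocess

TARGET = "iqn.2012-01.com.example:gold-master"    # hypothetical target name
PORTALS = ["192.0.2.10:3260", "192.0.2.11:3260"]  # two separate storage nodes

def run(*args):
    """Run a command and raise if it exits with an error."""
    subprocess.run(args, check=True)

for portal in PORTALS:
    # Discover the targets this storage node exposes ...
    run("iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal)
    # ... then open a session to it. With both sessions up, a multipath
    # device on top presents a single disk that survives the loss of
    # either storage node or network path.
    run("iscsiadm", "-m", "node", "-T", TARGET, "-p", portal, "--login")
```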
Block storage, on the other hand, is a bit trickier. We purposely chose not to force redundancy; forced redundancy is what turned out to be the cause of the amazonocolypse last spring. We had some early customers tell us they were happy enough to use RAID storage on their storage nodes so they could recover from a failure, even though there would be some downtime. Of course, other users want their storage to be always available. For those users, we've always recommended allocating multiple volumes for each instance. If you create multiple volumes in one call, the cloud will attempt to place each volume on a separate physical storage node. Then, using guest-level software RAID such as mdadm on Linux or the Windows Disk Management tool, you can set up disk mirroring to tolerate the failure of one of those nodes. Of course, you'll need to monitor for failures so you can re-establish redundancy. You can use Smart Cloud Monitoring to detect faults and even trigger automated recovery scripts.
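To make that a little more concrete, here is a rough sketch of what such a recovery script might look like on a Linux guest with two cloud volumes mirrored by mdadm. The device names are hypothetical, and the step that allocates a replacement volume is left out because it depends on whichever provisioning API client you use.

```python
# Rough sketch of a guest-side recovery check that a monitoring alert could
# trigger: if the mdadm mirror is degraded, add a replacement device so the
# mirror resyncs. The mirror would have been created once with something like:
#   mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/vdb /dev/vdc
# All device names here are hypothetical examples.
import subprocess

MIRROR = "/dev/md0"

def mirror_is_degraded():
    """Crude check: a healthy two-disk mirror shows [UU] in /proc/mdstat;
    a missing member shows up as an underscore, e.g. [U_]."""
    with open("/proc/mdstat") as f:
        status = f.read()
    return "[U_]" in status or "[_U]" in status

def add_replacement(device):
    """Add a replacement device to the mirror; mdadm rebuilds it automatically."""
    subprocess.run(["mdadm", "--manage", MIRROR, "--add", device], check=True)

if __name__ == "__main__":
    if mirror_is_degraded():
        # In a real script you would first request a new volume on a different
        # storage node and wait for it to show up inside the guest.
        add_replacement("/dev/vdd")  # hypothetical replacement device
```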
While this is an entirely workable solution that is both scalable and low cost, it is still not enough for some use cases. In particular, it will not work for "persistent instances". Of course, you should avoid persistent instances, but sometimes it's just a heck of a lot easier: you don't have to be smart about configuring your Windows or Linux guest OS. For this scenario we do have some customers combining SCP 1.2 with GPFS, an extremely powerful cluster file system that has been used in some of the world's largest supercomputing HPC clusters. Using GPFS as the backing store for the SCP storage nodes, it is quite simple to automatically fail volumes over onto another storage node. In fact, IBM Research has internal prototypes that go even further, avoiding any downtime whatsoever when a storage node fails. But I can't tell you about that ;-).
I hope you've found this helpful, and that you'll agree there are some pretty good solutions available even if we can't offer perfection yet ;-)