Special clouds for special needs: High Performance Computing Clouds


Today, we are used to speaking about desktop or test clouds, or about moving our email or CRM to cloud-based solutions. What do all these workloads have in common? They share the main characteristics of good candidates for cloud computing, as pointed out by Luis Aguilar in his post Migration to cloud: It is all about workloads. Beyond that, they are all enterprise workloads with low or unpredictable demand for computing resources, they do not generate a significant amount of data while running, and they are loosely coupled to the infrastructure.

But what if we need to deal with a workload that looks like exactly the opposite?

A workload that is more predictable, demands a large pool of resources, runs at an average utilization rate of 80 to 90 percent, and uses or creates huge amounts of data while running?

Well, at first sight it might seem that we are facing a workload unsuitable for cloud computing. Yet we can find all of those characteristics in most high performance computing (HPC) workloads.

Does this mean that HPC cannot be approached using a cloud computing model?

No. Rather, it means HPC should be approached differently, with some specific considerations.

If we look at the evolution of HPC delivery, putting it very simply, it went from a single system to an HPC cluster, and then to an HPC grid. During the first conversations about cloud computing, there were many comparisons between cloud and the concept of “grid computing,” largely because of the cloud’s characteristic of offering a large, heterogeneous pool of automated resources that provides huge scalability. Now we know that cloud computing means much more: on-demand self-service, standardization, and usage-based chargeback and billing. It seems cloud computing represents a natural evolution of grid computing for HPC, and it could help address the ever-increasing flexibility and capacity demands of deep scientific, technical and analytical tasks.

So, what would be the differences from a “general-purpose cloud?”

First of all, you might be thinking “what do you mean by a ‘general-purpose cloud’?”

A cloud built from general-purpose servers, using virtualization and general-purpose cloud management software to achieve better flexibility and higher utilization, aiming to hit the price/performance sweet spot.

Although one of these clouds could accommodate some of the so-called embarrassingly parallel workloads, in many cases it won’t meet the special needs of HPC workloads.
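To make the term concrete: an embarrassingly parallel workload is one whose tasks are fully independent, so they can be spread across cores or cloud nodes with no inter-task communication. Below is a minimal, hypothetical Python sketch using Monte Carlo pi estimation as a stand-in computation; it is not tied to any particular cloud product, and the batch sizes and process pool are illustrative assumptions only.

```python
# Sketch of an "embarrassingly parallel" workload: each task is independent,
# so the batches could just as well be dispatched to separate cloud nodes.
import random
from multiprocessing import Pool


def estimate_pi(samples: int) -> float:
    """Estimate pi from one batch of random samples; no shared state needed."""
    inside = sum(
        1
        for _ in range(samples)
        if random.random() ** 2 + random.random() ** 2 <= 1.0
    )
    return 4.0 * inside / samples


if __name__ == "__main__":
    batches = [100_000] * 8  # eight independent tasks (illustrative sizes)
    with Pool() as pool:
        # Each batch runs in its own process; no coordination is required
        # until the final averaging step.
        estimates = pool.map(estimate_pi, batches)
    print(sum(estimates) / len(estimates))
```

Because the tasks never talk to each other, this style of job tolerates virtualization overhead and commodity networks well, which is why it is the easiest HPC pattern to move onto a general-purpose cloud.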

Typically, HPC clusters are designed to use the full compute potential of the installed hardware, without the overhead introduced by a hypervisor, and most HPC applications require network-accessible storage. In other cases, additional or different elements, such as low-latency networks, Cell processors, GPUs, or even FPGAs, are introduced to boost performance.

With that in mind, among the specific characteristics of HPC clouds, I would consider:

  • Bare metal provisioning, which is the ability to provision physical machines (not only virtual machines, VMs)
  • Higher vertical scalability
  • High-performance software stacks, ready to use
  • Network/clustered file systems support
  • Accelerated clusters (Cell processors, GPUs, FPGAs, and others)
  • Scale out to public clouds for certain workloads (embarrassingly parallel)

And what about the benefits?

The benefits of an HPC cloud compared to traditional HPC deployments would be those of cloud computing compared to traditional IT:

  • Shared pool of resources: Ability to easily repurpose all in-house nodes
  • Ease of manageability and access to HPC infrastructure through a self-service web portal
  • Centralized user management, usage metering and accounting
  • Single-point submission and monitoring for multiple job queues
  • Greater user satisfaction: Reduced provisioning times
  • Better accommodation of load peaks by temporarily moving some workloads to the public cloud
  • Potential ability to become an HPC cloud service provider and sell unused compute power during load valleys

And what do you think? What other differences do you see in HPC clouds?
