How Multi-NIC CNI enables you to provide automated and scaled network parallelism to your HPC workloads in Red Hat OpenShift on IBM Cloud.
Red Hat OpenShift on IBM Cloud has offered a fast and secure way to automate the deployment and scaling of containerized enterprise workloads since 2019. After continuous and rapid development, we announced this year that it will also serve as an HPC cluster for resource-intensive workloads. Typical examples are serverless computing, analytics, big data (Hadoop and Spark) and machine learning.
The key benefits of HPC infrastructure are high throughput with automated, scaled parallelism of compute, storage and networking building blocks. However, there are still some challenges, and the most critical one is related to network performance. In this blog post, we will go into more detail about the network bottleneck and how to mitigate it.
HPC convergence on the cloud container platform and its network bottleneck barrier
The convergence of HPC workloads on cloud container platforms has attracted much attention because it promises flexibility, scalability and cost-effectiveness. At the same time, the bottleneck introduced by the overlay network is a major barrier. Although giving containers access to multiple network devices (Multi-NIC) is a direct way to increase network bandwidth, routing container IP packets across multiple network interfaces is complicated. It requires many manual steps and considerable expertise, especially when we want to remove the overlay stack of the container platform and use the full bandwidth.
Furthermore, unlike dedicated static systems, cloud infrastructure changes dynamically while it is operating. For example, the cluster can be scaled automatically on demand. Worker nodes may join or leave the cluster, or become unavailable unexpectedly. Additional network devices or newly introduced network technology can be added to a node at any point in time. Even existing network devices may be disconnected by failures from time to time.
What is Multi-NIC CNI?
Multi-NIC CNI is a project that implements the Container Network Interface (CNI), an incubating project of the Cloud Native Computing Foundation (CNCF), to deliver a simple, automated and scalable Multi-NIC network solution for container platforms on cloud infrastructure, starting with Red Hat OpenShift on IBM Cloud. Multi-NIC CNI helps you deal with multi-network complexity and dynamic changes, and makes your container networks in the cloud ready for HPC and AI workloads, as on a native system. Currently, Multi-NIC CNI is available on OperatorHub.io for the general Kubernetes community and is embedded in the OperatorHub in OpenShift and OKD.
Get started with Multi-NIC CNI
To get started with Multi-NIC CNI for HPC and AI on Red Hat OpenShift on IBM Cloud, complete the following steps.
Step 1: Get the multi-network cluster ready
The first step is to build an OpenShift Cluster on IBM Cloud infrastructure with the openshift-installer. Check out this article for more information.
After the cluster is ready, you can simply create and attach the secondary networks with the provided Terraform script here.
Step 2: Install the operator from the OperatorHub
Step 3: Deploy MultiNicNetwork CR
Create MultiNicNetwork CR:
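As a rough sketch, such a manifest could be written to a local file with a shell heredoc as follows. The file name `multinicnetwork.yaml`, the resource name `multi-nic-sample`, the subnet and the block sizes are illustrative assumptions. The API group and field names are recalled from the project's public README rather than taken from this post, so verify them against the CRD installed by your operator version before applying.

```shell
# Write an example MultiNicNetwork manifest to a local file.
# ASSUMPTIONS: the file name, resource name, subnet and block sizes are
# placeholders; the apiVersion and spec fields must be checked against the
# CRD shipped with the operator version installed in your cluster.
cat > multinicnetwork.yaml <<'EOF'
apiVersion: multinic.fms.io/v1
kind: MultiNicNetwork
metadata:
  name: multi-nic-sample
spec:
  subnet: "192.168.0.0/16"
  ipam: |
    {
      "type": "multi-nic-ipam",
      "hostBlock": 6,
      "interfaceBlock": 2,
      "vlanMode": "l3"
    }
  multiNICIPAM: true
  plugin:
    cniVersion: "0.3.0"
    type: ipvlan
    args:
      mode: l3
EOF
echo "wrote multinicnetwork.yaml"
```

Pick a subnet that does not overlap with the pod, service or host networks already in use by the cluster.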
Apply it to your cluster:
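A minimal sketch of this step, assuming the CR was saved to a file named `multinicnetwork.yaml` (a hypothetical name) and that `kubectl` is already logged in to the target cluster:

```shell
# Hypothetical file name for the MultiNicNetwork manifest created above.
MANIFEST="multinicnetwork.yaml"

# kubectl must already be authenticated against the target cluster;
# on OpenShift, "oc apply -f" behaves the same way.
if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f "$MANIFEST"
else
  echo "kubectl not found; install it or use oc instead"
fi
```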
Wait until it gets ready:
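One way to inspect progress, sketched under the same assumption about the manifest file name, is to read the object back from the cluster and look at its status section. Addressing the object through the file avoids guessing the exact resource name registered by the CRD.

```shell
# Hypothetical file name for the MultiNicNetwork manifest applied above.
MANIFEST="multinicnetwork.yaml"

if command -v kubectl >/dev/null 2>&1; then
  # Print the live object, including whatever status the operator reports.
  # Re-run this until the status indicates that the network is ready.
  kubectl get -f "$MANIFEST" -o yaml
else
  echo "kubectl not found; install it or use oc instead"
fi
```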
If it completes successfully, you should see something like this:
Step 4: Test the connection
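As a hedged sketch of a basic check, the following starts a throwaway pod that requests the secondary networks and lists the interfaces inside it. It assumes that the network is consumed through the Multus-style `k8s.v1.cni.cncf.io/networks` annotation, that the MultiNicNetwork is named `multi-nic-sample`, and that the `busybox` image can be pulled in your cluster; the pod and file names are hypothetical.

```shell
# Write a test pod that requests the secondary networks.
# ASSUMPTIONS: the network name "multi-nic-sample", the pod name and the
# busybox image are placeholders chosen for illustration.
cat > multi-nic-test-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: multi-nic-test
  annotations:
    k8s.v1.cni.cncf.io/networks: multi-nic-sample
spec:
  containers:
  - name: shell
    image: busybox
    command: ["sleep", "3600"]
EOF

if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f multi-nic-test-pod.yaml
  # Block until the pod is running, or give up after two minutes.
  kubectl wait --for=condition=Ready pod/multi-nic-test --timeout=120s
  # List the interfaces inside the pod; if the secondary networks were
  # attached, interfaces in addition to the default one should appear.
  kubectl exec multi-nic-test -- ip addr
else
  echo "kubectl not found; install it or use oc instead"
fi
```

Running a second pod on a different worker node and pinging its secondary addresses from the first pod extends this into an actual connectivity check.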
Then, you should see something like this on your screen:
This blog post highlighted the network bottleneck barrier of HPC convergence on Kubernetes-based cloud container platforms and introduced Multi-NIC CNI, a container network interface plugin that can be used to deliver network parallelism and satisfy the collective needs of HPC workloads.
Red Hat OpenShift on IBM Cloud pioneered the creation of HPC cloud solutions and can be further enhanced with Multi-NIC CNI, without requiring any modification to the routing tables or IP management of the underlay infrastructure. We look forward to expanding this solution to other platforms and seeing it widely adopted. The Multi-NIC CNI operator is now an open-source project and is integrated into the OperatorHub community. Collaborations and contributions are more than welcome as we build this solution for the hybrid cloud era.
Try out the solution on an IBM Cloud OpenShift HPC cluster today.