August 9, 2022 By Gabor Samu 3 min read

Across many industries, organizations rely on high performance computing (HPC) to drive their core mission, delivering higher-quality, innovative products to market faster.

As we forge ahead in the era of artificial intelligence (AI), organizations are looking to leverage AI technologies to accelerate time to market and business decisions. AI is often seen as something independent from traditional HPC workloads. However, the reality is that AI methods are increasingly being applied in HPC, with AI-infused and AI-guided HPC being two such techniques.

Joining forces: HPC and AI

Whether for AI-infused or AI-guided HPC, data is the common denominator in the race to deliver higher-quality products to market faster. Organizations today have the ability to capture vast volumes of data from a variety of sources, including sensor and IoT data. Furthermore, organizations often have a wealth of data that has been acquired over decades from running traditional HPC simulation and modeling workloads. Using these sources of data, HPC and AI can be applied to the same problems to deepen insights and innovation.

Let’s consider some examples where AI-infused and AI-guided HPC are being used to tackle problems faster and with greater accuracy than ever. AI-infused HPC involves applying AI methods to analyze the output of simulations. AI-guided HPC is the application of AI to reduce the problem space for HPC simulations. This is sometimes referred to as intelligent simulation, where Bayesian methods are applied.

AI-infused HPC in electronic design automation (EDA)

As part of modern semiconductor design, billions of verification tests must be run to validate chip designs. For example, the IBM POWER10 processor has 18 billion transistors. Typically, semiconductor-design companies must book time at a foundry for production far ahead of time. If an error is found during the validation phase, it’s not practical to re-run the entire set of billions of verification tests. AI-infused HPC methods can help identify the tests that need to be re-run, thereby saving a significant number of compute cycles and helping to keep manufacturing timelines on track.
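The idea of re-running only the tests likely to be affected can be sketched in a few lines of Python. This is a minimal illustration, not a description of any EDA vendor's method: a simple frequency count stands in for a trained model, and the names `failure_rates`, `select_tests`, the block and test names, and the 0.1 threshold are all hypothetical.

```python
from collections import defaultdict

def failure_rates(history):
    """Estimate, per (block, test) pair, how often the test failed after that block changed."""
    runs = defaultdict(int)
    fails = defaultdict(int)
    for changed_block, test, failed in history:
        runs[(changed_block, test)] += 1
        if failed:
            fails[(changed_block, test)] += 1
    return {key: fails[key] / runs[key] for key in runs}

def select_tests(history, changed_block, all_tests, threshold=0.1):
    """Return the tests worth re-running after a change to changed_block.

    A test is kept if its estimated failure rate meets the threshold, or if
    there is no history for it with this block (nothing is known, so play safe).
    """
    rates = failure_rates(history)
    selected = []
    for test in all_tests:
        rate = rates.get((changed_block, test))
        if rate is None or rate >= threshold:
            selected.append(test)
    return selected

# Hypothetical history records: (changed block, test name, did the test fail?)
history = [
    ("alu", "t_add", True),
    ("alu", "t_add", False),
    ("alu", "t_cache", False),
    ("alu", "t_cache", False),
    ("cache", "t_cache", True),
]
tests = ["t_add", "t_cache", "t_new"]
rerun = select_tests(history, "alu", tests)
```

Here a change to the hypothetical `alu` block skips `t_cache`, which has never failed after such a change, while keeping `t_add` and the unseen `t_new`. A production system would replace the frequency count with a model trained on far richer features.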

AI-guided HPC in automotive design

In the automotive industry, the design of vehicles and components often evolves from previous designs. During the design process of a new model, there are millions of potential changes and optimizations that can be considered to improve characteristics like aerodynamics, noise, vibration and harshness (NVH) and structural stiffness, just to name a few. However, assessing all these potential changes over different road conditions and parameters would significantly increase the cycle time between models. Automotive manufacturers have significant bodies of knowledge about existing designs, and they are exploring how to train AI models on this data to rapidly determine the best areas for vehicle optimization. This approach helps to significantly reduce the problem space and allows manufacturers to focus traditional HPC methods on more targeted areas of the design. Ultimately, the goal is to produce a better-quality product in a shorter amount of time.
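As a rough illustration of reducing the problem space, the sketch below uses results from past simulations to rank new design candidates, so that only the most promising ones go on to full simulation. It assumes a nearest-neighbour lookup as a stand-in for a trained AI model (it is not a Bayesian method), and the function names, design parameters and scores are hypothetical.

```python
import math

def predict(past_runs, candidate):
    """Predict a candidate's score as the score of the closest past design."""
    nearest = min(past_runs, key=lambda run: math.dist(run[0], candidate))
    return nearest[1]

def shortlist(past_runs, candidates, top_n=2):
    """Rank candidates by predicted score (lower is better) and keep the best top_n."""
    ranked = sorted(candidates, key=lambda c: predict(past_runs, c))
    return ranked[:top_n]

# Hypothetical past results: ((design parameters), simulated drag score)
past_runs = [
    ((0.0, 0.0), 0.40),
    ((1.0, 0.0), 0.30),
    ((0.0, 1.0), 0.35),
    ((1.0, 1.0), 0.28),
]
candidates = [(0.1, 0.1), (0.9, 0.9), (0.9, 0.1), (0.1, 0.9)]

# Only these candidates would be passed on to the expensive HPC simulation.
to_simulate = shortlist(past_runs, candidates)
```

With four candidates and `top_n=2`, half of the simulations are avoided. The same pattern scales to millions of candidates, where the saving in compute time is what makes the approach attractive.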

The challenge of scale

As organizations scale up AI environments in support of their HPC practice, those environments closely resemble those used for traditional HPC workloads. High-speed interconnects, accelerators and high-performance parallel filesystems are de rigueur for both AI and HPC workloads. Due to the coupled nature of AI-infused and AI-guided HPC workloads, organizations are looking at ways to run these workloads on a common infrastructure to capitalize on economies of scale.

Furthermore, new classes of users are now appearing alongside traditional HPC users, including data scientists, engineers and researchers. Often, these users of modern HPC environments are domain experts, not IT or HPC infrastructure experts. Therefore, organizations are looking for ways to make it easy for users to run their work and get results, while ensuring that compute resources, including GPUs, are well utilized.

Managing a converged infrastructure for HPC and AI

IBM Spectrum LSF Suite is a high-performance, highly scalable workload and resource management solution for demanding HPC environments. LSF Suite helps to simplify the user experience and improve utilization in HPC environments for traditional HPC simulation and modeling, virtual engineering, digital twins and AI-infused and AI-guided HPC. LSF Suite supports workloads on-premises, in the cloud and in hybrid cloud environments, including containerized, GPU, machine learning and deep learning workloads.

Advanced support for NVIDIA GPUs in LSF Suite means that organizations can drive utilization by simplifying administration with powerful features, including the following:

  • Automatic detection and configuration of GPUs
  • Automatic compute mode selection
  • Automated configuration of CUDA_VISIBLE_DEVICES
  • Full isolation and access control
  • GPU power management and auto-boost support
  • GPU fairshare and GPU preemptive scheduling
  • Integrated metric collection and reporting
  • Integrated NVIDIA DCGM, MPS support
  • Dynamic workload-driven NVIDIA Multi-Instance GPU support
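From a job's point of view, the automated CUDA_VISIBLE_DEVICES configuration listed above can be observed with a few lines of Python. This is a minimal sketch that only reads the standard CUDA environment variable; it does not call any LSF API, and the helper name `visible_gpus` is hypothetical.

```python
import os

def visible_gpus(environ=None):
    """Return the GPU identifiers listed in CUDA_VISIBLE_DEVICES.

    Returns None if the variable is unset, and an empty list if it is set
    but empty (which hides all GPUs from CUDA applications).
    """
    env = os.environ if environ is None else environ
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    # Identifiers are kept as strings because they may be indices or UUIDs.
    return [item.strip() for item in value.split(",") if item.strip()]

if __name__ == "__main__":
    print("GPUs visible to this job:", visible_gpus())
```

Because the scheduler sets the variable before the job starts, application code can simply use the devices it sees without knowing which physical GPUs on the host were assigned.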

IBM Spectrum LSF Suite is certified as part of the DGX-Ready Software program, ensuring a proven, validated solution for demanding commercial HPC environments running NVIDIA DGX systems. Learn more about IBM Spectrum LSF Suites here.
