When AI server infrastructure performance hits the wall


Gaining new insight to grow the business is a strong driver for adding artificial intelligence (AI) and deep learning to an organization’s IT capabilities. Organizations that don’t adopt these cognitive technologies to gain an advantage risk losing out to the competition. Many take the first step by experimenting with AI software on their existing infrastructure.

Nevertheless, at some point they are likely to “hit the wall”—that is, they run out of infrastructure performance in much the same way even a well-conditioned marathoner may run out of energy before reaching the finish line. According to a recent IDC survey, 77.1 percent of respondents say they ran into one or more limitations with their on-premises AI infrastructure (IDC White Paper, sponsored by IBM, “Hitting the wall with server infrastructure for artificial intelligence,” September 2017). And 90.3 percent of users running cognitive technology in the cloud ran into these same kinds of limitations.

AI performance challenges

AI and deep learning are extremely demanding on server infrastructure. They require powerful parallel processing, and we think investigating new solutions during the early experimental phase of AI development is critical for infrastructure teams. The same IDC survey shows that businesses take a variety of paths as they carry out experimentation. For example, some develop their solution in a virtual machine (VM) and then migrate to a dedicated server. Others start a proof of concept (PoC) on a partition of a scale-up system and then opt to move to a server cluster.

We believe that choosing the right server hardware plays a decisive role. According to the IDC white paper cited above, responses from businesses running AI applications indicate that a cluster of single- and dual-socket servers with high per-core performance and strong I/O capabilities, combined with accelerators such as GPUs, is well suited as an infrastructure configuration for cognitive applications. Scaling these accelerated compute nodes is not as straightforward as simply scaling CPUs. As a result, businesses need to look for a server vendor that is knowledgeable about scaling for AI applications.

The right path to development

In addition to the survey results cited previously, the IDC white paper also includes analyst recommendations for AI development approaches. For small to medium-sized AI initiatives, IDC recommends developing an in-house solution to enable the infrastructure team to acquire new skill sets. By contrast, the white paper goes on to report that, because of the complexity of the required development effort, more comprehensive AI initiatives can benefit from external support.

Here’s the upshot: if you’re developing AI capabilities or scaling existing ones, hitting an infrastructure performance wall is likely only a matter of time. In that case, hit it in a “tightly controlled” manner, as recommended by IDC analysts. And do so not only “knowingly and in full possession of the details,” as the white paper puts it, but also in close collaboration with a server vendor that can guide the business from early-stage experimentation through advanced production to full exploitation.

At IBM, we are well positioned to help businesses meet the performance demands of their cognitive initiatives. We offer a comprehensive AI hardware and software stack, from IBM Power Systems servers with NVIDIA GPUs to our PowerAI software framework, along with a wide range of support and consultation.

Comprehensive survey and recommendations

If you think your organization needs to make the move toward AI and deep learning, you can draw on the many well-defined AI use cases across industries. Download the comprehensive IDC White Paper, sponsored by IBM, which identifies more than a dozen possibilities and the paths to get there.

Director of Product Management, Deep Learning, IBM Cognitive Systems

