Jump-start your AI adoption with AutoML on IBM Power Systems

3 minute read | March 24, 2020

Are you eager to adopt AI for your business but bogged down by the complexity of machine learning algorithms, the plethora of software technologies and the dearth of personnel with specialized skills? If so, you’re not alone! Even seasoned technologists are overwhelmed by this vast and fragmented AI ecosystem.

Automatic machine learning (AutoML) refers to the methodology of automatically building machine learning pipelines with minimal engineering input, enabling non-specialists to build AI solutions from raw data. AutoML and the associated software offer a promising direction to reduce the complexity in AI development, at least for regular applications, and boost the adoption of AI in enterprises. Read on to learn more about AutoML and why IBM Power Systems is ideal for deploying AutoML frameworks.

The automatic machine learning paradigm

Most often, complete AI solutions are built by teams with complementary expertise in statistics, machine learning, programming, data engineering and the relevant vertical domain. The teams iterate through a complex process involving data preprocessing, feature engineering, machine learning algorithm selection, hyperparameter optimization and validation. This process is typically encapsulated in the form of an end-to-end machine learning pipeline, as shown in figure 1. AutoML aims to automate one or more of these pipeline components based upon the data characteristics and pre-built features (meta learning), and by using advanced machine learning methods such as Bayesian optimization, genetic programming and reinforcement learning (AI for AI).


Figure 1. Typical end-to-end machine learning pipeline built by AutoML
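The stages in figure 1 can be sketched in code. As a minimal illustration (assuming scikit-learn, which the article does not prescribe), each pipeline step maps to one box in the figure: preprocessing, feature engineering, algorithm choice and validation. An AutoML framework would search over these choices automatically rather than hard-coding them:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Each step mirrors a component of the end-to-end pipeline in figure 1.
pipe = Pipeline([
    ("preprocess", StandardScaler()),              # data preprocessing
    ("features", SelectKBest(k=10)),               # feature engineering
    ("model", LogisticRegression(max_iter=1000)),  # algorithm selection
])

# Validation closes the loop: 5-fold cross-validation scores the pipeline.
scores = cross_val_score(pipe, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.3f}")
```

In a hand-built workflow, engineers iterate on each of these steps; AutoML treats the whole assembly as a search space.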

There is growing interest in AutoML in industry and academia, as evidenced by the recent surge of commercial software (such as IBM AutoAI and H2O Driverless AI) and open source implementations (such as featuretools, auto-sklearn, auto-WEKA, TPOT, AutoKeras and auto-PyTorch). These frameworks differ in how much of the machine learning pipeline they automate and in the quantitative methods they employ for each component. They are capable of automatically building end-to-end machine learning pipelines fitting classical machine learning algorithms as well as deep neural networks (IBM Visual Insights, for example) for industry-scale supervised learning problems.

AutoML is compute-bound

Loosely speaking, AutoML solves one giant optimization problem, treating feature engineering, data processing and machine learning algorithm choices as additional dimensions of the hyperparameter space. Benchmark studies have not concluded that any one framework is superior to the others in all dimensions. Thus, the accuracies achievable with AutoML are sensitive to the compute time allocated to the search and to the capabilities of the underlying hardware.
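The idea of treating the algorithm choice itself as just another hyperparameter can be made concrete. In this sketch (again assuming scikit-learn as an illustrative stand-in for an AutoML search), the grid swaps whole estimators in and out of a pipeline alongside their own settings, so model selection and hyperparameter tuning collapse into a single search:

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),  # placeholder; the grid overrides it
])

# The estimator is itself a search dimension: each sub-grid pairs a
# candidate algorithm with its own hyperparameter ranges.
grid = [
    {"model": [LogisticRegression(max_iter=1000)],
     "model__C": [0.1, 1.0, 10.0]},
    {"model": [DecisionTreeClassifier(random_state=0)],
     "model__max_depth": [2, 4, 8]},
]

search = GridSearchCV(pipe, grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

Real AutoML frameworks replace this exhaustive grid with smarter strategies (Bayesian optimization, genetic programming), but the search space grows the same way, which is why more compute time and faster hardware translate directly into better pipelines.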

It is no exaggeration to say that IBM Power Systems AC922 servers are the most suitable hardware infrastructure for AutoML frameworks. After all, they’re the building blocks for the world’s most powerful supercomputers, the US Department of Energy’s Summit and Sierra systems. The major open source AutoML frameworks and H2O Driverless AI can leverage the exclusive compute capabilities of Power Systems, as depicted in figure 2. The Anaconda distribution makes the installation of open source tools a breeze on Power hardware.

Figure 2. The AutoML tools take advantage of Power hardware


Human in the loop

Though AutoML tools can engineer the machine learning pipelines for routine machine learning tasks, that alone is insufficient for building enterprise-grade AI systems. The data cleaning and preprocessing capabilities of the current AutoML tools are quite limited. The wealth of knowledge brought by domain experts from years of experience is unparalleled in data processing and feature engineering. In addition, AutoML tools are prone to overfitting, making engineering intervention necessary.

IBM Systems Lab Services can help jump-start AI adoption in your organization through rapid prototyping and a fail-fast methodology. Our experienced consultants help you choose the right machine learning tools, work with your domain experts on feature design and train your engineers to quickly become data scientists.

Contact Lab Services today.