Build Machine Learning Models with Minimal Effort with AutoML


Research and development of machine learning (ML) across domains such as computer vision, natural language processing, and speech transcription has been growing rapidly. ML is important in many enterprise applications across industries such as manufacturing, finance, healthcare, and marketing. With the growing adoption of ML, it is increasingly challenging to design or choose a model or neural network architecture for each ML task and to identify a good set of hyperparameters for training it. As a result, the process relies heavily on the advanced expertise of data scientists. These challenges have motivated growing interest in techniques that automatically discover tailored network architectures and training configurations without human intervention, referred to as AutoML and shown in Figure 1.


Fig. 1: AutoML pipeline [1]

The AutoML system includes a meta-learning component, which leverages historical information to search for models faster and more effectively. The ML framework is responsible for automatically choosing data-processing and feature-preprocessing algorithms and for selecting the models. In deep learning (DL), this model selection is typically replaced by Neural Architecture Search (NAS). The ensemble module does additional post-processing, such as combining multiple models together.
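The core search loop sketched in Figure 1 can be illustrated in a few lines: jointly sample a feature preprocessor and a model configuration, score each candidate pipeline on held-out data, and keep the best. Everything below — the candidate names and the scoring function — is a hypothetical toy stand-in for illustration, not WML-A or auto-sklearn code.

```python
import random

def score(preprocessor, model_cfg):
    # Stand-in for "train and validate this pipeline"; in this toy setup the
    # score peaks when the learning rate is near 0.1 and scaling is used.
    bonus = 0.5 if preprocessor == "scale" else 0.0
    return bonus - abs(model_cfg["lr"] - 0.1)

def automl_search(n_trials=200, seed=0):
    # Random search over the joint (preprocessor, hyperparameter) space.
    rng = random.Random(seed)
    best_score, best_pipeline = float("-inf"), None
    for _ in range(n_trials):
        pipeline = (rng.choice(["none", "scale"]),
                    {"lr": rng.uniform(0.001, 1.0)})
        s = score(*pipeline)
        if s > best_score:
            best_score, best_pipeline = s, pipeline
    return best_score, best_pipeline

best_score, (best_prep, best_cfg) = automl_search()
```

A real system replaces `score` with actual training plus validation and the candidate lists with library components; the structure of the loop stays the same.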

As summarized in Figure 2, there has been considerable recent research on AutoML using methods based on genetic algorithms, random search, Bayesian optimization, reinforcement learning, and continuous differentiable methods. The research blog [2] provides more details on Neural Architecture Search.
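As a taste of the continuous-differentiable family, the discrete choice among candidate operations can be relaxed into a softmax-weighted mixture, the mixture weights optimized by gradient descent, and the argmax operation kept at the end. The operations, loss, and finite-difference gradient below are synthetic placeholders — a minimal sketch of the idea, not DARTS itself.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Three candidate "operations"; index 1 matches the target behavior.
ops = [lambda x: 0.0 * x, lambda x: 1.0 * x, lambda x: -1.0 * x]
target = lambda x: 1.0 * x

def loss(alphas, x=1.0):
    # Architecture parameters alphas define a soft mixture over operations.
    w = softmax(alphas)
    mixed = sum(wi * op(x) for wi, op in zip(w, ops))
    return (mixed - target(x)) ** 2

def grad(alphas, eps=1e-5):
    # Finite-difference gradient; enough for a toy example.
    g = []
    for i in range(len(alphas)):
        bumped = list(alphas)
        bumped[i] += eps
        g.append((loss(bumped) - loss(alphas)) / eps)
    return g

alphas = [0.0, 0.0, 0.0]
for _ in range(200):
    alphas = [a - 1.0 * gi for a, gi in zip(alphas, grad(alphas))]

# Discretize: keep the operation with the largest architecture weight.
chosen = max(range(len(alphas)), key=lambda i: alphas[i])
```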


Fig. 2: AutoML methods summary

Our initial efforts on AutoML have focused on using Hyperband and Bayesian optimization for hyperparameter search, and Hyperband, ENAS, and DARTS for Neural Architecture Search. This research has been done in the context of the IBM product Watson Machine Learning Accelerator (WML-A) and has been integrated with the IBM Research services NeuNetS and AutoAI.
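Hyperband is built around successive halving: give many configurations a small training budget, keep the best fraction, and repeat with a larger budget. A minimal sketch of that core routine, with a synthetic evaluation function standing in for actual training:

```python
import random

def evaluate(cfg, budget, rng):
    # Stand-in for "train cfg for `budget` epochs, return validation loss".
    # Loss shrinks as budget grows and is lowest for cfgs near the optimum 0.3.
    noise = rng.gauss(0, 0.05 / budget)
    return abs(cfg - 0.3) + 1.0 / budget + noise

def successive_halving(configs, min_budget=1, rounds=3, seed=0):
    rng = random.Random(seed)
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        # Rank by (noisy) validation loss, keep the best half, double budget.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget, rng))
        survivors = scored[: max(1, len(scored) // 2)]
        budget *= 2
    return survivors[0]

best = successive_halving([0.05, 0.2, 0.3, 0.5, 0.7, 0.9, 0.31, 0.6])
```

Full Hyperband additionally sweeps the trade-off between the number of starting configurations and the initial budget, running several such brackets.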

It is well known that most existing AutoML approaches require considerable overhead for model searching. To improve efficiency, we present a transferable AutoML method that leverages previously trained models to speed up the search process for new tasks and datasets. Our approach involves a novel meta-feature extraction technique based on the performance of benchmark models, and a dynamic dataset clustering algorithm based on a Markov process and statistical hypothesis tests. We recently published some of this work at CVPR 2019 [3]. The workflow is shown in Fig. 3. In this approach, multiple models can share a common structure with different learned parameters.
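The grouping idea can be sketched as follows: represent each dataset by a meta-feature vector of benchmark-model scores, then assign a new dataset to the nearest existing group when it is close enough, so the group's previous search result can be reused as a starting point. This is a hedged illustration of the concept only — the distance threshold, the scores, and the simple nearest-group rule below are made up, and the paper's actual clustering uses a Markov process with statistical hypothesis tests.

```python
import math

def meta_feature(benchmark_scores):
    # benchmark_scores: validation accuracies of a fixed set of cheap
    # benchmark models on this dataset; the vector is the meta-feature.
    return tuple(benchmark_scores)

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_group(new_vec, groups, threshold=0.1):
    # Reuse the closest existing group if near enough, else start a new one.
    best = min(groups, key=lambda g: distance(new_vec, groups[g]), default=None)
    if best is not None and distance(new_vec, groups[best]) < threshold:
        return best, False
    name = f"group{len(groups)}"
    groups[name] = new_vec
    return name, True

groups = {}
g1, new1 = assign_group(meta_feature([0.90, 0.72, 0.65]), groups)
g2, new2 = assign_group(meta_feature([0.91, 0.71, 0.66]), groups)  # close
g3, new3 = assign_group(meta_feature([0.55, 0.40, 0.30]), groups)  # far
```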


Fig. 3: Pipeline of transferable AutoML

Our method offers flexibility over existing methods in three dimensions:

  • Search algorithms. Unlike multi-task solutions designed for Bayesian optimization, or transfer learning with AutoML based on reinforcement learning, our approach can easily be combined with most existing AutoML techniques in an out-of-the-box fashion.
  • Search mechanisms. Our transferable AutoML can be applied to different search schemes: search from scratch; search from predefined models (e.g., reuse the GoogLeNet architecture and the weights of its bottom layers to search an architecture for the higher layers); and transfer from basic cells (transfer the searched normal/reduction cells of source datasets to target datasets). This feature makes it more flexible for handling datasets under a limited time budget.
  • Online setting. Our method can be applied in the online setting, where datasets arrive sequentially and one needs to search for a model for each newly arriving dataset efficiently.
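The benefit of transfer can be illustrated with a toy mutation-based search: warm-starting from a cell found on a source dataset reaches a good architecture with a far smaller step budget than starting from scratch. The operation names, fitness function, and hill-climbing loop below are hypothetical placeholders, not the paper's search algorithm.

```python
import random

OPS = ["conv3x3", "conv5x5", "maxpool", "skip"]

def fitness(cell):
    # Stand-in for "train this cell on the target dataset and measure
    # accuracy"; here the target simply rewards conv3x3 operations.
    return sum(op == "conv3x3" for op in cell)

def mutate(cell, rng):
    cell = list(cell)
    cell[rng.randrange(len(cell))] = rng.choice(OPS)
    return cell

def search(start_cell, steps, seed=0):
    # Simple hill climbing: accept a mutation if fitness does not decrease.
    rng = random.Random(seed)
    best, best_f = list(start_cell), fitness(start_cell)
    for _ in range(steps):
        cand = mutate(best, rng)
        f = fitness(cand)
        if f >= best_f:
            best, best_f = cand, f
    return best_f

# Cold start vs. warm start from a source-dataset cell with a smaller budget.
cold = search(["skip"] * 4, steps=50)
warm = search(["conv3x3", "conv3x3", "skip", "skip"], steps=10)
```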

We evaluated our technology in three search mechanisms, chosen according to the difficulty of the given datasets: search from scratch (shown in Table 1), search from predefined models (shown in Table 2), and transfer from basic cells (shown in Table 3). The experimental results on image classification show notable speedup (3x–10x on average in overall search time across multiple datasets) with negligible loss in accuracy.

Table 1: Total search time (in days, including the overhead of running benchmark models), total classification relative errors (TRE), and benchmark overhead on 7 datasets, in the search-from-scratch setting.

Table 2: Total search time (in days, including the overhead of running benchmark models), total classification relative errors (TRE), and benchmark overhead on 7 datasets, in the search-from-predefined-models setting.

Table 3: Search time (in days, including overhead) and test accuracy, in the transfer-from-basic-cells setting.

For the next step, we will explore end-to-end neural architecture search for object detection and 3D object detection; neural AutoML for tabular data, including vanilla tabular data and spatial-temporal tabular data; and new aspects of differentiable-method-based neural architecture search.
Some of these capabilities are available in WML-A, which can be accessed through the free trial link:


[1] M. Feurer, A. Klein, K. Eggensperger, et al. Efficient and robust automated machine learning. In NIPS, 2015.
[3] Chao Xue, Junchi Yan, Rong Yan, Yonggang Hu, et al. Transferable AutoML by Model Sharing Over Grouped Datasets. In CVPR, 2019.

Xi Xia, Research Staff Member, IBM Research – China

Zhihu Wang, Research Staff Member, IBM Research – China
