What is AutoML?
Automated machine learning (AutoML) is the process of automating the manual tasks that data scientists must complete as they build and train machine learning models (ML models). These tasks include feature engineering and selection; choosing the type of machine learning algorithm; building an analytical model based on the algorithm; hyperparameter optimization; training the model on tested data sets; and running the model to generate scores and findings. Researchers developed AutoML to help data scientists build predictive models without needing deep ML model expertise. AutoML also frees data scientists from the rote tasks involved in building a machine learning pipeline, enabling them to focus on extracting the insights needed to solve important business problems.
What is AutoAI?
AutoAI is a variation of AutoML. It extends the automation of model building to the entire AI lifecycle. Like AutoML, AutoAI applies intelligent automation to the steps of building predictive machine learning models. These steps include preparing data sets for training; identifying the best type of model for the given data, such as a classification or regression model; and choosing the columns of data that best support the problem the model is solving, known as feature selection. Automation then tests a variety of hyperparameter tuning options to reach the best result as it generates, and then ranks, model-candidate pipelines based on metrics such as accuracy and precision. The best performing pipelines can be put into production to process new data and deliver predictions based on the model training.
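As a rough illustration of the ranking step described above, the following minimal sketch scores a few candidate pipelines on held-out data and sorts them by accuracy, then precision. It is not AutoAI's implementation: the threshold classifiers, the data and the function names (`rank_pipelines`, `accuracy`, `precision`) are all hypothetical stand-ins for trained pipelines.

```python
from typing import Callable, Dict, List, Tuple


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true: List[int], y_pred: List[int]) -> float:
    """True positives divided by all predicted positives (0.0 if none)."""
    predicted_pos = sum(1 for p in y_pred if p == 1)
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_pos / predicted_pos if predicted_pos else 0.0


def rank_pipelines(
    candidates: Dict[str, Callable[[float], int]],
    X: List[float],
    y: List[int],
) -> List[Tuple[str, float, float]]:
    """Score each candidate on held-out data; sort by accuracy, then precision."""
    scored = []
    for name, predict in candidates.items():
        y_pred = [predict(x) for x in X]
        scored.append((name, accuracy(y, y_pred), precision(y, y_pred)))
    return sorted(scored, key=lambda row: (row[1], row[2]), reverse=True)


# Hypothetical hold-out data: one numeric feature and a binary label.
X_holdout = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6]
y_holdout = [0, 0, 0, 1, 1, 1]

# Hypothetical candidates: simple threshold rules standing in for pipelines.
candidates = {
    "threshold_0.5": lambda x: int(x > 0.5),
    "threshold_0.3": lambda x: int(x > 0.3),
    "always_positive": lambda x: 1,
}

leaderboard = rank_pipelines(candidates, X_holdout, y_holdout)
for name, acc, prec in leaderboard:
    print(f"{name}: accuracy={acc:.2f}, precision={prec:.2f}")
```

In practice the leaderboard would be computed with cross-validation or a separate hold-out split, and the top entry would be the candidate promoted to production.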
Automated model deployment
Model testing and scoring
Debiasing and drift mitigation
Model risk management
AI lifecycle management
Any AI models
Advanced data refinery
Automatically build machine learning and AI models without deep data science expertise. Empower data scientists, developers, ML engineers and analysts to generate top-candidate model pipelines. Tackle skill set gaps and increase productivity for your machine learning projects.
Build custom AI and machine learning models in minutes or even seconds. Experiment, train and deploy models more rapidly at scale. Increase repeatability and governance of machine learning and AI model lifecycles while reducing mundane, time-consuming tasks.
Address explainability, fairness, robustness, transparency and privacy as part of the AI lifecycle. Mitigate model drift, bias and risk in AI and machine learning. Validate and monitor models to verify that AI and machine learning performance meets business goals. Help meet corporate social responsibility (CSR) and environmental, social and governance (ESG) goals.
Cut costs of AI and machine learning model operations (ModelOps) through unifying tools, processes and people. Reduce spend on managing legacy or point tools and infrastructures. Save time and resources to deliver production-ready models with automated AI and ML lifecycles.
Discover why IBM is recognized as a Leader in the 2021 Magic Quadrant for Data Science and Machine Learning
Apply various algorithms, or estimators, to analyze, clean and prepare raw data for machine learning. Automatically detect and categorize features based on data type, such as categorical or numerical. Use hyperparameter optimization to determine the best strategies for missing value imputation, feature encoding and feature scaling.
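The data preparation steps above can be sketched in a few lines of plain Python. This is an illustrative sketch only: it hard-codes one strategy per step (mean or most-frequent imputation, min-max scaling, one-hot encoding), whereas the text describes choosing those strategies through hyperparameter optimization. The table contents and all function names are assumptions made for the example.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Optional

Column = List[Optional[object]]


def detect_type(values: Column) -> str:
    """Label a column 'numerical' if every non-missing value is a number."""
    present = [v for v in values if v is not None]
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    return "numerical" if numeric else "categorical"


def impute(values: Column, kind: str) -> list:
    """Fill missing values with the mean (numerical) or the mode (categorical)."""
    present = [v for v in values if v is not None]
    if kind == "numerical":
        fill = mean(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values to the range [0, 1]; constant columns map to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def one_hot(values: list) -> Dict[str, List[int]]:
    """Create one 0/1 indicator column per distinct category."""
    return {c: [int(v == c) for v in values] for c in sorted(set(values))}


def prepare(table: Dict[str, Column]) -> Dict[str, list]:
    """Detect each column's type, impute it, then scale or encode it."""
    prepared: Dict[str, list] = {}
    for name, values in table.items():
        kind = detect_type(values)
        filled = impute(values, kind)
        if kind == "numerical":
            prepared[name] = min_max_scale(filled)
        else:
            for category, indicator in one_hot(filled).items():
                prepared[f"{name}={category}"] = indicator
    return prepared


# Hypothetical raw table with one missing value per column.
raw = {
    "age": [20, None, 40, 30],
    "plan": ["basic", "pro", None, "basic"],
}
prepared = prepare(raw)
print(prepared)
```

An automated system would treat the choice of imputation, encoding and scaling method as tunable options and keep whichever combination yields the best downstream model score.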
Select models through candidate algorithm testing and ranking against small subsets of the data. Gradually increase the size of the subset for the most promising algorithms. Enable ranking of a large number of candidate algorithms for model selection with the best match for the data.
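One way to picture this selection strategy is the sketch below, which trains every candidate on a small subset, drops the weaker half, and doubles the subset size for the survivors. It is a simplified stand-in, not the algorithm the product uses: the three toy learners, the synthetic data (generated from y = 2x + 1) and the halving schedule are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Model = Callable[[float], float]


def fit_mean(train: List[Point]) -> Model:
    """Predict the average target regardless of the input."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg


def fit_linear(train: List[Point]) -> Model:
    """Ordinary least squares for a single feature."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def fit_nearest(train: List[Point]) -> Model:
    """Predict the target of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def mse(model: Model, data: List[Point]) -> float:
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def select_model(
    learners: Dict[str, Callable[[List[Point]], Model]],
    train: List[Point],
    valid: List[Point],
    start: int = 4,
) -> str:
    """Keep the better half of the learners each round while doubling the data."""
    survivors = dict(learners)
    size = start
    while len(survivors) > 1:
        subset = train[:size]
        errors = {name: mse(fit(subset), valid) for name, fit in survivors.items()}
        ranked = sorted(errors, key=errors.get)
        keep = (len(ranked) + 1) // 2  # round up so 3 -> 2 -> 1
        survivors = {name: survivors[name] for name in ranked[:keep]}
        size = min(len(train), size * 2)
    return next(iter(survivors))


# Hypothetical data drawn from y = 2x + 1.
train = [(float(x), 2.0 * x + 1.0) for x in [0, 7, 3, 5, 1, 6, 2, 4]]
valid = [(x, 2.0 * x + 1.0) for x in [0.5, 2.5, 4.5, 6.5]]

learners = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
best = select_model(learners, train, valid)
print("selected:", best)
```

Because weak candidates are eliminated while the subsets are still small, most of the compute is spent on the algorithms that look most promising.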
Transform raw data into the combination of features that best represents the problem to achieve the most accurate prediction. Explore various feature construction choices in a structured, non-exhaustive manner, while progressively maximizing model accuracy using reinforcement learning.
Refine and optimize model pipelines using model training and scoring typical in machine learning. Choose the best model to put into production based on performance.
Integrate monitoring of model drift, fairness and quality through model input and output details, training data and payload logging. Implement passive or active debiasing, while analyzing direct and indirect bias.
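To make the monitoring idea concrete, here is a minimal sketch that computes two common checks from a payload log: a disparate impact ratio for fairness and an accuracy drop relative to a training baseline for drift. The record layout, the group names, the baseline accuracy of 0.9 and both alert thresholds (0.8 and 0.1) are assumptions chosen for the example, not values taken from any IBM product.

```python
from typing import Dict, List

Record = Dict[str, object]


def favorable_rate(records: List[Record], group: str) -> float:
    """Share of records in a group that received the favorable prediction (1)."""
    outcomes = [r["prediction"] for r in records if r["group"] == group]
    return sum(outcomes) / len(outcomes)


def disparate_impact(records: List[Record], monitored: str, reference: str) -> float:
    """Favorable rate of the monitored group divided by the reference group's.

    Assumes the reference group has at least one favorable outcome.
    """
    return favorable_rate(records, monitored) / favorable_rate(records, reference)


def accuracy(records: List[Record]) -> float:
    """Fraction of logged predictions that match the later-observed label."""
    return sum(r["prediction"] == r["label"] for r in records) / len(records)


def accuracy_drift(baseline_accuracy: float, records: List[Record]) -> float:
    """Drop in accuracy on recent payload data compared with the baseline."""
    return baseline_accuracy - accuracy(records)


# Hypothetical payload log: model inputs reduced to a group attribute,
# the model's prediction, and the label observed afterwards.
payload_log = [
    {"group": "A", "prediction": 1, "label": 1},
    {"group": "A", "prediction": 1, "label": 1},
    {"group": "A", "prediction": 1, "label": 0},
    {"group": "A", "prediction": 0, "label": 0},
    {"group": "B", "prediction": 1, "label": 1},
    {"group": "B", "prediction": 0, "label": 1},
    {"group": "B", "prediction": 0, "label": 0},
    {"group": "B", "prediction": 0, "label": 0},
]

di = disparate_impact(payload_log, monitored="B", reference="A")
drift = accuracy_drift(baseline_accuracy=0.9, records=payload_log)

# Assumed alert thresholds for this sketch.
fairness_alert = di < 0.8
drift_alert = drift > 0.1
print(f"disparate impact={di:.2f}, accuracy drift={drift:.2f}")
```

A production monitor would compute these values on a schedule over a sliding window of logged payloads and trigger debiasing or retraining when an alert fires.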
Extend with model and data insights and validate whether your models meet your expected performance. Continuously improve your models by measuring model quality and comparing model performance.
See the benefits gained by this bank using IBM Cloud Pak for Data to analyze data, assess data drift and measure model performance.
Learn how this healthcare network built a predictive model that uses insurance claims data to identify patients likely to develop sepsis.
Learn how this marketing communications agency uses AutoAI to drive high-volume predictions and identify new customers.
An IBM Research team is committed to applying state-of-the-art techniques from AI, ML and data management to accelerate and optimize the creation of machine learning and data science workflows. The team’s first efforts around AutoML focused on using Hyperband and Bayesian optimization for hyperparameter search and Hyperband, ENAS and DARTS for neural architecture search.
They have continued to focus on AutoAI development, including automation of the pipeline configuration and hyperparameter optimization. A significant enhancement is the hyperparameter optimization algorithm, which is optimized for costly function evaluations such as model training and scoring. This helps to expedite convergence to the best solution.
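The budgeted search idea behind Hyperband, which the team's early AutoML work drew on, can be sketched with successive halving, the routine that Hyperband runs repeatedly with different starting budgets. This is a toy example rather than IBM's optimizer: the objective (gradient descent on (w - 3)^2), the candidate learning rates and the function names are hypothetical, and a full Hyperband implementation would run several such brackets.

```python
from typing import List


def train_loss(learning_rate: float, steps: int) -> float:
    """Run `steps` of gradient descent on (w - 3)^2 and return the final loss.

    This stands in for an expensive train-and-score evaluation.
    """
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * 2.0 * (w - 3.0)
    return (w - 3.0) ** 2


def successive_halving(configs: List[float], min_budget: int = 2, eta: int = 2) -> float:
    """Evaluate all configs cheaply, keep the best 1/eta, and raise the budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda lr: train_loss(lr, budget))
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


# Hypothetical learning-rate candidates, including ones that stall or diverge.
candidates = [0.001, 0.01, 0.1, 0.5, 1.0, 1.2, 0.3, 0.05]
best = successive_halving(candidates)
print("best learning rate:", best)
```

Poor configurations are discarded after only a few training steps, so the larger budgets are reserved for the handful of settings that remain.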
IBM Research is also applying automated artificial intelligence to help ensure trust and explainability in AI models. With AutoAI in IBM Watson Studio, users see visualizations of each stage of the process, from data preparation, to algorithm selection, to model creation. Additionally, IBM AutoAI automates the tasks for continuous improvement of the model and makes it easier to integrate AI model APIs into applications through its ModelOps capabilities. The evolution of AutoAI within the IBM Watson Studio product contributed to IBM being named a Leader in the 2021 Gartner Magic Quadrant for Data Science and Machine Learning Platforms.
Learn how AutoAI in IBM Watson Studio helps you build and scale AI machine learning models.
Learn how to build and evaluate machine learning models by using the AutoAI feature in IBM Watson Studio.
Learn how DevOps, ModelOps and DataOps fit together.
Deep learning is a subfield of machine learning and is known for powering AI applications and services that perform analytical and physical tasks without human intervention. Example use cases for deep learning include chatbots, medical image recognition technologies and fraud detection. However, as with machine learning, designing and running a deep learning algorithm requires a tremendous amount of human effort as well as compute power.
The IBM Research team has explored one of the most complex and time-consuming processes in deep learning: the creation of the neural architecture through a technique called neural architecture search (NAS). The team reviewed the NAS methods developed and presented the benefits of each with a goal of helping practitioners choose an appropriate method. Automating the approach to finding the best-performing architecture for a machine learning model can lead to greater democratization of AI, but the issue is complex and difficult to solve.
With the Deep Learning service within IBM Watson Studio, you can still get started with deep learning quickly. The service helps you design complex neural networks and then experiment at scale to deploy an optimized machine learning model. Designed to simplify the process of training models, the service also provides an on-demand GPU compute cluster to address compute power requirements. You can also integrate popular open source ML frameworks such as TensorFlow, Caffe, Torch and Chainer to train models on multiple GPUs and accelerate results. On IBM Watson Studio, you can combine AutoML, IBM AutoAI, and the Deep Learning service to accelerate experimentation, analyze structured and unstructured data, and deploy better models faster.
The demand for AutoML has led to the development of open source software that can be used by data science experts and non-experts. Leading open source tools include Auto-sklearn, AutoKeras and Auto-WEKA. IBM Research contributes to Lale, a Python library that extends the capabilities of scikit-learn to support a broad spectrum of automation, including algorithm selection, hyperparameter tuning and topology search. As described in a paper from IBM Research, Lale works by automatically generating search spaces for established AutoML tools. Experiments show these search spaces achieve results competitive with state-of-the-art tools while offering more versatility.