A machine learning algorithm is a set of rules or processes used by an AI system to conduct tasks—most often to discover new data insights and patterns, or to predict output values from a given set of input variables. Algorithms enable machine learning (ML) to learn.

Industry analysts agree on the importance of machine learning and its underlying algorithms. According to Forrester, “Advancements in machine-learning algorithms bring precision and depth to marketing data analysis that helps marketers understand how marketing details—such as platform, creative, call to action, or messaging—impact marketing performance.^{1}” Gartner, for its part, states that “Machine learning is at the core of many successful AI applications, fueling its enormous traction in the market.^{2}”

Most often, training ML algorithms on more data will provide more accurate answers than training on less data. Using statistical methods, algorithms are trained to determine classifications or make predictions, and to uncover key insights in data mining projects. These insights can subsequently improve your decision-making to boost key growth metrics.

Use cases for machine learning algorithms include the ability to analyze data to identify trends and predict issues before they occur.^{3} More advanced AI can enable more personalized support, reduce response times, provide speech recognition and improve customer satisfaction. The industries that particularly benefit from machine learning algorithms to create new content from vast amounts of data include supply chain management, transportation and logistics, retail and manufacturing^{4}—all embracing generative AI, with its ability to automate tasks, enhance efficiency and provide valuable insights, even to beginners.


Deep learning is a specific application of the advanced functions provided by machine learning algorithms. The distinction is in how each algorithm learns. "Deep" machine learning models can use labeled datasets (an approach known as supervised learning) to inform their algorithms, but they don’t necessarily require labeled data. Deep learning can ingest unstructured data in its raw form (such as text or images) and automatically determine the set of features that distinguish different categories of data from one another. This eliminates some of the human intervention required and enables the use of larger data sets.

The easiest way to think about artificial intelligence, machine learning, deep learning and neural networks is to think of them as a series of AI systems from largest to smallest, each encompassing the next. Artificial intelligence (AI) is the overarching system. Machine learning is a subset of AI. Deep learning is a subfield of machine learning, and neural networks make up the backbone of deep learning algorithms. What distinguishes a deep learning algorithm from a single neural network is the number of node layers, or depth: a deep learning algorithm must have more than three layers.

A paper from UC Berkeley breaks out the learning system of a machine learning algorithm into three main parts.^{5}

1. **A decision process**: In general, machine learning algorithms are used to make a prediction or classification. Based on some input data, which can be labeled or unlabeled, your algorithm will produce an estimate about a pattern in the data.

2. **An error function**: An error function evaluates the prediction of the model. If there are known examples, an error function can make a comparison to assess the accuracy of the model.

3. **A model optimization process**: If the model can fit better to the data points in the training set, then weights are adjusted to reduce the discrepancy between the known example and the model estimate. The algorithm will repeat this “evaluate and optimize” process, updating weights autonomously until a threshold of accuracy has been met.
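
The three parts above can be sketched in a few lines of Python. This is a minimal illustration, not a production training loop: the toy data, the single-weight model, the learning rate and the stopping tolerance are all assumptions chosen to keep the example small.

```python
# Toy data (assumed for illustration): y is roughly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def predict(weight, x):
    """Decision process: produce an estimate from an input."""
    return weight * x


def mean_squared_error(weight):
    """Error function: compare the estimates with the known examples."""
    return sum((predict(weight, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(learning_rate=0.01, tolerance=1e-9, max_steps=10_000):
    """Optimization process: adjust the weight until the error stops improving."""
    weight = 0.0
    previous_error = mean_squared_error(weight)
    for _ in range(max_steps):
        # Gradient of the mean squared error with respect to the weight.
        gradient = sum(2 * (predict(weight, x) - y) * x for x, y in zip(xs, ys)) / len(xs)
        weight -= learning_rate * gradient
        error = mean_squared_error(weight)
        if abs(previous_error - error) < tolerance:
            break  # the "evaluate and optimize" loop has reached its threshold
        previous_error = error
    return weight


learned_weight = train()
```

With this data, `learned_weight` settles close to 2, the slope that best explains the examples.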

Supervised learning in particular uses a training set to teach models to yield the desired output. This training dataset includes inputs and correct outputs, which enables the model to learn over time. The algorithm measures its accuracy through the loss function, adjusting until the error has been sufficiently minimized.

There are four types of machine learning algorithms: supervised, unsupervised, semi-supervised, and reinforcement. Depending on your budget, need for speed and precision required, each type and variant has its own advantages. Advanced machine learning algorithms require multiple technologies—including deep learning, neural networks and natural language processing—and are able to use both unsupervised and supervised learning.^{6} The following are the most popular and commonly used algorithms.

Supervised learning can be separated into two types of problems when data mining: classification and regression.

**Classification** uses an algorithm to accurately assign test data into specific categories. It recognizes specific entities within the dataset and attempts to draw some conclusions on how those entities should be labeled or defined. Common classification algorithms are linear classifiers, support vector machines (SVM), decision trees, K-nearest neighbor and random forest, which are described in more detail below.

**Regression** is used to understand the relationship between dependent and independent variables. It is commonly used to make projections, such as sales revenue for a given business. Linear regression, logistic regression and polynomial regression are popular regression algorithms.

Various algorithms and computation techniques are used in supervised machine learning processes, often implemented in programming languages such as Python. Supervised learning algorithms include:

**AdaBoost or gradient boosting**: Both are boosting techniques; AdaBoost is also called adaptive boosting^{7}. Boosting improves an underperforming regression algorithm by combining it with other weak learners to create a stronger algorithm that makes fewer errors. In this way, boosting combines the forecasting power of several base estimators.

**Artificial neural networks**: Also known as ANNs, neural networks or simulated neural networks (SNNs), these are a subset of machine learning techniques and are at the heart of deep learning algorithms. The learner algorithm recognizes patterns in input data using building blocks called neurons, which approximate the neurons in the human brain and are trained and modified over time. (More in “neural networks.”)

**Decision tree algorithms**: Used for both predicting numerical values (regression problems) and classifying data into categories, decision trees use a branching sequence of linked decisions that may be represented with a tree diagram. One of the advantages of decision trees is that they are easy to validate and audit, unlike the black box of a neural network.
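
To show what one "linked decision" looks like, here is a rough sketch of how a single split might be chosen. It assumes Gini impurity as the split criterion and a single numeric feature; the helper names `gini` and `best_split` and the sample data are hypothetical, and a full tree would apply the same step recursively to each branch.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return the threshold on one numeric feature with the lowest weighted impurity."""
    best_threshold, best_score = None, float("inf")
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [label for v, label in zip(values, labels) if v <= threshold]
        right = [label for v, label in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical data: customer age and whether the customer bought a product.
ages = [22, 25, 30, 35, 42, 50]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, bought)
```

Because the chosen rule is an explicit threshold (here "age at most 32.5"), it can be read and audited directly, which is the advantage the paragraph describes.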

**Dimensionality reduction**: When a selected data set has a high number of features^{7}, it has high dimensionality. Dimensionality reduction then cuts down the number of features, leaving only the most meaningful insights or information. An example is principal component analysis.

**K-nearest neighbor**: Also known as KNN, this non-parametric algorithm classifies data points based on their proximity and association to other available data. It assumes that similar data points can be found near each other. As a result, it calculates the distance between data points, usually the Euclidean distance, and then assigns a category based on the most frequent category among the nearest neighbors (for classification) or their average (for regression).
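
A minimal sketch of KNN classification follows, assuming Euclidean distance and a simple majority vote. The function name `knn_classify`, the value `k=3` and the sample points are illustrative choices, not part of any particular library.

```python
import math
from collections import Counter


def knn_classify(training_points, training_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(training_points, training_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature data with two categories.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 5.0)]
labels = ["small", "small", "small", "large", "large", "large"]

near_origin = knn_classify(points, labels, (1.2, 1.4))
far_away = knn_classify(points, labels, (6.4, 6.0))
```

There is no training step: the stored examples are the model, which is why the algorithm is described as non-parametric.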

**Linear regression**: Linear regression is used to identify the relationship between a dependent variable and one or more independent variables and is typically leveraged to make predictions about future outcomes. When there is only one independent variable and one dependent variable, it is known as simple linear regression.
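
For simple linear regression, the best-fitting line can be computed directly with ordinary least squares. The sketch below assumes one independent variable; the function name and the monthly revenue figures are made up for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: month number and revenue for that month.
months = [1, 2, 3, 4, 5]
revenue = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_simple_linear_regression(months, revenue)
forecast_month_6 = slope * 6 + intercept  # projection for a future month
```

The fitted slope and intercept describe the relationship between the two variables, and the same line is then used to project a future outcome.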

**Logistic regression**: While linear regression is leveraged when dependent variables are continuous, logistic regression is selected when the dependent variable is categorical, meaning there are binary outputs, such as "true" and "false" or "yes" and "no." While both regression models seek to understand relationships between data inputs, logistic regression is mainly used to solve binary classification problems, such as spam identification.
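
As a rough sketch of binary classification with logistic regression, the example below fits one weight and one bias by gradient descent on the log loss. The single feature (a count of suspicious words per email), the learning rate and the step count are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.1, steps=5000):
    """Fit weight and bias by gradient descent on the average log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        weight -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        bias -= learning_rate * sum(errors) / n
    return weight, bias


# Hypothetical data: suspicious-word count per email; label 1 means spam.
counts = [0, 1, 2, 3, 6, 7, 8, 9]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]
weight, bias = train_logistic_regression(counts, is_spam)


def predict_spam(count):
    """Classify as spam when the predicted probability is at least 0.5."""
    return sigmoid(weight * count + bias) >= 0.5
```

The model outputs a probability, and the 0.5 cutoff turns it into the binary "yes" or "no" answer described above.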

**Neural networks**: Primarily leveraged for deep learning algorithms, neural networks process the input training data by mimicking the interconnectivity of the human brain through layers of nodes. Each node is made up of inputs, weights, a bias (threshold) and an output. If that output value exceeds a given threshold, it “fires” or activates the node, passing data to the next layer in the network. Neural networks learn from adjustments based on the loss function through the process of gradient descent. When the cost function is at or near zero, the model fits the training data closely; its accuracy on new data should still be confirmed with examples that were held out of training.
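
The behavior of a single node can be sketched as follows. This assumes a simple step activation, where the bias plays the role of the firing threshold; the weights and inputs are hypothetical.

```python
def node_output(inputs, weights, bias):
    """Return 1 if the weighted sum plus bias is positive (the node fires), else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0


# Hypothetical weights; a bias of -0.5 acts as a firing threshold of 0.5.
weights = [0.6, 0.4, 0.3]
fires = node_output([1, 0, 1], weights, bias=-0.5)        # weighted sum is about 0.9
stays_quiet = node_output([0, 1, 0], weights, bias=-0.5)  # weighted sum is 0.4
```

Networks trained with gradient descent usually replace the hard step with a smooth activation function so that the loss can be differentiated with respect to the weights.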

**Naïve Bayes**: This approach applies Bayes’ theorem together with the principle of class conditional independence. This means that the presence of one feature does not impact the presence of another in the probability of a given outcome, and each predictor has an equal effect on that result. There are three types of Naïve Bayes classifiers: Multinomial Naïve Bayes, Bernoulli Naïve Bayes and Gaussian Naïve Bayes. This technique is primarily used in text classification, spam identification and recommendation systems.
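
Here is a small sketch of a multinomial Naïve Bayes text classifier. It assumes whitespace tokenization and add-one (Laplace) smoothing; the function names and the four training messages are invented for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count documents per class and words per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(documents, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict_naive_bayes(model, text):
    """Return the class with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_documents = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        # Log prior plus a sum of log likelihoods: words are treated as independent.
        score = math.log(count / total_documents)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training messages.
documents = ["win money now", "free prize win", "meeting agenda attached", "lunch meeting tomorrow"]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(documents, labels)

first_guess = predict_naive_bayes(model, "win a free prize")
second_guess = predict_naive_bayes(model, "agenda for the meeting")
```

Summing per-word log likelihoods is exactly where the independence assumption enters: each word contributes to the score without regard to the others.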

**Random forests**: In a random forest, the machine learning algorithm predicts a value or category by combining the results from a number of decision trees. The "forest" refers to uncorrelated decision trees, which are assembled to reduce variance and enable more accurate predictions.

**Support vector machines (SVM)**: This algorithm may be used for both data classification and regression, but it is typically used for classification problems, where it constructs a hyperplane that maximizes the distance (the margin) between two classes of data points. This hyperplane is known as the decision boundary, separating the classes of data points (such as oranges vs. apples) on either side of the plane.

Unlike supervised learning, unsupervised learning uses unlabeled data. From that data, the algorithm discovers patterns that help solve clustering or association problems. This is particularly useful when subject matter experts are unsure of common properties within a data set. Common clustering algorithms are hierarchical clustering, K-means and Gaussian mixture models. Dimensionality reduction methods such as PCA and t-SNE are also commonly applied to unlabeled data.

**Clustering**: These algorithms can identify patterns in data so that it can be grouped. Algorithms can help data scientists by identifying differences between data items that humans have overlooked.

**Hierarchical clustering**: This groups data into a tree of clusters^{8}. Hierarchical clustering begins by treating every data point as a separate cluster. Then, it repeatedly executes these steps: 1) identify the two clusters that are closest together, and 2) merge those two clusters. These steps continue until all the clusters are merged together.
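
The merge loop can be sketched as below. This version assumes one-dimensional points and single linkage (the distance between two clusters is the distance between their closest members); the `target_clusters` parameter is a hypothetical convenience for stopping early instead of merging everything into one cluster.

```python
def agglomerative_clustering(points, target_clusters=1):
    """Repeatedly merge the two closest clusters and record each merge."""
    clusters = [[p] for p in points]  # start with every point as its own cluster
    merges = []
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                distance = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        _, i, j = best
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is still valid after the deletion
        clusters[i] = merged
    return clusters, merges


# Hypothetical measurements along a single axis.
points = [1.0, 1.5, 5.0, 5.2, 9.0]
clusters, merges = agglomerative_clustering(points, target_clusters=3)
```

The recorded `merges` list is the tree of clusters in order of construction; stopping at three clusters corresponds to cutting that tree at one level.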

**K-means clustering**: This identifies groups within data without labels^{9} by partitioning the data into clusters whose members are similar to one another. The name “K-means” comes from the $k$ centroids that it uses to define clusters. A point is assigned to a particular cluster if it is closer to that cluster's centroid than any other centroid.
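
A compact sketch of the assignment and update steps follows. It assumes the caller supplies the starting centroids (real implementations usually pick them randomly or with a seeding heuristic), and the sample points are hypothetical.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids until stable."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        groups = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
            groups[nearest].append(point)
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for group, old in zip(groups, centroids):
            if group:
                new_centroids.append(tuple(sum(axis) / len(group) for axis in zip(*group)))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, groups


# Hypothetical two-feature data with two visible groups, so k = 2.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, groups = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the starting centroids, so in practice the algorithm is often run several times with different starting points.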

**Semi-supervised learning algorithms**

In this case, learning occurs when only part of the given input data has been labeled—giving the algorithm a bit of a “head start.” This approach can combine the best of both worlds^{10}—improved accuracy associated with supervised machine learning and the ability to make use of cost-effective unlabeled data, as in the case of unsupervised machine learning.

**Reinforcement algorithms**

In this case, the algorithms are trained much as humans learn, through rewards and penalties. These are measured and tracked by a reinforcement learning agent^{11}, which has a general understanding of the probability of successfully moving the score up versus moving it down. Through trial and error, the agent learns to take actions that lead to the most favorable outcomes over time. Reinforcement learning is often used^{12} in resource management, robotics and video games.
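
As one possible illustration of learning from rewards and penalties, the sketch below uses tabular Q-learning, a common reinforcement learning method chosen here for the example rather than taken from the sources above. The five-cell corridor environment, the reward values and the hyperparameters are all assumptions.

```python
import random

N_STATES = 5        # corridor cells 0..4; the agent starts in cell 0
GOAL = N_STATES - 1  # reaching cell 4 ends the episode
ACTIONS = (-1, +1)   # action 0 moves left, action 1 moves right


def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    if next_state == GOAL:
        return next_state, 1.0, True   # reward for reaching the goal
    return next_state, -0.01, False    # small penalty for every other move


def train_agent(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _move in range(200):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore: try a random action
            else:
                action = max((0, 1), key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(q[next_state])
            # Nudge the estimate toward reward plus discounted future value.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train_agent()
policy = ["right" if right > left else "left" for left, right in q_table[:GOAL]]
```

After training, the learned policy is to move right from every cell, since that path collects the reward with the fewest penalties.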


**All footnote links below reside outside of IBM.**

1 Forrester: Use Marketing Analytics To Support Your 2023 Marketing Strategy

2 Gartner: What Is Artificial Intelligence?

3 Gartner Peer Community: How will AI help facilitate desk and IT support teams?

4 IDC: Generative AI: Exploring Trends and Use Cases Across Asia/Pacific Supply Chains

5 Berkeley School of Information: What Is Machine Learning (ML)?

6 Gartner Glossary: Machine Learning

7 TechTarget: What are machine learning algorithms?

8 GeeksforGeeks: Hierarchical Clustering in Data Mining

9 Stanford University: K Means

10 Booz Allen: How do machines learn?

11 G2: Reinforcement Learning: How Machines Learn From Their Mistakes

12 TechTarget: What is machine learning and how does it work?