What is predictive analytics?

Predictive analytics is a branch of advanced analytics that makes predictions about future outcomes using historical data combined with statistical modeling, data mining techniques and machine learning. Companies employ predictive analytics to find patterns in this data to identify risks and opportunities.

Predictive analytics is often associated with big data and data science. Companies today are swimming in data that resides across transactional databases, equipment log files, images, video, sensors or other data sources. To gain insights from this data, data scientists use deep learning and machine learning algorithms to find patterns and make predictions about future events. These algorithms include linear and nonlinear regression, neural networks, support vector machines and decision trees. The insights gained through predictive analytics can then feed prescriptive analytics, which recommends actions based on those predictions.

IBM offers a set of software tools to help you more easily and quickly build scalable predictive models. These tools can also be run on IBM Cloud Pak® for Data, a containerized data and AI platform that enables you to build and run models anywhere — on any cloud and on premises.


A flexible platform for building predictive models


Automate data science and data engineering tasks. Train, test and deploy models seamlessly across multiple enterprise applications. Extend common data science capabilities across hybrid, multicloud environments.


Harness pre-built applications and pre-trained models. Help data science and business teams collaborate and streamline model building with state-of-the-art IBM and open source software.


Use a central platform to manage the entire data science lifecycle. Standardize development and deployment processes. Create a single framework for data governance and security across the organization.

Predictive analytics tools from IBM

Data science platform

IBM Watson® Studio helps operationalize AI by providing the tools to prepare data and build models anywhere using open source code or visual modeling.

Statistical analysis software

IBM® SPSS® Statistics is designed to solve business and research problems using ad hoc analysis, hypothesis testing, geospatial analysis and predictive analytics.

Visual modeling tool

The IBM SPSS Modeler solution can help you tap into data assets and modern applications, with complete algorithms and models that are ready for immediate use.

Decision optimization solutions

IBM Decision Optimization optimizes outcomes by offering prescriptive analytics capabilities to augment predictive insights from machine learning models.

Predictive analytics examples

Explore industry use cases


Financial services use machine learning and quantitative tools to predict credit risk and detect fraud.


Predictive analytics in health care is used to detect and manage the care of chronically ill patients.

Human resources (HR)

HR teams use predictive analytics to identify and hire employees, determine labor markets and predict an employee’s performance level.

Marketing and sales

Predictive analytics can be used for marketing campaigns throughout the customer lifecycle and in cross-sell strategies.


Retailers use predictive analytics to identify product recommendations, forecast sales, analyze markets and manage seasonal inventory.

Supply chain

Businesses use predictive analytics to make inventory management more efficient, helping to meet demand while minimizing stock.

Dive deeper on predictive analytics

Model types and more

Types of predictive modeling

Data science and analytics teams leverage three types of predictive models: predictive modeling, descriptive modeling, and decision-making modeling.

Predictive modeling
Predictive modeling uses statistics to predict outcomes. The goal is to assess the likelihood that a similar unit in a different sample will exhibit similar performance. Predictive modeling can be used to predict a customer’s behavior, such as his or her credit risk.

Descriptive modeling
Descriptive modeling describes relationships within a given dataset, and it is primarily used to classify customers or prospects into groups for segmentation purposes. This type of modeling focuses on identifying different relationships between customers and products, such as by product preferences and life stage.

Decision-making modeling
Decision-making modeling describes the relationship between elements in a decision, such as the data, the decision, and the forecasted results, to predict the results. This type can be used to maximize certain outcomes while minimizing others.
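As a rough illustration of decision-making modeling, the sketch below picks the action with the best forecasted result. Everything in it is hypothetical: the price, the unit cost, the discount options and the purchase probabilities, which are assumed to come from some upstream predictive model. It maximizes expected profit, which is only one of many possible objectives.

```python
# Hypothetical inputs: each candidate discount is paired with a purchase
# probability that an upstream predictive model is assumed to have produced.
PRICE = 100.0      # list price per unit (assumed)
UNIT_COST = 60.0   # cost to supply one unit (assumed)
predicted_purchase_prob = {0.00: 0.10, 0.10: 0.18, 0.20: 0.25}


def expected_profit(discount, purchase_prob):
    """Expected profit per customer who is offered this discount."""
    margin = PRICE * (1.0 - discount) - UNIT_COST
    return purchase_prob * margin


def best_discount(options):
    """Return the discount whose expected profit is highest."""
    return max(options, key=lambda d: expected_profit(d, options[d]))


chosen = best_discount(predicted_purchase_prob)
print(f"Offer a {chosen:.0%} discount")
```

With these made-up numbers, the deeper discount wins more customers but gives away too much margin, so the middle option comes out ahead.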

Popular predictive analytics models

Predictive analytics models are designed to assess historical data, discover patterns, observe trends, and use that information to predict future trends. Popular predictive analytics models include classification, clustering, forecast, outliers, and time series, which are described in more detail below.

Classification models
Classification models fall under supervised machine learning. They place data into categories based on what was learned from historical data. These models are commonly used to answer questions with binary outputs, such as yes or no, or true or false. Types of classification models include logistic regression, decision trees, random forest, neural networks, and Naïve Bayes.
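To make the yes-or-no idea concrete, here is a minimal sketch of logistic regression fitted by batch gradient descent in plain Python. The data (late payments versus default) is invented for illustration, and the function names, learning rate and epoch count are assumptions rather than a recommended setup. In practice a library implementation would normally be used instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * (x - mean) + b) by batch gradient descent."""
    n = len(xs)
    mean_x = sum(xs) / n
    centered = [x - mean_x for x in xs]  # centering helps gradient descent converge
    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(centered, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, centered)) / n
        b -= lr * sum(errors) / n
    return w, b, mean_x


def predict_proba(model, x):
    """Predicted probability of the positive class for one input value."""
    w, b, mean_x = model
    return sigmoid(w * (x - mean_x) + b)


# Hypothetical history: late payments in the past year -> defaulted (1) or not (0)
late_payments = [1, 2, 3, 4, 5, 6, 7, 8]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(late_payments, defaulted)
for x in (2, 7):
    answer = "yes" if predict_proba(model, x) >= 0.5 else "no"
    print(f"{x} late payments -> likely to default? {answer}")
```

The 0.5 cutoff is a convention, not a rule. A lender that is more worried about missed defaults than about rejected applicants might choose a lower threshold.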

Clustering models
Clustering models are categorized under unsupervised learning. They sort data into groups based on similar attributes. For example, an e-commerce site can use the model to separate customers into similar groups based on common features and develop marketing strategies for each group. Common clustering algorithms include k-means clustering, mean-shift clustering, density-based spatial clustering of applications with noise (DBSCAN), expectation-maximization (EM) clustering using Gaussian Mixture Models (GMM), and hierarchical clustering. 
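The customer-grouping example can be sketched with a bare-bones k-means. The customer data below is invented, and the starting centroids are fixed by hand so that the result is repeatable. That is an assumption made for the example, since real implementations usually pick starting points at random and often rescale features first.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else old
            for g, old in zip(groups, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers: (orders per year, average basket value in dollars)
customers = [(2, 20), (3, 25), (1, 15), (20, 90), (22, 110), (18, 100)]

centroids, labels = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)
```

Each label is a group number, so the marketing team in the example could target group 0 (occasional, low-spend shoppers) differently from group 1.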

Forecast models
Forecast models deal with metric value prediction, estimating a numeric value for new data based on trends in historical data. For example, a call center can use a forecast model to predict how many calls it will receive per hour. Time series and econometric models are examples of forecasting models.
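One simple way to estimate a numeric value from a historical trend is a least-squares line. The sketch below fits one to invented call-center figures and extends it by one week. The data and names are hypothetical, and a straight line is an assumption that real call volumes may not satisfy.

```python
def fit_trend(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical history: average calls per hour in each of the last five weeks
weeks = [1, 2, 3, 4, 5]
calls_per_hour = [102, 108, 121, 129, 140]

slope, intercept = fit_trend(weeks, calls_per_hour)
forecast_week_6 = intercept + slope * 6
print(round(forecast_week_6, 1))
```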

Outliers models
Outliers models deal with anomalous data entries in a dataset. For example, insurance companies can use them for fraud detection, flagging anomalous entries within a list of transactions. Popular methods for outlier detection include extreme value analysis, probabilistic and statistical modeling, linear regression, proximity-based modeling, and information theory modeling.
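As one possible form of extreme value analysis, the sketch below flags entries using a modified z-score built on the median absolute deviation, which is less distorted by the outliers themselves than a mean-based score. The claim amounts are invented, and the 3.5 cutoff is a commonly quoted rule of thumb rather than anything the methods above require.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds the threshold."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread around the median, so this score is undefined
    return [v for v in values if 0.6745 * abs(v - center) / mad > threshold]


# Hypothetical claim amounts in dollars; one entry is far from the rest
claims = [52, 48, 50, 47, 51, 49, 53, 500]
print(flag_outliers(claims))
```

A flagged entry is a candidate for review, not proof of fraud. Deciding what happens next is a business rule outside the model.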

Time-series models
Time-series models work on a sequence of data points, using time as the input parameter. Such a model can take the last year of data, calculate a numerical metric, and use that metric to predict the next three to six weeks of data. For example, a hospital can use the model to predict emergency room capacity based on the number of patients who showed up in the past six weeks.
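A very small instance of this idea is a moving-average forecast: the numerical metric is the mean of the most recent observations, and that value is carried forward as the prediction. The visit counts, window and horizon below are invented for illustration. Real time-series models usually also account for trend and seasonality, which this sketch ignores.

```python
def moving_average_forecast(history, window=6, horizon=3):
    """Forecast each of the next `horizon` periods as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("not enough history for the chosen window")
    metric = sum(history[-window:]) / window
    return [metric] * horizon


# Hypothetical weekly emergency room visits, oldest first
weekly_visits = [280, 310, 295, 320, 305, 330, 300]
print(moving_average_forecast(weekly_visits))
```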

Predictive analytics process

Predictive analytics begins with a business goal, such as to reduce waste, save time or cut costs. The process uses models to harness massive data sets to generate outcomes that support that goal.

As an example, the predictive analytics process for predicting sales revenue follows these basic steps.

  1. Import data from a variety of sources. These data sources include product sales, marketing budgets, and national GDP.
  2. Clean the data by removing outliers (e.g., data spikes), handling missing data, and aggregating. A single table could be used to aggregate different types of data, such as product sales, marketing budgets, and national GDP.
  3. Develop a predictive model, ensuring an appropriate fit to the data. For example, neural networks could be used to build and train a predictive model for revenue forecasting.
  4. Deploy the model into a production environment, where it can be accessed through other applications.
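The four steps above can be sketched end to end in a few lines. This is a toy under stated assumptions: the two data sources are hard-coded dictionaries with invented figures, national GDP is left out for brevity, a least-squares line stands in for the neural network mentioned in step 3, and "deployment" is reduced to returning a function that other code can call.

```python
from statistics import median

# Step 1: import. Two hypothetical sources keyed by month (values in $1,000s).
sales = {"01": 120.0, "02": 130.0, "03": None, "04": 150.0, "05": 9000.0, "06": 170.0}
marketing = {"01": 10.0, "02": 11.0, "03": 12.0, "04": 13.0, "05": 14.0, "06": 15.0}


def build_table(sales, marketing):
    """Aggregate both sources into one table of (month, budget, revenue) rows."""
    return [(m, marketing[m], sales[m]) for m in sorted(sales) if m in marketing]


def clean(rows, spike_factor=5.0):
    """Step 2: drop rows with missing values, then rows whose revenue spikes."""
    complete = [r for r in rows if r[1] is not None and r[2] is not None]
    typical = median(r[2] for r in complete)
    return [r for r in complete if r[2] <= spike_factor * typical]


def fit_line(rows):
    """Step 3: least-squares fit of revenue = intercept + slope * budget."""
    xs = [r[1] for r in rows]
    ys = [r[2] for r in rows]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_predictor(slope, intercept):
    """Step 4: wrap the fitted model in a function other code can call."""
    return lambda budget: intercept + slope * budget


rows = clean(build_table(sales, marketing))
predict_revenue = make_predictor(*fit_line(rows))
print(predict_revenue(16.0))
```

The cleaning step removes month 03 (missing revenue) and month 05 (a spike far above the typical month), so the model is fitted on the four remaining rows.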

Implementing a predictive analytics program

The use of predictive analytics is a key milestone on your analytics journey — a point of confluence where classical statistical analysis meets the new world of artificial intelligence (AI). Today’s unprecedented convergence of intuitive tools, new predictive techniques and hybrid cloud deployment models makes predictive analytics and modeling more accessible than ever before. For the first time, organizations of all sizes can have the tools to embed predictive analytics into their business processes and to harness AI at scale.

The evolution to an enterprise data science program can deliver significant competitive advantages. The typical steps in that evolution are:

Phase 1: Getting started

When a business begins building its data science capabilities, it usually starts with ad hoc projects, such as developing models to answer specific questions or support research projects. With solutions such as IBM Watson Studio Desktop, data scientists can work 24x7 on their own computers or laptops and sync up with a wider team when needed.

Phase 2: Adoption growth

As data science is adopted more widely across the business, different departments need to deploy their models, connect them to data sources and infuse them into production applications. IBM Watson Studio and IBM Watson Machine Learning make it easier for departmental data science and IT teams to collaborate across this lifecycle.

Phase 3: Enterprise-scale adoption

Once AI is embedded into business-critical processes, organizations need to build a central platform to manage and govern models and data. IBM Cloud Pak for Data can provide the infrastructure and tools required for a comprehensive, multicloud platform that acts as a single point of control.

Get hands-on experience

Code patterns and tutorials

Create and deploy a scoring model to predict heartrate failure

Predict equipment failure using IoT sensor data

Analyze open medical datasets to gain insights

Perform a machine learning exercise

Shape and refine raw data for predictive analysis

Create a scoring algorithm

Get started

Begin your predictive analytics journey