How does machine learning work?

Machine learning follows a process of preparing data, training an algorithm and generating a machine learning model, and then making and refining predictions.

Preparing the data

Machine learning requires data that is analyzed, formatted and conditioned to build a machine learning model. Judith Hurwitz and Daniel Kirsch, authors of Machine Learning For Dummies, advise that “machine learning requires the right set of data that can be applied to a learning process.” Data preparation typically involves these tasks:

  • Select a sample subset of data. Make and track assumptions about the data to select attributes germane to the problem you want the algorithm to train for or solve. For example, filter or focus on types of product or customer data and eliminate data about where a product was manufactured.
  • Merge or join data sets to aggregate records. Merging simplifies the data and makes it easier to manage. For example, if there is a customer data set and a customer purchases data set, they could be condensed into a new, simpler attribute, such as customer spending per product.
  • Format and sort the data for modeling. Choose the format, such as a flat file or a relational database. Certain algorithms may require data to be sorted in a specific way. For example, fields for customers may be grouped by where the customer purchased or where they live. These textual location fields may need to be encoded as numbers and sorted numerically.
  • Clean the data by removing or replacing any blank or missing values. There are statistical analysis tools that can help inspect the data for errors and deviations. The goal is to ensure that data is exact, complete and relevant.
  • Normalize the data or adjust values that are measured on different scales to a common scale. For example, one data set may score numerically and another by a percentage. To compare the data, the values must be normalized to a common scale.
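
The normalization task above can be sketched in code. Here is a minimal min-max scaling sketch in Python; the function name and sample values are hypothetical:

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the common range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:  # all values identical; map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

# Two attributes measured on different scales:
# raw numeric scores and percentage scores.
scores = [200, 350, 500]
percentages = [40.0, 70.0, 100.0]

print(min_max_normalize(scores))       # [0.0, 0.5, 1.0]
print(min_max_normalize(percentages))  # [0.0, 0.5, 1.0]
```

After scaling, both attributes live on the same 0-to-1 scale and can be compared directly.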

Training the algorithm

Machine learning uses the prepared data to train a machine learning algorithm. An algorithm is a computerized procedure or recipe. When the algorithm is trained on the data, a machine learning model is generated. Selecting the right algorithm is essential to applying machine learning successfully. Selection is largely influenced by the application and the data available. But there are some commonly used algorithms and applications:

  • Regression algorithms
    Linear and logistic regression are examples of regression algorithms used to understand relationships in data. Linear regression is used to predict the value of a dependent variable based on the value of an independent variable. Logistic regression can be used when the dependent variable is binary in nature: A or B. With linear regression, for example, a salesperson’s annual sales (the dependent variable) can be predicted from independent variables such as education or years of experience.
  • Decision trees
    Decision trees use classification to make recommendations based on a set of decision rules. For example, betting on a horse to win, place or show could use data about the horse (age, winning percentage, pedigree) and the decision tree would apply rules to those factors to recommend an action or decision.
  • Instance-based algorithms
    A good example of an instance-based algorithm is k-nearest neighbor (k-NN). It uses classification to estimate how likely a data point is to be a member of one group or another based on its proximity to other data points.
  • Clustering algorithms
    Think of clusters as groups. Clustering focuses on identifying groups of similar records and labeling the records according to the group to which they belong. This is done without prior knowledge about the groups and their characteristics. Types of clustering algorithms include K-means, TwoStep and Kohonen clustering.
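
As a concrete illustration of the regression bullet above, here is a minimal sketch of simple linear regression in plain Python, using the closed-form least-squares formulas; the salesperson data points are invented for illustration:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: years of experience -> annual sales ($1000s).
experience = [1, 3, 5, 7, 9]
sales = [120, 180, 240, 300, 360]

slope, intercept = fit_line(experience, sales)
print(slope, intercept)       # 30.0 90.0
print(slope * 4 + intercept)  # predicted sales at 4 years: 210.0
```

Logistic regression follows the same weighted-sum pattern but passes the result through a sigmoid to produce a probability of class A versus B.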

Predicting and refining

Once the data is prepared and the algorithm trained, the machine learning model can make determinations or predictions about the data — on its own. For example:

Consider a data set that has two basic values for cars: weight and speed. Values can be plotted on a graph that shows light cars tend to be fast and heavy cars tend to be slow.

When the machine learning model is provided with data about cars, it uses the algorithm to determine or predict whether a car will tend to be fast or slow, or light or heavy. It does this without explicit human intervention. And the more data provided, the more the model learns and improves the accuracy of its predictions.
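
The car example above can be sketched as a tiny nearest-neighbor classifier: the model predicts whether a new car is fast or slow from the labels of its closest known neighbors. The weights and labels below are hypothetical:

```python
from collections import Counter

def knn_predict(train, new_weight, k=3):
    """Classify a car as 'fast' or 'slow' from the k nearest known weights."""
    # Sort training examples by distance to the new car's weight.
    nearest = sorted(train, key=lambda pair: abs(pair[0] - new_weight))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]  # majority vote

# Hypothetical training data: (weight in kg, speed class).
cars = [(900, "fast"), (1000, "fast"), (1100, "fast"),
        (1800, "slow"), (2000, "slow"), (2200, "slow")]

print(knn_predict(cars, 950))   # a light car -> "fast"
print(knn_predict(cars, 2100))  # a heavy car -> "slow"
```

Adding more labeled cars gives the model more neighbors to consult, which is one simple way more data improves predictions.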

Types of machine learning

Machine learning models fall into the following basic categories:

Supervised machine learning

Supervised machine learning uses sample data that is well classified and labeled. It’s supervised because it involves a set of feedback data that indicates whether the predictions based on the sample data are correct or incorrect. For example, a computer vision model can use clearly labeled (or tagged) and classified animal images to identify baboons. Because the data is well defined and feedback is provided, the model is able to refine its predictions based on the supervisory feedback telling it whether it is identifying baboons correctly or not.
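
The supervisory feedback described above boils down to comparing the model’s predictions with known labels and measuring how often they agree. A minimal sketch with hypothetical labels and predictions:

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the supervised labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical labeled image data: is each image a baboon or not?
true_labels = ["baboon", "baboon", "not", "baboon", "not"]
model_preds = ["baboon", "not",    "not", "baboon", "not"]

print(accuracy(model_preds, true_labels))  # 0.8
```

This score is the feedback signal: the training process adjusts the model to push it higher.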

Unsupervised machine learning

Unsupervised machine learning uses unlabeled data, usually in large amounts. Think of social media applications like Twitter or Instagram that generate vast amounts of unlabeled, unstructured data. Unsupervised learning algorithms can help gain meaningful information from this type of data by classifying it based on patterns or clusters. There is no feedback data to indicate whether classifications are correct or incorrect because the objective is to develop the classifications based on structures hidden in the data. A good example is email spam detection. Unsupervised learning can analyze an immense number of emails, uncover patterns and flag likely spam — without human intervention. Imagine how long it would take human analysts to do the same — and who would want to?
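
Here is a minimal sketch of how an algorithm can find groups without labels, using the core k-means idea mentioned in the clustering section; the data values and starting centers are hypothetical:

```python
def kmeans_1d(points, centers, iterations=10):
    """Simple 1-D k-means: alternately assign points to the nearest
    center, then move each center to the mean of its points."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest center.
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # Recompute each center as the mean of its assigned points.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled data with two natural groups.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centers, clusters = kmeans_1d(data, centers=[0.0, 5.0])
print(centers)  # two centers, near 1.0 and 9.1
```

No labels were supplied; the two groups emerge purely from structure hidden in the data.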

Reinforcement machine learning

Reinforcement machine learning is a behavioral learning model that is similar to supervised learning, but the algorithm isn’t trained using sample data. This model learns as it goes using trial and error. A sequence of successful outcomes is reinforced to develop the best recommendation or policy for a given problem. The IBM Watson® system that won the Jeopardy! challenge in 2011 is a good example. The system used reinforcement learning to decide whether to attempt an answer (or question, as it were), which square to select on the board and how much to wager — especially on daily doubles.
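
The trial-and-error idea can be sketched with a value table that rewards reinforce over time, as in tabular Q-learning. This toy example is not how Watson worked; it is a generic sketch in which an agent on a five-square line learns that moving right, toward a reward, is the best policy:

```python
import random

random.seed(0)

# States 0..4; reward only for reaching state 4. Actions: 0 = left, 1 = right.
N_STATES, ACTIONS = 5, (0, 1)
q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-value table

alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for _ in range(200):  # episodes of trial and error
    state = 0
    while state != N_STATES - 1:
        # Explore occasionally; otherwise take the best-known action.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = 0 if q[state][0] > q[state][1] else 1
        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Reinforce: nudge the value toward reward + discounted future value.
        q[state][action] += alpha * (
            reward + gamma * max(q[next_state]) - q[state][action])
        state = next_state

# After training, "move right" scores higher than "move left" in every state.
print(all(q[s][1] > q[s][0] for s in range(N_STATES - 1)))  # True
```

The successful sequence (always moving right) ends up with the highest values, which is exactly the reinforced policy.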

Deep learning

Deep learning is a method of machine learning that incorporates layers of neural networks, which are commonly trained with supervised learning. Think of deep learning as layers of machine learning. Deep learning models are designed to emulate how the human brain works. They typically require large amounts of data, on which the successive layers run iterations and continually adjust and improve outcomes. To return to the computer vision example, a deep learning model can teach itself to identify images, emulating human vision that has been trained over a lifetime of seeing and understanding visual information.
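
The layering idea can be sketched as successive transformations: each layer computes weighted sums of its inputs and applies a nonlinearity. Here is a toy forward pass in plain Python; the weights below are picked by hand rather than learned:

```python
def relu(x):
    """Common nonlinearity: pass positives through, clamp negatives to zero."""
    return max(0.0, x)

def layer(inputs, weights, biases):
    """One dense layer: a weighted sum per neuron, then the nonlinearity."""
    return [relu(sum(w * i for w, i in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# A 2-input -> 3-neuron -> 1-output network with hand-picked weights.
hidden = layer([0.5, 0.2],
               weights=[[1.0, 0.5], [-0.5, 1.0], [0.25, 0.25]],
               biases=[0.0, 0.1, 0.0])
output = layer(hidden, weights=[[1.0, 1.0, 1.0]], biases=[0.0])
print(round(output[0], 3))  # 0.825
```

In a real deep learning model, training repeatedly adjusts the weights in every layer to reduce prediction error, rather than fixing them by hand.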

Cloud machine learning

Hybrid cloud machine learning tools like IBM Watson® Machine Learning enable businesses and developers to quickly implement, integrate and scale machine learning technology. They are cloud-based services that can ease demand on computing and development resources. Users connect to the services through application programming interfaces (APIs) and use them to integrate machine learning into their applications.

Why is machine learning important?

One reason that machine learning is important is its growing prevalence in society and everyday life. Examples abound:

  • Recommendations of what to watch on Amazon Prime Video and Netflix
  • Ads and messages that appear online
  • Voice assistants like Siri or Alexa
  • The emergence of self-driving vehicles
  • Character recognition and facial detection, just to name a few

The same capabilities that help Amazon know what you like to watch — and Siri recommend where to eat — are influencing how businesses, governments and other organizations operate and perform.

“The value is straightforward,” explain Hurwitz and Kirsch, “if you use the most appropriate and constantly changing data sources in the context of machine learning, you have the opportunity to predict the future.” They highlight the need for a high volume of quality data as a condition for machine learning’s value. Fortunately, there is plenty of it — some 2.5 quintillion bytes produced each day. Machine learning’s ability to learn from data helps human analysts and decision makers:

  • Consume and better understand big data
  • Discover relationships in data to inspire insights and create opportunities
  • Identify anomalies and solve problems
  • Anticipate outcomes and make better decisions

Machine learning applications

Machine learning uses three basic capabilities:

  • Classification — dividing objects into two or more classes
  • Regression — discovering relationships between variables
  • Clustering — grouping objects by similar characteristics

Applying these capabilities is often the domain of data science and data scientists. As data explodes, the need for professionals focused on harnessing and gaining value from vast volumes of data has become critical. Data scientists are working with other business professionals, from software developers to marketing specialists, to apply machine learning techniques in innovative ways:

Natural language processing

Natural language processing (NLP) uses machine learning text classification to enable computers to understand human language the way people do. It supports applications like sentiment analysis in social media, information extraction from text volumes too large for humans to efficiently analyze, and speech recognition applications like the front ends of GPS navigation systems. AI consultant Max Kelsen used natural language processing to help a government agency identify citizen concerns from millions of records with 97 percent accuracy.
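
At its simplest, text classification for sentiment analysis scores a document against word weights that a model would learn from labeled examples. A deliberately tiny, hypothetical sketch:

```python
# Hypothetical word weights a model might learn from labeled reviews.
weights = {"great": 2.0, "love": 1.5, "slow": -1.0, "terrible": -2.0}

def sentiment(text):
    """Sum the weights of known words; a positive total means positive sentiment."""
    score = sum(weights.get(word, 0.0) for word in text.lower().split())
    return "positive" if score > 0 else "negative"

print(sentiment("I love this great product"))           # positive
print(sentiment("Terrible support and slow shipping"))  # negative
```

Production NLP systems learn far richer representations than a word-weight table, but the classify-by-score principle is the same.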

Image recognition and computer vision

Image recognition and computer vision use advanced deep learning and neural network technologies to help computers “see.” These machine learning-based techniques can be found in everything from optical character readers to autonomous vehicles. They can be found in more entertaining, but still data-rich, settings too. IBM used computer vision and machine learning to personalize the Masters for golf fans and pick out the best highlights at Wimbledon and the US Open for tennis fans.

Cyber security

Machine learning cyber security and artificial intelligence (AI) are used in applications to safeguard data and resist, minimize or prevent cyber attacks. Machine learning can help extract intelligence from reports, blogs and alerts to surface critical threats, advise analysts and accelerate response. IBM helped a cyber security operations center reduce threat investigation and root cause determination from three hours to three minutes. IBM Research is actually developing machine learning malware to better understand how cybercriminals are using AI to their advantage.

Predictive analytics

Predictive analytics takes advantage of machine learning’s ability to classify, find relationships and group similar characteristics to generate forecasts and support human decision making. Red Eléctrica de España, an energy supplier on the isolated Canary Islands, is working with machine learning and predictive analytics to forecast weather patterns and make critical decisions about energy supply and demand.

Marketing and chatbots

Machine learning marketing applications range from enhancing customer experience to targeting revenue streams to developing new products. A prime example is how machine learning chatbots are popping up all over the web. Chatbots and other virtual assistants use machine learning to understand natural language and select optimum responses to customer and prospect queries. Software creator Go Moment used IBM Watson machine learning capabilities to build Ivy, a smart texting solution. Ivy enables guests and hotels to communicate seamlessly, capitalizing on staff expertise and cognitive intelligence to deliver a better guest experience.

How to build a chatbot for your business

Build, deploy, and optimize chatbots quickly and efficiently.

Machine learning for industries

Machine learning is used across a range of industries.

  • Financial Services and Banking. Banks and financial services firms are looking to machine learning technologies to make sense of their data to uncover investment opportunities, predict the ability of applicants to repay loans, gain advantages in customer service, and help detect and prevent fraud.
  • Healthcare. Machine learning is being used in healthcare to conduct patient data analysis, gain insights into diagnosis and treatments, and achieve cost reduction.
  • Medicine. Machine learning supports medical diagnosis, radiology, drug discovery and imaging efforts, often with the application of computer vision and image recognition.
  • Education. Machine learning in education is part of AI applications that can personalize content for students, enhance early childhood vocabulary development, deliver AI-based tutoring and more.
  • Manufacturing. Machine learning in manufacturing is becoming more mainstream and driving Industry 4.0. It often takes advantage of cameras, sensors and other Internet of Things (IoT) technologies to detect flaws and improve quality and efficiency.
  • Economics. IBM and The Atlantic report that economists, data scientists and analysts are using machine learning to build models that are bigger, more inclusive and diverse to improve economic forecasts and analysis.
  • Retail. Machine learning helps retailers improve the customer experience by personalizing interactions. It is also applied to optimize forecasting and supply chain planning.
  • Automotive. In the automotive industry machine learning helps optimize the supply chain, accelerate manufacturing, use IoT to detect defects in production lines and more. It is also a key capability in the development of self-driving cars and other autonomous vehicles.

A brief history of machine learning

The coining of the term machine learning is credited to Arthur Samuel in 1959: “Field of study that gives computers the ability to learn without being explicitly programmed.” Professor Samuel was an AI pioneer and employee at IBM. He developed a program that learned how to play checkers better than he could.

Around that time, AI research was concentrated on what is referred to as strong AI — enabling computers to function or perform tasks as a human would. Machine learning became prominent in the 1980s when, after years of showing little progress with strong AI, the focus shifted to narrower problems. Machine learning was seen as a good approach to building models that could perform reliable predictions within specific domains.

Deep learning and the exploitation of neural networks in machine learning appeared in the 2000s, empowering machine learning to address more complex problems in a wider range of domains. Presently, machine learning is enabling systems to naturally interact with humans and is part of cognitive computing, which IBM defines as systems that learn at scale, reason with purpose and naturally interact with humans.

Future of machine learning

The future of machine learning can be brought into perspective through the key ingredient of data.

“Faced with a constant onslaught of data, we needed a new type of system that learns and adapts, and we now have that with AI,” says Arvind Krishna, Senior Vice President of Cloud and Cognitive Software, IBM Research. “What was deemed impossible a few years ago is not only becoming possible, it’s very quickly becoming necessary and expected.”

The ability to harness and learn from data is part of a three-point equation for future innovation that includes:

  • Big data — as it floods in from the web, IoT, smartphones, transactions and almost countless other sources
  • Powerful computing processors like graphical processing units (GPUs) that can process the data
  • Machine learning and deep learning as means to harvest the data, gain insight and drive innovation

Looking forward, experts from IBM and the AI field see a different machine learning and deep learning emerging from this equation:

“While deep learning is here to stay, it will likely look different in the next wave of AI breakthroughs. Experts stress the need to become much more efficient at training deep learning models to apply them at scale across increasingly more complex and diverse tasks. The path to this efficiency will be led in part by ‘small data’ and the use of more unsupervised learning.”

A focus on small data will help alleviate cost and time constraints associated with big data, while still delivering confident predictions. It promises to flip the data variable “on its head” with “small datasets overtaking big data as drivers of new AI innovation.”

Unsupervised learning reduces the need for labeling data, which is laborious for humans and can introduce human error. Unsupervised learning, however, has its drawbacks, chiefly its limitations in addressing practical applications. The experts say: “The next wave of AI innovation will likely be fueled by deep learning models trained using a method that lies somewhere between supervised and unsupervised learning.”

Challenges of machine learning

Successfully applying machine learning has its challenges. Here are a few to look out for:

  • Bias can be introduced into machine learning through the data it's trained on and can distort predictions or outcomes. For example, a computer vision model developed to support both daytime and nighttime applications would be biased (and ineffective) if it were trained only on daytime images. Bias can include gender, racial and other forms of unwanted discrimination, too. There are toolkits available to help avoid bias, like the AI Fairness 360 Open Source Toolkit. IBM Watson OpenScale™ can also detect and mitigate bias in AI models during production to help ensure fair outcomes.
  • Black box describes the situation in which human decision makers lack confidence in machine learning predictions or recommendations because they don’t understand how the model reached those conclusions. They don’t trust it. This can be avoided with tools such as a human-in-the-loop model or IBM Watson OpenScale, which explain AI outcomes in simple business language.
  • Overfitting and underfitting, in simple terms, refer to when a machine learning model either learns too much — includes too much information from the data and distorts outcomes — or learns too little — doesn’t include enough data for the model to be reliable. Avoiding overfitting and underfitting depends a great deal on the application but can usually be traced to how the data is prepared and which model is used.
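
Overfitting is easiest to see by comparing performance on training data with performance on held-out data. In this hypothetical sketch, a model that simply memorizes its training points scores perfectly on them but fails on anything new:

```python
def train_memorizer(examples):
    """'Train' by memorizing every (input, label) pair exactly."""
    table = dict(examples)
    # Unseen inputs fall back to a blind default guess.
    return lambda x: table.get(x, "unknown")

train = [(1, "fast"), (2, "fast"), (3, "slow"), (4, "slow")]
test = [(5, "slow"), (6, "slow")]  # held-out data the model never saw

model = train_memorizer(train)
train_acc = sum(model(x) == y for x, y in train) / len(train)
test_acc = sum(model(x) == y for x, y in test) / len(test)
print(train_acc, test_acc)  # 1.0 on training data, 0.0 on new data
```

The large gap between training and held-out accuracy is the classic symptom of overfitting; a well-fit model performs similarly on both.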

Learn more about machine learning

Machine Learning for Dummies

Cover the basics, make sense of machine learning algorithms and build a data science team.

AI and machine learning courses

Take courses on everything from AI concepts to building, deploying and managing machine learning models.

AI Research

Learn more about IBM research into AI and machine learning.

Machine learning videos

Discover how you can create, evaluate and deploy a machine learning model without writing code in this simple walkthrough.

Developer resources for AI and machine learning

Get started with code patterns, tools and resources to accelerate your AI development.

IBM Data Science Community

Read about data science topics like machine learning and get involved in data science discussions and events.

Machine learning solutions

IBM Watson Machine Learning

By simplifying, accelerating and governing AI deployments, Watson Machine Learning helps organizations harness machine learning, deep learning and decision optimization to deliver business value.

IBM Watson Studio

An on-premises, private or public cloud solution that provides a collaborative machine-learning platform for teams to explore, model and deploy data solutions, using the top open-source tools.

IBM Machine Learning for z/OS®

An on-premises machine-learning solution that extracts hidden value from enterprise data. Quickly ingest and transform data to create, deploy and manage high-accuracy self-learning models, using IBM Z® data.

IBM Watson Explorer

A machine learning-powered content analytics and cognitive search platform that provides users with access to actionable insights from data.

IBM SPSS® Modeler

A graphical analytics platform for users of all skill levels to deploy insights at scale with a wide range of algorithms and capabilities such as text analytics, geospatial analysis and optimization.

AI Consulting

Create intelligent workflows that utilize AI and data to enhance both employee and customer experiences.

IBM Watson OpenScale

Manage production AI with trust and confidence in outcomes.