Deep Learning

Deep learning uses multi-layered neural networks, loosely modeled on the human brain, to build systems that learn to identify objects and perform complex tasks with increasing accuracy and minimal human intervention.

What is deep learning?

Deep learning is a subset of machine learning in which multi-layered neural networks—modeled to work like the human brain—'learn' from large amounts of data. Within each layer of the neural network, deep learning algorithms perform calculations and make predictions repeatedly, progressively 'learning' and gradually improving the accuracy of the outcome over time.

In the same way that the human brain absorbs and processes information entering the body through the five senses, deep learning ingests information from multiple data sources and analyzes it in real time.

Deep learning drives many artificial intelligence (AI) applications and services that improve automation, performing analytical and physical tasks without human intervention. Deep learning technology lies behind everyday products and services (such as digital assistants, voice-enabled TV remotes, and credit card fraud detection) as well as emerging technologies (such as self-driving cars).

Deep learning vs. machine learning

If deep learning is a subset of machine learning, how do they differ? In the simplest terms, what sets deep learning apart from the rest of machine learning is the data it works with and how it learns.

While all machine learning can work with and learn from structured, labeled data, deep learning can also ingest and process unstructured, unlabeled data. Instead of relying on labels within the data to identify and classify objects and information, deep learning uses a multi-layered neural network to extract the features from the data and get better and better at identifying and classifying data on its own. 

For example, the voice-to-text applications of a decade ago (which users had to train by speaking scores of words to the application and, in the process, label their own voice data) are examples of machine learning. Today’s voice recognition applications (including Apple's Siri, Amazon Alexa, and Google Assistant), which can recognize anyone’s voice commands without a specific training session, are examples of deep learning.

In more technical terms, while all machine learning models are capable of supervised learning (which requires humans to label the training data), deep learning models are also well suited to unsupervised learning. They can detect previously undetected features or patterns in data that isn't labeled, with the barest minimum of human supervision. Deep learning models can also be trained through reinforcement learning, a distinct learning process (neither supervised nor unsupervised) in which the model 'learns' to become more accurate based on reward or penalty feedback on its previous actions.

For a deeper dive on the nuanced differences between the different technologies, see "AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What’s the Difference?"

How deep learning works

Deep learning neural networks (called deep neural networks) are modeled on the way scientists believe the human brain works. They process and reprocess data, gradually refining the analysis and results to accurately recognize, classify, and describe objects within the data.

Deep neural networks consist of multiple layers of interconnected nodes, each of which uses a progressively more complex deep learning algorithm to extract and identify features and patterns in the data. They then calculate the likelihood or confidence that the object or information can be classified or identified in one or more ways.
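The "likelihood or confidence" described above is often produced by applying a softmax function to the raw scores of the final layer. The following is a minimal sketch in plain Python, not a reference implementation. The class labels and scores are hypothetical values chosen only for illustration.

```python
import math

def softmax(scores):
    """Convert raw output-layer scores into probabilities that sum to 1."""
    # Subtracting the largest score first keeps exp() numerically stable.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical labels and raw scores (assumptions, not from the article).
labels = ["cat", "dog", "bird"]
raw_scores = [2.0, 1.0, 0.1]

confidences = softmax(raw_scores)
best = labels[confidences.index(max(confidences))]

for label, p in zip(labels, confidences):
    print(f"{label}: {p:.3f}")
print("predicted:", best)
```

The highest raw score always maps to the highest confidence, so the predicted class is unchanged. Softmax only rescales the scores so they can be read as probabilities.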

The input and output layers of a deep neural network are called visible layers. The input layer is where the deep learning model ingests the data for processing, and the output layer is where the final identification, classification, or description is calculated.

In between the input and output layers are hidden layers where the calculations of each previous layer are weighted and refined by progressively more complex algorithms to zero in on the final outcome. This movement of calculations through the network is called forward propagation.
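Forward propagation can be sketched in a few lines of plain Python. The example below assumes a tiny, hypothetical network (3 inputs, 2 hidden nodes, 1 output node) with hand-picked weights and a sigmoid activation. None of these choices come from the article, and production models use optimized libraries rather than loops like these.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One layer: each node takes a weighted sum of its inputs, adds a bias,
    and applies the activation function."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs

def forward_propagation(inputs, layers):
    """Feed the data through every layer in order, input to output."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical weights and biases, chosen only for illustration.
layers = [
    ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1]),  # hidden layer (2 nodes)
    ([[0.6, -0.3]], [0.05]),                              # output layer (1 node)
]

prediction = forward_propagation([1.0, 0.5, -1.5], layers)
print(prediction)
```

Each layer's output becomes the next layer's input, which is why the hidden layers can build on the calculations of the layers before them.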

Another process called backpropagation measures the error in the calculated predictions, works out how much each weight and bias contributed to that error, and passes this information back through the previous layers so that the weights and biases can be adjusted to train or refine the model. Together, forward propagation and backpropagation allow the network to make predictions about the identity or class of the object while learning from inconsistencies in the outcomes. The result is a system that learns as it works and gets more efficient and accurate over time when processing large amounts of data.
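The interplay of the two processes can be illustrated with a small training loop. This is a sketch under stated assumptions, not a definitive implementation: the network (2 inputs, 2 hidden nodes, 1 output), the starting weights, the learning rate, and the toy dataset (the logical OR function) are all hypothetical choices made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, params):
    """Forward propagation: return the hidden activations and the output."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output

def mean_squared_error(data, params):
    return sum((forward(x, params)[1] - y) ** 2 for x, y in data) / len(data)

def train_step(data, params, learning_rate):
    """One pass over the data: forward propagation, then backpropagation."""
    w_hidden, b_hidden, w_out, b_out = params
    for x, target in data:
        hidden, output = forward(x, (w_hidden, b_hidden, w_out, b_out))
        # Error signal at the output node: gradient of half the squared
        # error, passed back through the sigmoid.
        delta_out = (output - target) * output * (1.0 - output)
        # Push the error back to each hidden node through its outgoing weight.
        delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(len(hidden))]
        # Adjust every weight and bias a small step against its gradient.
        w_out = [w - learning_rate * delta_out * h for w, h in zip(w_out, hidden)]
        b_out = b_out - learning_rate * delta_out
        w_hidden = [[w - learning_rate * d * xi for w, xi in zip(ws, x)]
                    for ws, d in zip(w_hidden, delta_hidden)]
        b_hidden = [b - learning_rate * d for b, d in zip(b_hidden, delta_hidden)]
    return w_hidden, b_hidden, w_out, b_out

# Hypothetical toy dataset: the logical OR of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
# Hypothetical starting weights and biases.
params = ([[0.5, -0.3], [-0.2, 0.4]], [0.1, -0.1], [0.3, -0.6], 0.0)

loss_before = mean_squared_error(data, params)
for _ in range(2000):
    params = train_step(data, params, learning_rate=0.5)
loss_after = mean_squared_error(data, params)

print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

The error on the training data shrinks as the loop repeats, which is the "learning from inconsistencies in the outcomes" described above.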

The above describes the simplest type of deep neural network in the simplest terms. In practice, deep learning algorithms are incredibly complex. And many complex deep learning methods and models have been developed to solve certain types of problems, including the following examples:

  • Convolutional neural networks (CNNs), used primarily in computer vision applications, can detect features and patterns within a complex image and, ultimately, recognize specific objects within the image. In 2015, a CNN bested a human in an object recognition challenge for the first time.
  • Recurrent neural networks (RNNs) are used for deep learning models in which features and patterns change over time. Instead of ingesting and outputting data snapshots, RNNs ingest and output sequences of data. RNNs drive emerging applications such as speech recognition and driverless cars.
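The core operation behind each of these two model types can be sketched in plain Python. The example below is an illustration under assumptions, not how production CNNs or RNNs are written. The image, the hand-written edge kernel, and the RNN weights are hypothetical values, whereas a real network would learn its kernels and weights from data.

```python
import math

def convolve2d(image, kernel):
    """Slide a small kernel over a 2D image (no padding, stride 1).
    As in most CNN libraries, the kernel is applied without flipping."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k_rows) for j in range(k_cols))
             for c in range(out_cols)]
            for r in range(out_rows)]

def rnn_run(sequence, w_input, w_state, bias):
    """Process a sequence one step at a time, carrying a hidden state forward."""
    state = 0.0
    states = []
    for value in sequence:
        # The new state depends on the current input AND the previous state.
        state = math.tanh(w_input * value + w_state * state + bias)
        states.append(state)
    return states

# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]
# Hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]
feature_map = convolve2d(image, kernel)
print(feature_map)  # the middle column lights up where the edge is

# Hypothetical sequence: one input pulse followed by silence.
states = rnn_run([1.0, 0.0, 0.0], w_input=0.8, w_state=0.5, bias=0.0)
print(states)  # the state fades gradually instead of dropping to zero
```

The convolution responds only where the pattern it encodes appears in the image, while the recurrent state keeps a fading memory of earlier inputs, which is what lets an RNN work on sequences rather than snapshots.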

One of the best ways to improve your understanding of deep learning networks is through video that illustrates the way data moves through various deep learning models. An excellent series of videos is available here.

Deep learning applications

Real-world deep learning applications are a part of our daily lives, but in most cases, they are so well-integrated into products and services that users are unaware of the complex data processing that is taking place in the background. Some of these examples include the following:

Law enforcement

Deep learning algorithms can analyze and learn from transactional data to identify dangerous patterns that indicate possible fraudulent or criminal activity. Speech recognition, computer vision, and other deep learning applications can improve the efficiency and effectiveness of investigative analysis by extracting patterns and evidence from sound and video recordings, images, and documents, which helps law enforcement analyze large amounts of data more quickly and accurately.

Financial services

Financial institutions regularly use predictive analytics to drive algorithmic trading of stocks, assess business risks for loan approvals, detect fraud, and help manage credit and investment portfolios for clients.

Customer service

Many organizations incorporate deep learning technology into their customer service processes. Chatbots, which are used in a variety of applications, services, and customer service portals, are a straightforward form of AI. Traditional chatbots, commonly found in call center-like menus, use natural language and even visual recognition. However, more sophisticated chatbot solutions attempt to determine, through learning, if there are multiple responses to ambiguous questions. Based on the responses it receives, the chatbot then tries to answer these questions directly or hand the conversation over to a human agent.

Virtual assistants like Apple's Siri, Amazon Alexa, or Google Assistant add a third dimension to the chatbot concept by combining deep learning capabilities with the underlying technology. These data science innovations allow for speech recognition and customized responses, resulting in a personalized experience for the users.


Healthcare

The healthcare industry has benefited greatly from deep learning capabilities ever since the digitization of hospital records and images. Image recognition applications can support medical imaging specialists and radiologists, helping them analyze and assess more images in less time.

Deep learning hardware requirements

Deep learning requires a tremendous amount of computing power. High-performance graphics processing units (GPUs) are ideal because they can handle a large volume of calculations across many cores with copious memory available. However, managing multiple GPUs on-premises can create a large demand on internal resources and be incredibly costly to scale.

Cloud-based field-programmable gate arrays (FPGAs) are an elegant solution for offloading big data and machine learning resources to a cloud service provider. FPGAs can accelerate deep learning network performance to help reduce latency when used with connected services. And FPGA instances can be deployed and decommissioned on demand, for cost-saving scalability and elasticity.

Deep learning and IBM

For decades now, IBM has been a pioneer in the development of AI technologies and deep learning, highlighted by the development of IBM Watson. One of the earliest accomplishments in deep learning technology, Watson is now a trusted solution for enterprises looking to apply advanced natural language processing and machine learning techniques to their systems using a proven tiered approach to AI adoption and implementation.

Watson uses the Apache Unstructured Information Management Architecture (UIMA) framework and IBM’s DeepQA software to make powerful deep learning capabilities available to applications. Utilizing tools like IBM Watson Studio and Watson Machine Learning, your enterprise can seamlessly bring your open-source AI projects into production while deploying and running your models on any cloud.

For more information on how to get started with deep learning technology, explore IBM Watson Studio.

Sign up for an IBMid and create your IBM Cloud account.