November 12, 2020 By Eda Kavlakoglu 4 min read

While natural language processing (NLP), natural language understanding (NLU), and natural language generation (NLG) are all related topics, they are distinct ones. At a high level, NLU and NLG are just components of NLP. Given how they intersect, they are commonly confused within conversation, but in this post, we’ll define each term individually and summarize their differences to clarify any ambiguities.

What is natural language processing?

Natural language processing, which evolved from computational linguistics, uses methods from various disciplines, such as computer science, artificial intelligence, linguistics, and data science, to enable computers to understand human language in both written and verbal forms. While computational linguistics has more of a focus on aspects of language, natural language processing emphasizes its use of machine learning and deep learning techniques to complete tasks, like language translation or question answering. Natural language processing works by taking unstructured data and converting it into a structured data format. It does this through the identification of named entities (a process called named entity recognition) and identification of word patterns, using methods like tokenization, stemming, and lemmatization, which examine the root forms of words. For example, the suffix -ed on a word, like called, indicates past tense, but it has the same base infinitive (to call) as the present tense verb calling.

While a number of NLP algorithms exist, different approaches tend to be used for different types of language tasks. For example, hidden Markov chains tend to be used for part-of-speech tagging. Recurrent neural networks help to generate the appropriate sequence of text. N-grams, a simple language model (LM), assign probabilities to sentences or phrases to predict the accuracy of a response. These techniques work together to support popular technology such as chatbots, or speech recognition products like Amazon’s Alexa or Apple’s Siri. However, its application has been broader than that, affecting other industries such as education and healthcare.

What is natural language understanding?

Natural language understanding is a subset of natural language processing, which uses syntactic and semantic analysis of text and speech to determine the meaning of a sentence. Syntax refers to the grammatical structure of a sentence, while semantics alludes to its intended meaning. NLU also establishes a relevant ontology: a data structure which specifies the relationships between words and phrases. While humans naturally do this in conversation, the combination of these analyses is required for a machine to understand the intended meaning of different texts.
Our ability to distinguish between homonyms and homophones illustrates the nuances of language well. For example, let’s take the following two sentences:

  1. Alice is swimming against the current.
  2. The current version of the report is in the folder.

In the first sentence, the word, current is a noun. The verb that precedes it, swimming, provides additional context to the reader, allowing us to conclude that we are referring to the flow of water in the ocean. The second sentence uses the word current, but as an adjective. The noun it describes, version, denotes multiple iterations of a report, enabling us to determine that we are referring to the most up-to-date status of a file.

These approaches are also commonly used in data mining to understand consumer attitudes. In particular, sentiment analysis enables brands to monitor their customer feedback more closely, allowing them to cluster positive and negative social media comments and track net promoter scores. By reviewing comments with negative sentiment, companies are able to identify and address potential problem areas within their products or services more quickly.

What is natural language generation?

Natural language generation is another subset of natural language processing. While natural language understanding focuses on computer reading comprehension, natural language generation enables computers to write. NLG is the process of producing a human language text response based on some data input. This text can also be converted into a speech format through text-to-speech services.

NLG also encompasses text summarization capabilities that generate summaries from in-put documents while maintaining the integrity of the information. Extractive summarization is the AI innovation powering Key Point Analysis used in That’s Debatable.

Initially, NLG systems used templates to generate text. Based on some data or query, an NLG system would fill in the blank, like a game of Mad Libs. But over time, natural language generation systems have evolved with the application of hidden Markov chains, recurrent neural networks, and transformers, enabling more dynamic text generation in real time.

As with NLU, NLG applications need to consider language rules based on morphology, lexicons, syntax and semantics to make choices on how to phrase responses appropriately. They tackle this in three stages:

  • Text planning: During this stage, general content is formulated and ordered in a logical manner.
  • Sentence planning: This stage considers punctuation and text flow, breaking out content into paragraphs and sentences and incorporating pronouns or conjunctions where appropriate.
  • Realization: This stage accounts for grammatical accuracy, ensuring that rules around punctation and conjugations are followed. For example, the past tense of the verb run is ran, not runned.

NLP vs NLU vs. NLG summary

  • Natural language processing (NLP) seeks to convert unstructured language data into a structured data format to enable machines to understand speech and text and formulate relevant, contextual responses. Its subtopics include natural language processing and natural language generation.
  • Natural language understanding (NLU) focuses on machine reading comprehension through grammar and context, enabling it to determine the intended meaning of a sentence.
  • Natural language generation (NLG) focuses on text generation, or the construction of text in English or other languages, by a machine and based on a given dataset.

Infuse your data for AI

Natural language processing and its subsets have numerous practical applications within today’s world, like healthcare diagnoses or online customer service.

Explore some of the latest NLP research at IBM or take a look at some of IBM’s product offerings, like Watson Natural Language Understanding. Its text analytics service offers insight into categories, concepts, entities, keywords, relationships, sentiment, and syntax from your textual data to help you respond to user needs quickly and efficiently. Help your business get on the right track to analyze and infuse your data at scale for AI.

Get started with IBM Watson Natural Language Understanding.

Was this article helpful?

More from Artificial intelligence

IBM unveils Cloud Pak for Data 5.0

7 min read - Today’s modern technology landscape is experiencing an explosion of data. Organizations need to be able to trust and access this data to generate meaningful insights. Enter IBM Cloud Pak® for Data 5.0, the newest release of the cloud-native insight platform that integrates the tools needed to collect, organize and analyze data within a data fabric architecture. IBM Cloud Pak for Data 5.0 enhances users’ data strategies by including these new features Immersive Experience: Customers can now streamline their IT and day 2 operations with…

How IBM and AWS are partnering to deliver the promise of responsible AI

4 min read - The artificial intelligence (AI) governance market is experiencing rapid growth, with the worldwide AI software market projected to expand from USD 64 billion in 2022 to nearly USD 251 billion by 2027, reflecting a compound annual growth rate (CAGR) of 31.4% (IDC). This growth underscores the escalating need for robust governance frameworks that ensure AI systems are transparent, fair and comply with increasing regulatory demands. In this expanding market, IBM® and Amazon Web Services (AWS) have strategically partnered to address…

Reimagine data sharing with IBM Data Product Hub

3 min read - We are excited to announce the launch of IBM® Data Product Hub, a modern data sharing solution designed to accelerate data-driven outcomes across your organization. Today, we're making this product generally available to our clients across the world, following its announcement at the IBM Think conference in May 2024. Data sharing has become the lifeblood of modern organizations, fueling growth and driving innovation. But traditional approaches to data sharing can often be a bottleneck constricting the seamless sharing of data.…

IBM Newsletters

Get our newsletters and topic updates that deliver the latest thought leadership and insights on emerging trends.
Subscribe now More newsletters