Natural language processing, or NLP, combines computational linguistics—rule-based modeling of human language—with statistical and machine learning models to enable computers and digital devices to recognize, understand and generate text and speech.
A branch of artificial intelligence (AI), NLP lies at the heart of applications and devices that can
often in real time. Today most people have interacted with NLP in the form of voice-operated GPS systems, digital assistants, speech-to-text dictation software, customer service chatbots, and other consumer conveniences. But NLP also plays a growing role in enterprise solutions that help streamline and automate business operations, increase employee productivity, and simplify mission-critical business processes.
Use this model selection framework to choose the most appropriate model while balancing your performance requirements with cost, risks and deployment needs.
Register for the white paper on AI governance
Human language is filled with ambiguities that make it incredibly difficult to write software that accurately determines the intended meaning of text or voice data. Homonyms, homophones, sarcasm, idioms, metaphors, grammar and usage exceptions, variations in sentence structure—these just a few of the irregularities of human language that take humans years to learn, but that programmers must teach natural language-driven applications to recognize and understand accurately from the start, if those applications are going to be useful.
Several NLP tasks break down human text and voice data in ways that help the computer make sense of what it's ingesting. Some of these tasks include the following:
See the blog post “NLP vs. NLU vs. NLG: the differences between three natural language processing concepts” for a deeper look into how these concepts relate.
The all new enterprise studio that brings together traditional machine learning along with new generative AI capabilities powered by foundation models.
The Python programing language provides a wide range of tools and libraries for attacking specific NLP tasks. Many of these are found in the Natural Language Toolkit, or NLTK, an open source collection of libraries, programs, and education resources for building NLP programs.
The NLTK includes libraries for many of the NLP tasks listed above, plus libraries for subtasks, such as sentence parsing, word segmentation, stemming and lemmatization (methods of trimming words down to their roots), and tokenization (for breaking phrases, sentences, paragraphs and passages into tokens that help the computer better understand the text). It also includes libraries for implementing capabilities such as semantic reasoning, the ability to reach logical conclusions based on facts extracted from text.
The earliest NLP applications were hand-coded, rules-based systems that could perform certain NLP tasks, but couldn't easily scale to accommodate a seemingly endless stream of exceptions or the increasing volumes of text and voice data.
Enter statistical NLP, which combines computer algorithms with machine learning and deep learning models to automatically extract, classify, and label elements of text and voice data and then assign a statistical likelihood to each possible meaning of those elements. Today, deep learning models and learning techniques based on convolutional neural networks (CNNs) and recurrent neural networks (RNNs) enable NLP systems that 'learn' as they work and extract ever more accurate meaning from huge volumes of raw, unstructured, and unlabeled text and voice data sets.
For a deeper dive into the nuances between these technologies and their learning approaches, see “AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What’s the Difference?”
Natural language processing is the driving force behind machine intelligence in many modern real-world applications. Here are a few examples:
Accelerate the business value of artificial intelligence with a powerful and flexible portfolio of libraries, services and applications.
Infuse powerful natural language AI into commercial applications with a containerized library designed to empower IBM partners with greater flexibility.
Learn the fundamental concepts for AI and generative AI, including prompt engineering, large language models and the best open source projects.
Learn about different NLP use cases in this NLP explainer.
Visit the IBM Developer's website to access blogs, articles, newsletters and more. Become an IBM partner and infuse IBM Watson embeddable AI in your commercial solutions today. BM Watson NLP Library for Embed into your solutions.
Watch IBM Data & AI GM, Rob Thomas as he hosts NLP experts and clients, showcasing how NLP technologies are optimizing businesses across industries.
Ethical considerations for AI have never been more critical than they are today.
IBM has launched a new open-source toolkit, PrimeQA, to spur progress in multilingual question-answering systems to make it easier for anyone to quickly find information on the web.