How to get started with Natural Language Processing

The Data & AI Landscape

Digitized data is everywhere, and it’s growing in volume every day, from 500 million Tweets and 300 billion emails sent to 70 billion WhatsApp messages and close to 5 billion searches. According to Visual Capitalist, the digital universe is expected to reach 44 zettabytes by 2020. Beliefs, ideas, opinions, stories, concepts, and more are expressed in human language through a virtually uncountable number of conduits.

With such a volume of textual data created every day, there are countless insights that can be extracted to inform critical decisions in almost any industry or business. News articles, financial reports, and other sources of content can be compiled and analyzed for sentiment around company stocks. Transcripts from call centers can be analyzed to surface comments and complaints about a service or product. All of this is possible with the help of natural language processing (NLP).

Introduction to how I work with Natural Language Processing

I’m a product manager for Watson Natural Language Understanding (NLU), IBM’s NLP service. NLP is a massive space within artificial intelligence (AI), and more enterprises are integrating NLP technologies into their existing platforms every day. As a product manager for an AI offering, I am tasked with uncovering where the gaps are in the market and what opportunities exist to benefit customers. The ultimate goal is to create a unique and effective solution with developers, and to bring that solution to market.

Almost every week, customers ask how to better integrate AI into their services, what potential NLP holds for their business, and how to put NLU to work for them. To address some of these questions, I am producing a series of blog posts that dive deep into what NLP is, how it can be used, and examples of NLP technologies embedded in solutions that solve real-world problems.

So what is natural language processing?

With petabytes of textual data available each day, companies are trying to figure out how they can structure the data, clean it, and garner deeper insights from it. These steps can be streamlined into a valuable, cost-effective, and easy-to-use process. Natural language processing is the parsing and semantic interpretation of text, allowing computers to learn, analyze, and understand human language. With NLP comes a set of tools that can slice data from many different angles. NLP can provide insights on the entities and concepts within an article, the sentiment and emotion in a tweet, or even the classification of a support ticket. Hundreds of types of information can be extracted from textual data, and enterprises can leverage this information to better understand customer behavior and improve internal efficiency.

Watson Natural Language Understanding (NLU) is IBM’s NLP service for text analytics. Our easy-to-use APIs offer insight into categories, concepts, entities, keywords, relationships, sentiment, and syntax from your textual data.
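
To make "easy-to-use" a little more concrete, here is a minimal sketch of what a request for a few of these features might look like. It assumes the request body is a JSON object with a text field plus a features object keyed by feature name; the helper name build_analyze_request and the limit values are hypothetical, and nothing is sent over the network. Treat it as an illustration and check the current API reference for the exact fields.

```python
import json


def build_analyze_request(text, entity_limit=5, keyword_limit=5):
    """Assemble a JSON-serializable body asking for entities, keywords, and sentiment.

    Hypothetical helper for illustration: it only builds a dictionary
    and does not call any service.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "text": text,
        "features": {
            # Option values are illustrative assumptions, not recommendations.
            "entities": {"limit": entity_limit},
            "keywords": {"limit": keyword_limit},
            "sentiment": {},
        },
    }


if __name__ == "__main__":
    body = build_analyze_request("The new billing page is confusing and slow.")
    print(json.dumps(body, indent=2))
```

The point of the sketch is that a single request can ask for several kinds of insight at once by listing them under features.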

Why isn’t everyone using NLP?

While NLP is an extremely powerful tool to use in existing platforms, it is still relatively young on the overall technology adoption curve. This mirrors the state of AI adoption inside organizations today. A recent Wall Street Journal article, citing a study from International Data Corp., reported that only 18% of 2,473 organizations worldwide had AI models in production, while 16% were in the proof-of-concept stage and 15% were experimenting with AI.

I, too, have witnessed this behavior firsthand in many customer interactions. Oftentimes, customers get excited by the hype around AI but are unsure how to put AI to full use in their organizations.

At IBM, we’ve found that natural language processing is the starting point for AI-based projects, for two primary reasons:

1) Existing data: Almost every organization has some corpus of textual data waiting to be mined for insights. Further, this body of data will continue to grow, and as organizations’ business strategies evolve, it will be essential to their vision.

2) Quick MVP & ROI: With easy-to-use API calls, developers can immediately start using and getting value from the technology. Developers can create a quick MVP with little-to-no previous experience working with NLP technology.

For example, in less than a day I recently built a ticket triaging system that sorts through and analyzes hundreds of customer support tickets, using Watson Natural Language Understanding (NLU) and Watson Natural Language Classifier (NLC).
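
The idea behind ticket triage can be sketched without any service at all. The toy example below routes tickets to a queue by counting keyword matches. It is a stand-in for a trained classifier, not a description of how Watson NLU or NLC work, and the queue names, keywords, and the triage function are all made up for illustration.

```python
# Hypothetical queues and trigger words, invented for this sketch.
QUEUE_KEYWORDS = {
    "billing": {"invoice", "charge", "refund", "payment"},
    "technical": {"error", "crash", "bug", "login"},
    "account": {"password", "email", "profile", "cancel"},
}


def triage(ticket_text, default_queue="general"):
    """Return the queue whose keywords appear most often in the ticket."""
    # Lowercase each word and trim surrounding punctuation before matching.
    words = [w.strip(".,!?;:()\"'").lower() for w in ticket_text.split()]
    scores = {
        queue: sum(1 for w in words if w in keywords)
        for queue, keywords in QUEUE_KEYWORDS.items()
    }
    # On a tie, the queue listed first in QUEUE_KEYWORDS wins.
    queue, hits = max(scores.items(), key=lambda item: item[1])
    return queue if hits > 0 else default_queue


if __name__ == "__main__":
    print(triage("I was charged twice, please refund my payment."))
```

A real system would replace the keyword table with a classifier trained on labeled tickets, but the shape stays the same: text in, queue out.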

Now what?

Now that you’ve learned a little bit more about natural language processing, check out Watson NLU and Watson NLC, and construct your own MVP with the data you have.

Stay tuned for next month’s NLP blog!