Watson Natural Language Processing Library for Embed home

The Watson Natural Language Processing (NLP) Library for Embed provides natural language processing functions for syntax analysis and pretrained models for a wide variety of text processing tasks, all in a fully embeddable library.

With Watson NLP, you can turn unstructured data into structured data, which makes the data easier to understand and to transfer, particularly if you are working with a mix of unstructured and structured data. Examples of such data are call center records, customer complaints, social media posts, and problem reports. The unstructured data is often part of a larger data record that includes columns with structured data. Extracting meaning and structure from the unstructured data and combining this information with the data in the structured columns gives you a deeper understanding of the input data and can help you make better business decisions.

Why Watson NLP?

There are numerous open source NLP libraries - more than 20 the last time we checked!

The popularity of open source NLP libraries changes every 2-3 years, as this graphic shows:

Figure: Open source NLP libraries popularity over time. Mentions of specific NLP libraries in books (Source).

To retain leadership in NLP, IBM believes it needs an NLP library that is:

Built on top of the best AI open source software
Competitive with the best open source NLP libraries
Equipped with high-quality built-in models for a large variety of languages

Watson NLP has been designed with the above considerations in mind: it exposes core NLP capabilities through standard interfaces, isolating developers from churn in the AI OSS space, while continuously expanding and enhancing the underlying core NLP capabilities.

What is in this offering?

IBM Watson NLP for Embed delivers IBM's state-of-the-art Watson NLP library wrapped in a container image, with cloud APIs ready to serve NLP models at internet scale. It also includes access to an extensive catalog of pretrained models that are ready to run many NLP tasks.

The Watson NLP Runtime image

The Watson NLP Runtime image contains generated proto-API files for all the Watson NLP tasks. These tasks are served by a gRPC runtime with dynamic, multi-model support; gRPC is a cross-platform, open source, high-performance Remote Procedure Call framework. For quick demos and ease of use, the proto-API files and the gRPC runtime are wrapped in a REST gateway.

NLP Runtime container image overview
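
As a rough illustration of how a client might talk to the REST gateway described above, the following Python sketch builds a JSON POST request with the standard library. Everything specific in it is an assumption made for illustration and is not stated on this page: the host and port, the task path, the header that selects the model, the model ID, and the payload field names. Check the API reference of your installed runtime for the actual values before relying on any of them.

```python
import json
import urllib.request

# Assumptions (not taken from this page): host, port, task path, header name,
# model ID, and payload field names are illustrative placeholders only.
GATEWAY_URL = "http://localhost:8080"
TASK_PATH = "/v1/watson.runtime.nlp.v1/NlpService/SyntaxPredict"
MODEL_ID_HEADER = "grpc-metadata-mm-model-id"
MODEL_ID = "syntax_izumo_lang_en_stock"


def build_request(text: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the REST gateway."""
    payload = {"rawDocument": {"text": text}, "parsers": ["token"]}
    return urllib.request.Request(
        url=GATEWAY_URL + TASK_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            MODEL_ID_HEADER: MODEL_ID,  # selects which loaded model serves the call
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request to a running container and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the request; call send(req) once a runtime
    # container with a loaded model is actually listening.
    req = build_request("Watson NLP turns text into structured data.")
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the sketch runnable without a container, and makes it easy to inspect the URL, headers, and body before pointing it at a real deployment.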

Watson NLP Pretrained Model Images

The Watson NLP Runtime image on its own doesn't have any models included. Pretrained models are available as images in the IBM Entitled Registry, and can be consumed by the Watson NLP Runtime in a few different deployment modes.

Each model image comes with a .zip archive of a pretrained model, along with an unpacking script that can extract the model to a local directory or to remote S3 storage.
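
To make the local-directory case of the unpacking step concrete, here is a minimal Python sketch that extracts a .zip archive into a target directory. It is not the unpacking script shipped with the images; the function name `unpack_model`, the archive name, and the directory layout are hypothetical, and the S3 case is not covered.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack_model(archive: Path, target_dir: Path) -> list[str]:
    """Extract a model .zip archive into target_dir and return its member names.

    Hypothetical helper for illustration; not the script bundled with the images.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()


if __name__ == "__main__":
    # Build a tiny stand-in archive so the sketch runs without a real model.
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "example_model.zip"  # hypothetical file name
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("example_model/config.yml", "name: example\n")
        names = unpack_model(archive, Path(tmp) / "models")
        print(names)
```

In a deployment, the target directory would typically be the location from which the runtime loads its models, for example a volume shared between containers.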

The available stock models are listed in the Accessing the files section, and are described in more detail in the Working with NLP models section.