Watson Natural Language Processing library
The Watson Natural Language Processing library provides natural language processing functions for syntax analysis and pre-trained models for a wide variety of text processing tasks, such as sentiment analysis, keyword extraction, and classification. The Watson Natural Language Processing library is available for Python only.
With Watson Natural Language Processing, you can turn unstructured data into structured data, which makes the data easier to understand and to transfer, in particular if you are working with a mix of unstructured and structured data. Examples of such data are call center records, customer complaints, social media posts, and problem reports. The unstructured data is often part of a larger data record that includes columns with structured data. Extracting meaning and structure from the unstructured data and combining this information with the data in the structured columns:
- Gives you a deeper understanding of the input data
- Can help you make better decisions
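To make the idea of combining structured columns with information extracted from free text concrete, here is a minimal sketch in plain Python. The record fields, the word lists, and the `toy_sentiment` helper are all hypothetical stand-ins and are not part of the Watson Natural Language Processing API; in practice, a pre-trained model would take the place of the helper.

```python
# Hypothetical call center records: structured columns plus a free-text field.
records = [
    {"ticket_id": 101, "product": "router",
     "text": "The router keeps dropping the connection, very frustrating."},
    {"ticket_id": 102, "product": "modem",
     "text": "Setup was quick and the support agent was great."},
]

# Toy word lists for illustration only (NOT a Watson NLP model).
NEGATIVE = {"frustrating", "dropping", "broken"}
POSITIVE = {"great", "quick", "helpful"}


def toy_sentiment(text):
    """Label text by counting words from the toy lists above."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def enrich(record):
    """Return a copy of the record with a new structured 'sentiment' column."""
    return {**record, "sentiment": toy_sentiment(record["text"])}


enriched = [enrich(r) for r in records]
for row in enriched:
    print(row["ticket_id"], row["product"], row["sentiment"])
```

Once the label sits next to the structured columns, the enriched records can be filtered or aggregated like any other tabular data, for example by counting negative tickets per product.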
Watson Natural Language Processing provides pre-trained models in over 20 languages. They are curated by a dedicated team of experts and evaluated for quality on each specific language. You can use these pre-trained models in production environments without having to worry about license or intellectual property infringement.
Although you can create your own models, the easiest way to get started with Watson Natural Language Processing is to run the pre-trained models on unstructured text to perform language processing tasks.
Some examples of language processing tasks available in Watson Natural Language Processing pre-trained models:
- Language detection: detect the language of the input text
- Syntax: tokenization, lemmatization, part of speech tagging, and dependency parsing
- Entity extraction: find mentions of entities (like person, organization, or date)
- Noun phrase extraction: extract noun phrases from the input text
- Text classification: analyze text and then assign a set of pre-defined tags or categories based on its content
- Sentiment classification: classify the input document as positive, negative, or neutral
- Tone classification: classify the tone in the input document (like excited, frustrated, or sad)
- Emotion classification: classify the emotion of the input document (like anger or disgust)
- Keyword extraction: extract noun phrases that are relevant in the input text
- Relations: detect relations between two entities
- Hierarchical categories: assign individual nodes within a hierarchical taxonomy to the input document
- HAP detection: identify hateful, abusive, and profane content (HAP content) in texts
- Embeddings: map individual words or larger text snippets into a vector space
Watson Natural Language Processing encapsulates natural language functionality through blocks and workflows. Blocks and workflows support functions to load, run, train, and save a model.
For more information, refer to Working with pre-trained models.
Some examples of how you can use the Watson Natural Language Processing library:
Running syntax analysis on a text snippet:
import watson_nlp
# Load the syntax model for English
syntax_model = watson_nlp.load('syntax_izumo_en_stock')
# Run the syntax model and print the result
syntax_prediction = syntax_model.run('Welcome to IBM!')
print(syntax_prediction)
Extracting entities from a text snippet:
import watson_nlp
# Load the multilingual entity-mentions workflow
entities_workflow = watson_nlp.load('entity-mentions_transformer-workflow_multilingual_slate.153m.distilled')
# Run the workflow on English text and print the detected mentions
entities = entities_workflow.run("IBM's CEO Arvind Krishna is based in the US", language_code="en")
print(entities.get_mention_pairs())
For more examples of how to use the Watson Natural Language Processing library, refer to Watson Natural Language Processing library usage samples.
Using Watson Natural Language Processing in a notebook
To use Natural Language Processing, you need a set of pre-trained Natural Language Processing models. See Specifying additional installation options for default Runtime for Python.
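Before loading a model in a notebook, it can be useful to confirm that the library is importable in the current runtime. The following is a minimal sketch that uses only the Python standard library. The helper name `is_installed` is illustrative, and the check says nothing about which pre-trained models are present in the environment.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("watson_nlp"):
    print("watson_nlp is available in this environment")
else:
    print("watson_nlp was not found; check the runtime's installation options")
```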
The standard Runtime environments might not be large enough to run notebooks that use the prebuilt models. For example, to run the Syntax and Sentiment models, you need an environment with 1 vCPU and 4 GB RAM. To work with models that require more computing power, you must create a custom environment template of type Default (only CPU) or GPU.
When you create this template, consider the following:
- The environment must have at least 4 GB of memory and one of the following software versions:
  - Runtime 24.1 on Python 3.11
  - JupyterLab with Runtime 24.1 on Python 3.11
- Use environment type Default or GPU. When you create a custom template, the GPU option is available only if the Jupyter Notebooks with Python for GPU service is installed.
GPU environments are not available by default. For details, see GPU environments.
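As a quick sanity check from inside a running notebook, the memory and Python version requirements listed above can be compared against the current environment. This is a rough sketch under stated assumptions: the function names are illustrative, `os.sysconf` is available only on Unix-like systems, and in a containerized runtime the reported value may reflect the host's memory rather than the limit assigned to the environment.

```python
import os
import sys

MIN_MEMORY_GB = 4           # minimum memory stated in the requirements
REQUIRED_PYTHON = (3, 11)   # Python version paired with Runtime 24.1


def total_memory_gb():
    """Return physical memory in GB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are not available on this platform.
        return None
    return page_size * page_count / 1024**3


def meets_requirements(memory_gb, python_version):
    """Check a memory figure and a (major, minor) version against the minimums."""
    if memory_gb is None:
        return False
    return (memory_gb >= MIN_MEMORY_GB
            and tuple(python_version[:2]) == REQUIRED_PYTHON)


memory = total_memory_gb()
ok = meets_requirements(memory, sys.version_info)
print(f"memory_gb={memory}, python={sys.version_info[:2]}, meets_requirements={ok}")
```

Keeping `meets_requirements` separate from the code that measures memory makes it possible to reuse the check with a value obtained another way, such as the size shown in the environment template.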
Learn more
- Creating your own environment template
- Deploying Natural Language Processing models in Watson Machine Learning