Classification
At a glance
The classification task encapsulates algorithms for document classification: classifying the input text into one or more of a predetermined set of labels.
The task offers implementations of strong classification algorithms from three families: classic ML, deep learning, and transformers. It supports multi-label and multi-class problems, as well as their special cases: single-label multi-class tasks and binary classification tasks.
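The difference between single-label and multi-label problems shows up in how confidence scores are turned into a final decision. The following is a minimal sketch, not part of the Watson NLP API: the function names, the example scores, and the 0.5 threshold are assumptions chosen for illustration.

```python
from typing import Dict, List


def predict_single_label(scores: Dict[str, float]) -> str:
    """Single-label (multi-class) decision: return the highest-scoring label."""
    return max(scores, key=scores.get)


def predict_multi_label(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Multi-label decision: return every label whose score reaches the threshold.

    The threshold value is an assumption; it is normally tuned per data set.
    Labels are returned in descending order of confidence.
    """
    kept = [label for label, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda label: scores[label], reverse=True)


# Hypothetical scores for one document.
example_scores = {"frustrated": 0.7, "sad": 0.6, "polite": 0.1}
single = predict_single_label(example_scores)
multi = predict_multi_label(example_scores)
```

Binary classification is the single-label case with exactly two labels, so the same `predict_single_label` helper applies.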
Class definitions |
---|
watson_nlp.blocks.classification.bert.BERT |
watson_nlp.blocks.classification.transformer.Transformer |
watson_nlp.workflows.classification.ensemble.Ensemble |
BERT is a transformer-based architecture built for multi-class and multi-label text classification on short texts. It uses Multilingual BERT pretrained models.
Transformer is a transformer-based architecture built for multi-class and multi-label text classification on short texts. It uses BERT and RoBERTa pretrained models.
Ensemble is a weighted ensemble of SVM and CNN algorithms; it computes the weighted mean of a set of classification predictions using confidence scores.
SVM is a support vector machine classifier, which may be trained using the predictions of any input embedding or vectorization task as feature vectors, e.g., USE embeddings or TF-IDF vectorizers. SVM supports multi-class and multi-label text classification and produces confidence scores via Platt scaling.
CNN is a simple convolutional network architecture built for multi-class and multi-label text classification on short texts. CNN uses GloVe embeddings.
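The weighted-mean combination described for Ensemble can be sketched as follows. This is an illustration of the general technique only, not the Watson NLP implementation: the function name, the per-model weights, and the example scores are assumptions.

```python
from typing import Dict, List


def weighted_mean_ensemble(
    predictions: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Combine per-model confidence scores into one weighted mean per label.

    predictions: one {label: confidence} mapping per model.
    weights: one non-negative weight per model (hypothetical values).
    A label missing from a model's output is treated as confidence 0.0.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction set")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    labels = set().union(*predictions)
    return {
        label: sum(w * p.get(label, 0.0) for p, w in zip(predictions, weights)) / total
        for label in sorted(labels)
    }


# Hypothetical outputs of two base classifiers for the same document.
svm_scores = {"frustrated": 0.8, "sad": 0.2}
cnn_scores = {"frustrated": 0.6, "sad": 0.4}
combined = weighted_mean_ensemble([svm_scores, cnn_scores], weights=[1.0, 3.0])
```

With these assumed weights the CNN scores count three times as much as the SVM scores, so the combined confidence for each label lies closer to the CNN value.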
Pretrained models
Model names are listed below. For language support, see Supported languages.
Model ID | Container Image |
---|---|
ensemble-workflow | |
classification_ensemble-workflow_lang_en_tone-stock | cp.icr.io/cp/ai/watson-nlp_classification_ensemble-workflow_lang_en_tone-stock:1.4.1 |
classification_ensemble-workflow_lang_fr_tone-stock | cp.icr.io/cp/ai/watson-nlp_classification_ensemble-workflow_lang_fr_tone-stock:1.4.1 |
transformer | |
classification_transformer_lang_multilingual_slate.153m.distilled.tone | cp.icr.io/cp/ai/watson-nlp_classification_transformer_lang_multilingual_slate.153m.distilled.tone:1.4.1 |
classification_transformer_lang_multilingual_slate.153m.distilled.tone-cpu | cp.icr.io/cp/ai/watson-nlp_classification_transformer_lang_multilingual_slate.153m.distilled.tone-cpu:1.4.1 |
The models have been tested on data from news reports and general web pages.
Running models
The Classification model request accepts the following fields:
Field | Type | Required / Optional / Repeated | Description |
---|---|---|---|
raw_document | watson_core_data_model.nlp.RawDocument | required | The input document on which to perform Classification predictions |
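As a sketch of how the single required field maps onto a REST call, the snippet below builds (but does not send) a request using only the Python standard library. The endpoint URL and header names are taken from the REST example on this page; the helper name `build_classification_request` is an assumption made for illustration.

```python
import json
import urllib.request

# Endpoint as used in the REST example on this page (local runtime assumed).
ENDPOINT = (
    "http://localhost:8080/v1/watson.runtime.nlp.v1/NlpService/ClassificationPredict"
)


def build_classification_request(text: str, model_id: str) -> urllib.request.Request:
    """Build a POST request whose body carries the required raw_document field."""
    body = json.dumps({"raw_document": {"text": text}}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "accept": "application/json",
            "content-type": "application/json",
            # The model to run is selected through this metadata header.
            "Grpc-Metadata-mm-model-id": model_id,
        },
        method="POST",
    )


request = build_classification_request(
    "I hate school. School is bad.",
    "classification_ensemble-workflow_lang_en_tone-stock",
)
```

Passing `request` to `urllib.request.urlopen` would send it, provided a runtime with the named model is listening on the assumed host and port.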
Example requests
REST API
curl -s \
"http://localhost:8080/v1/watson.runtime.nlp.v1/NlpService/ClassificationPredict" \
-H "accept: application/json" \
-H "content-type: application/json" \
-H "Grpc-Metadata-mm-model-id: classification_ensemble-workflow_lang_en_tone-stock" \
-d '{ "raw_document": { "text": "I hate school. School is bad." } }'
Response
{"classes":[
{"className":"frustrated", "confidence":0.74309075},
{"className":"sad", "confidence":0.20021306},
{"className":"impolite", "confidence":0.07343281},
{"className":"excited", "confidence":0.029446114},
{"className":"sympathetic", "confidence":0.02796789},
{"className":"polite", "confidence":0.016257437},
{"className":"satisfied", "confidence":0.01131451}],
"producerId":{
"name":"Voting based Ensemble",
"version":"0.0.1"
}
}
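A JSON response of this shape can be post-processed with the standard library. The sketch below is illustrative: the helper name `top_classes` and the `min_confidence` cutoff are assumptions, and the sample string is a shortened copy of the response shown above.

```python
import json
from typing import List, Tuple


def top_classes(response_text: str, min_confidence: float = 0.0) -> List[Tuple[str, float]]:
    """Return (className, confidence) pairs at or above min_confidence,
    sorted by descending confidence."""
    payload = json.loads(response_text)
    classes = payload.get("classes", [])
    ranked = sorted(classes, key=lambda c: c["confidence"], reverse=True)
    return [
        (c["className"], c["confidence"])
        for c in ranked
        if c["confidence"] >= min_confidence
    ]


# Shortened copy of the response above, used as sample input.
sample_response = (
    '{"classes":['
    '{"className":"sad","confidence":0.20021306},'
    '{"className":"frustrated","confidence":0.74309075}],'
    '"producerId":{"name":"Voting based Ensemble","version":"0.0.1"}}'
)
confident = top_classes(sample_response, min_confidence=0.5)
ranked_all = top_classes(sample_response)
```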
Python
import grpc
from watson_nlp_runtime_client import (
    common_service_pb2,
    common_service_pb2_grpc,
    syntax_types_pb2,
)

# Open an unencrypted gRPC channel to the local runtime.
channel = grpc.insecure_channel("localhost:8085")
stub = common_service_pb2_grpc.NlpServiceStub(channel)

# Wrap the input text in the required raw_document field.
request = common_service_pb2.ClassificationRequest(
    raw_document=syntax_types_pb2.RawDocument(text="I hate school. School is bad."),
)

# The model to run is selected through the mm-model-id metadata entry.
response = stub.ClassificationPredict(
    request,
    metadata=[("mm-model-id", "classification_ensemble-workflow_lang_en_tone-stock")],
)
print(response)
Response
classes {
  class_name: "frustrated"
  confidence: 0.743090749
}
classes {
  class_name: "sad"
  confidence: 0.20021306
}
classes {
  class_name: "impolite"
  confidence: 0.0734328106
}
classes {
  class_name: "excited"
  confidence: 0.0294461139
}
classes {
  class_name: "sympathetic"
  confidence: 0.0279678907
}
classes {
  class_name: "polite"
  confidence: 0.0162574369
}
classes {
  class_name: "satisfied"
  confidence: 0.0113145104
}
producer_id {
  name: "Voting based Ensemble"
  version: "0.0.1"
}