Text classification

You can use the text classification API to check whether the data in a document matches the key-value pair format that is defined in the schemas for various document types.

By pre-processing the document, you can quickly verify whether it is classified into one of the pre-defined schemas or a custom schema without performing key-value pair extraction, which can be a longer, resource-intensive process. You can then decide which schema to use to correctly extract text into fields in a key-value pair format.

Compatibility and specifications

Cloud platforms

Supported input file types

You can classify text from documents in different languages, or from a document that has a mix of multiple languages. Classify text from the following file types:

Supported document formats:

File type      | Extension    | Text classification support
PDF file       | .pdf         | Yes
PowerPoint     | .pptx, .ppt  | Yes
Word document  | .docx, .doc  | Yes
Excel          | .xlsx        | No

Supported image formats:

File type               | Extension     | Text classification support
BMP image               | .bmp          | Yes
HEIC/HEIF image         | .heic, .heif  | Yes
JFIF image              | .jfif         | Yes
JPEG image              | .jpeg, .jpg   | Yes
PNG image               | .png          | Yes
TIFF image/multi-image  | .tif, .tiff   | Yes

Other supported formats:

File type  | Extension  | Text classification support
HTML       | .html      | Yes
Markdown   | .md        | No
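The support tables in this section can be turned into a quick pre-flight check before a file is submitted for classification. The following is a minimal sketch, not part of the API: the names SUPPORTED_EXTENSIONS, UNSUPPORTED_EXTENSIONS, and is_classifiable are hypothetical helpers, and the extension sets are copied from the tables on this page.

```python
from pathlib import PurePath

# Extensions marked "Yes" in the support tables on this page.
SUPPORTED_EXTENSIONS = frozenset({
    ".pdf", ".pptx", ".ppt", ".docx", ".doc",          # document formats
    ".bmp", ".heic", ".heif", ".jfif", ".jpeg",        # image formats
    ".jpg", ".png", ".tif", ".tiff",
    ".html",                                           # other formats
})

# Extensions explicitly marked "No" in the same tables.
UNSUPPORTED_EXTENSIONS = frozenset({".xlsx", ".md"})


def is_classifiable(file_name: str) -> bool:
    """Return True if the file extension is listed as supported (hypothetical helper)."""
    return PurePath(file_name).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check is case-insensitive, so a file named invoice.PDF is treated the same as invoice.pdf. Extensions that are not listed in either set are rejected, which is a conservative choice for this sketch.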
Note: The text classification API cannot process key-value pair data in XLSX documents, so you cannot use it to classify them.

Supported storage types

You can store your input documents in the following connected storage types:

  • IBM Cloud Object Storage

  • Amazon S3

  • Any generic Amazon S3-compatible storage

  • Box

  • IBM watsonx.data SharePoint

  • IBM FileNet P8

    Note: The IBM FileNet P8 connection is available only in the Toronto data center and for a managed cloud service provider (MCSP).

For details about how to create a connection to the various types of data stores in your project, see Connectors for watsonx.ai.

Supported foundation models

The text classification API is certified to use the mistral-small-3-1-24b-instruct-2503 model for key-value pair classification.

You can also use alternative models that can process visual input and return responses in JSON format, such as:

  • llama-4-maverick-17b-128e-instruct-fp8
  • mistral-medium-2505

For foundation model details, see Supported foundation models.

Supported languages

The text classification API has been tested with English-language documents. The API is expected to work with other languages that are supported by the underlying vision language model.

Ways to work

You must generate credentials to authenticate with watsonx.ai APIs. For details, see Generating a bearer token.
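As an illustration of the credential step, the following sketch builds the request that exchanges an IBM Cloud API key for a bearer token. It assumes the standard IBM Cloud IAM token endpoint and API key grant type; other deployment types may use a different token service, so treat the URL as an assumption and follow Generating a bearer token for your environment. The function names build_token_request and fetch_bearer_token are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: IBM Cloud IAM token endpoint. Other deployments may differ.
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the form-encoded request that exchanges an API key for a token."""
    form = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=form,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def fetch_bearer_token(api_key: str) -> str:
    """Send the request and return the access token from the JSON response."""
    with urllib.request.urlopen(build_token_request(api_key), timeout=30) as resp:
        return json.load(resp)["access_token"]
```

Separating request construction from the network call keeps the request easy to inspect and test without contacting the token service.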

You can classify text from documents stored in your watsonx.ai project with these programmatic methods:

REST API

You can classify text from files in IBM watsonx.ai programmatically by using the text classification method of the watsonx.ai REST API.

For details about how to customize a text classification request, see Text classification parameters.

For API method details, see the watsonx.ai API reference documentation.
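A REST call might be assembled along the following lines. This is a sketch under stated assumptions, not the documented request: the endpoint path in CLASSIFY_PATH, the version query parameter, and the body field names (project_id, document_reference, connection, location, file_name, parameters) are assumptions that are not confirmed by this page, and build_classification_request is a hypothetical helper. Check every name against the watsonx.ai API reference documentation before use.

```python
import json
import urllib.parse
import urllib.request

# Assumption: path of the text classification method. Verify in the API reference.
CLASSIFY_PATH = "/ml/v1/text/classifications"


def build_classification_request(base_url, version, token, project_id,
                                 connection_id, file_name, parameters=None):
    """Build (but do not send) a classification request for a file in connected storage."""
    # Assumed body layout: the document is referenced through a connection asset.
    body = {
        "project_id": project_id,
        "document_reference": {
            "type": "connection_asset",
            "connection": {"id": connection_id},
            "location": {"file_name": file_name},
        },
    }
    # Optional settings are passed through unchanged; see Text classification parameters.
    if parameters:
        body["parameters"] = parameters

    query = urllib.parse.urlencode({"version": version})
    url = f"{base_url.rstrip('/')}{CLASSIFY_PATH}?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen. The base URL, version date, project ID, and connection ID are values that you supply from your own watsonx.ai project.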

Python

You can classify text from files in IBM watsonx.ai programmatically by using the Python library.

See the TextClassification class of the watsonx.ai Python library.

Node.js

You can classify text from files in IBM watsonx.ai programmatically by using the Node.js SDK. For more information, see the following resources:

Learn more