Creating custom schemas for key-value pair extraction

Create JSON schemas to extract specific fields from structured documents with the text extraction API.

To build a custom schema for a document, define the schema metadata and write an effective description for each field that you want to extract. Then validate and scale the schema for accurate key-value pair extraction.

Before you begin

Review your document and determine the following information that will guide the field names and descriptions you define in the schema:

The types of data you want to extract from the document
The exact labels for the data you want to extract
The location of each item on the page, such as upper-left header or right-hand column

For example, note the following information in the California Personal Auto Insurance Application document:

Screenshot of an auto application form PDF with several fields including Contact Name and Phone

Data you want to extract, such as Agency Name, Applicant Address, Carrier Name, and Policy Number.
The exact field labels, such as “AGENCY”, “APPLICANT’S NAME AND MAILING ADDRESS”, and “POLICY #”. Quoting the labels as they appear in the document helps the foundation model connect each label to the correct value.

Procedure

In the metadata at the top of your schema, provide a description for your document in the document_description field. The document_description field is included in the classifier prompt for the foundation model used for key-value pair extraction.

Important: Make the description specific by including keywords to help the classification model correctly identify the document. For example, use keywords such as “California,” “Auto,” and “Application” in the description for the *California Personal Auto Insurance Application* document.

Use the `additional_prompt_instructions` parameter to provide guidance that the foundation model can apply to the entire page in the document.

{
   "document_type": "Auto_Insurance_Application",
   "document_description": "California Personal Auto Application form used to open or update an auto policy.",
   "additional_prompt_instructions": "Return phone numbers exactly as they appear in the document."
}
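
One way to avoid hand-editing mistakes in the metadata, such as a stray trailing comma, is to assemble it in a script and let a JSON library serialize it. The following Python sketch does that for the metadata shown above and also checks that the description contains the keywords mentioned earlier. The variable names are illustrative and are not part of the API.

```python
import json

# Schema metadata copied from the example above.
schema_metadata = {
    "document_type": "Auto_Insurance_Application",
    "document_description": (
        "California Personal Auto Application form used to open or update an auto policy."
    ),
    "additional_prompt_instructions": (
        "Return phone numbers exactly as they appear in the document."
    ),
}

# Keywords that the description is expected to mention (an illustrative list).
keywords = ["California", "Auto", "Application"]
description = schema_metadata["document_description"].lower()
missing_keywords = [word for word in keywords if word.lower() not in description]

# json.dumps always emits strict JSON, so the serialized metadata
# cannot contain a trailing comma.
metadata_json = json.dumps(schema_metadata, indent=2)
print(metadata_json)
print("Missing keywords:", missing_keywords)
```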

Determine which fields to include in the schema. For each field, define the following three elements:
- Field name: Choose a unique key name for the field.
- Example value: Provide a sample value to help the model infer the expected type such as a date value or an integer. Supplying an example improves model performance.
- Description: Write a brief explanation of what the field represents. The description is passed to the foundation model to help the model understand what to look for during the extraction process.
  Important: The field description provides context that helps the model validate and focus on the correct information in the document. The description must be accurate and unambiguous.
For example, define a field to extract the agency name from the California Personal Auto Insurance Application document.
```
"agency_name": {
  "default": "",
  "example": "Spring Insurance",
  "description": "Name of the insurance agency shown in the Agency section (upper‑left of the page)."
}
```
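
If you define many fields, a small helper can keep the three elements consistent. The following Python sketch builds field definitions in the shape shown above and rejects duplicate names, empty descriptions, and names that do not use lowercase words separated by underscores. The `add_field` helper and its checks are hypothetical conveniences, not part of the text extraction API.

```python
import re

# Lowercase words separated by underscores, for example "agency_name".
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def add_field(fields, name, example, description):
    """Add one field definition to `fields` and return the updated dict."""
    if not SNAKE_CASE.fullmatch(name):
        raise ValueError(f"field name {name!r} is not lowercase_with_underscores")
    if name in fields:
        raise ValueError(f"field name {name!r} is already defined")
    if not description.strip():
        raise ValueError(f"field {name!r} needs a description")
    fields[name] = {"default": "", "example": example, "description": description}
    return fields


fields = {}
add_field(
    fields,
    "agency_name",
    "Spring Insurance",
    "Name of the insurance agency shown in the Agency section (upper-left of the page).",
)
print(fields)
```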
Specify additional attribute values in your schema with the available_options parameter. Use the parameter for a field that is not explicitly mentioned in the document, but can be deduced from the context or visual elements. For example, in invoices, currency values may appear in various parts of the document with a dollar sign, but may not explicitly mention that the US dollar is the currency of the invoice. In such cases, you can provide a closed list of valid currency values the model can return and reduce hallucinations in the model response.
```
"currency": {
  "default": "",
  "example": "USD",
  "available_options": ["USD", "EUR", "CNY", "JPY", "GBP", "AUD", "CAD", "CHF", "HKD", "SGD", "INR", "KRW", "MXN", "BRL", "ZAR", "SEK", "NOK", "DKK", "NZD", "TRY", "AED", "THB", "PLN", "IDR", "MYR", "PHP", "RUB", "CZK", "ILS"],
  "description": "The currency used in the invoice."
}
```
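
A closed list also gives you something to check results against on your side. The following Python sketch is a hypothetical post-processing step, not a description of how the service behaves: it confirms that the example value of a field is one of its `available_options` and tests whether an extracted value falls inside the list. The option list is shortened for readability.

```python
currency_field = {
    "default": "",
    "example": "USD",
    "available_options": ["USD", "EUR", "GBP"],  # shortened list for illustration
    "description": "The currency used in the invoice.",
}


def is_allowed(field, value):
    """Return True if the field has no closed list or the value is in the list."""
    options = field.get("available_options")
    return options is None or value in options


# The example value should itself be one of the valid options.
example_is_valid = is_allowed(currency_field, currency_field["example"])

print(example_is_valid)
print(is_allowed(currency_field, "USD"))
print(is_allowed(currency_field, "US dollars"))
```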
Optional: Validate your JSON schema locally before you use it in your text extraction request to make sure that it is well-formed and matches the expected structure. You can use tools such as:
- jsonlint.com to check formatting
- A Python script to load and inspect the schema
- Your IDE’s built-in JSON linter
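
As a rough illustration of the Python option, the following sketch parses a schema and reports what it finds. The lists of expected keys are assumptions taken from the examples on this page, not an authoritative statement of what the API requires, and `check_schema` is a hypothetical helper name.

```python
import json

# Keys that the examples on this page include (assumed, not authoritative).
EXPECTED_METADATA = ("document_type", "document_description", "fields")
EXPECTED_FIELD_KEYS = ("default", "example", "description")


def check_schema(schema_text):
    """Return a list of problems found in a schema supplied as JSON text."""
    try:
        schema = json.loads(schema_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(schema, dict):
        return ["top level must be a JSON object"]

    problems = [f"missing metadata key: {key}"
                for key in EXPECTED_METADATA if key not in schema]
    fields = schema.get("fields", {})
    if not isinstance(fields, dict):
        return problems + ["'fields' must be a JSON object"]
    for name, definition in fields.items():
        if not isinstance(definition, dict):
            problems.append(f"field {name!r} must be a JSON object")
            continue
        for key in EXPECTED_FIELD_KEYS:
            if key not in definition:
                problems.append(f"field {name!r} is missing {key!r}")
    return problems


good_schema = """{
  "document_type": "Auto_Insurance_Application",
  "document_description": "California Personal Auto Application form.",
  "fields": {
    "agency_name": {"default": "", "example": "Spring Insurance",
                    "description": "Name of the insurance agency."}
  }
}"""
print(check_schema(good_schema))
```

An empty list means that the schema parsed and contains the expected keys. A trailing comma or other syntax error is reported as a single "not valid JSON" problem.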

Custom schema and API request example

The following command submits a request to extract text by using a complete custom schema that includes all required metadata at the top, followed by a set of fields with accompanying definitions. Each field contains a default value that is empty, an example, and a description to guide the model during the extraction process.

curl -X POST \
  'https://{region}.ml.cloud.ibm.com/ml/v1/text/extractions?version=2024-10-18' \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer eyJraWQiOi...'

The request body is as follows:

{
    "project_id": "e40e5895-ce4d-42a3-b699-8ac764b89a09",
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "5c0cefce-da57-408b-b47d-58f7785de3ee"
      },
      "location": {
        "bucket":"my-cloud-object-storage-bucket",
        "file_name": "ca_auto_insurance_app.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "5c0cefce-da57-408b-b47d-58f7785de3ee"
      },
      "location": {
        "bucket":"my-cloud-object-storage-bucket",
        "file_name": "results_data"
      }
    },
    "parameters": {
      "requested_outputs": [
        "assembly",
        "md",
        "html",
        "plain_text",
        "page_images"
      ],
      "languages": [
        "en"
      ],
      "mode": "standard",
      "ocr_mode": "enabled",
      "create_embedded_images": "disabled",
      "semantic_config": {
        "schemas": [ {
           "document_type": "Auto_Insurance_Application",
           "document_description": "A California Personal Auto Application form used to collect information necessary for initiating or updating an auto insurance policy. It includes agency, applicant, carrier, and policy details such as contact information, address, policy number, and effective/expiration dates.",
           "additional_prompt_instructions": "Return phone numbers and policy numbers exactly as they appear in the document.",
           "fields": {
              "agency_name": {
                "default": "",
                "example": "Spring Insurance",
                "description": "Name of the insurance agency handling the auto application."
              },
              "applicant_name": {
                "default": "",
                "example": "John Smith",
                "description": "Full name of the person applying for auto insurance."
              },
              "applicant_address": {
                "default": "",
                "example": "245 W 52nd St, Apt 8B, New York, NY 10019",
                "description": "Mailing address of the applicant including street, apartment, city, state, and ZIP code."
              },
              "applicant_phone": {
                "default": "",
                "example": "(917) 555-2843",
                "description": "Phone number for contacting the applicant."
              },
              "applicant_email": {
                "default": "",
                "example": "john.smith@gmail.com",
                "description": "Email address of the applicant."
              },
              "carrier_name": {
                "default": "",
                "example": "Tower Insurance Company",
                "description": "Name of the insurance carrier providing the policy."
              },
              "policy_number": {
                "default": "",
                "example": "10",
                "description": "Unique identifier for the insurance policy."
              },
              "effective_date": {
                "default": "",
                "example": "2023-01-01",
                "description": "Date when the insurance policy becomes effective."
              },
              "expiration_date": {
                "default": "",
                "example": "2024-01-01",
                "description": "Date when the insurance policy expires."
              }
           }
        } ]
      }
    }
  }
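
If you prefer to submit the request from a script, the same call can be assembled with the Python standard library. The following sketch reuses the URL, version, and headers from the curl command above and only builds the request. The region, token, and body values are placeholders to replace with your own, and `build_extraction_request` is a hypothetical helper name.

```python
import json
import urllib.request

API_VERSION = "2024-10-18"  # version query parameter from the curl example


def build_extraction_request(region, bearer_token, body):
    """Build, but do not send, the POST request shown in the curl example."""
    url = (
        f"https://{region}.ml.cloud.ibm.com/ml/v1/text/extractions"
        f"?version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {bearer_token}",
        },
        method="POST",
    )


# Placeholder values: replace them with your region, token, and full request body.
request = build_extraction_request(
    region="your-region",
    bearer_token="your-token",
    body={"project_id": "your-project-id"},
)
print(request.get_method(), request.full_url)

# To send the request, uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Serializing the body with `json.dumps` also guards against syntax slips such as trailing commas, which strict JSON parsers reject.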

Best practices

Use the following best practices to write effective schemas:

Field naming conventions:
- Use underscores to separate words. For example, use applicant_name instead of applicantName.
- Keep names short but descriptive.
- For fields in sections, use the format [section_name]_[field_name].
- For table fields, use the format [table_name]_row_[row_number]_[column_name].
Write effective descriptions:
- Be specific about where on the document the information is located.
- Do not include instructions that change the format of the values such as dates or numbers.
- Mention any labels or headings that identify the field.
- Note any special cases or variations.

Additional prompt instructions: Use foundation model prompt instructions to improve extraction accuracy for specific document types. For example, "Preserve number formatting as seen in the image."

Troubleshooting

The following table describes some common issues when you use a custom schema and how to resolve them:

Symptom	Cause	Solution
No values returned	Description is too vague.	Make the description more specific. Mention the visual location or nearby labels. Include an instruction to return the text as-is without formatting changes.
Wrong value extracted	Ambiguous field names such as “Name”.	Use qualified names such as `agency_name` or `applicant_name`.

Learn more

watsonx.ai API reference documentation
