Document Processing webhooks with Content Platform Engine

Document Processing generates a custom Document Processing Finalization Event. This event can be used with the Content Platform Engine event Webhooks framework to send notifications to webhook applications when documents are finalized.

Additionally, the Content Services GraphQL API supports retrieving document annotations. The annotation support allows webhook applications to retrieve the raw extracted data associated with a processed document.

Prerequisites

The Document Processing extensions require the Content Platform Engine webhook framework to be installed before they can be used. To install it, use the Administration Console for Content Platform Engine to add the optional add-on component 5.5.4 Event-Driven External Service Invocation Extensions to the target repository. For more information, refer to the topic Installing an add-on feature to an object store.

An Event Action instance of the webhook External Event Action class must also be created by using the WebhookEventActionHandler as described in the webhook documentation.

Subscribing to a finalization event

The finalization event is triggered when the document processing system completes processing of a document. This occurs when the document is assigned to the target document type and any properties are extracted to the document metadata. To fire a webhook event on finalization, a subscription must be created on the target class with the following values:
  • Scope: Applies to this instance of this class or the metadata for this class
  • Event: Document Processing Finalization Event
  • Event Action: <Webhook Event Action Instance>

Retrieving a document annotation

When a finalization event is fired on a subscribed document type, a webhook notification is sent to the webhook receiver. The receiver can then read the JSON payload, which includes the objectStoreId and sourceObjectId associated with the event. These values can be used in a subsequent GraphQL request to obtain the extracted data, which is stored as an annotation on the source document.
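As an illustration of the receiver side, the following minimal Python sketch pulls the two identifiers out of a notification body. It assumes that objectStoreId and sourceObjectId appear as top-level keys of the JSON payload; the actual payload layout may differ, and the function name extract_event_ids is hypothetical.

```python
import json


def extract_event_ids(payload_text):
    """Return (object_store_id, source_object_id) from a webhook body.

    Assumption: both fields are top-level keys of the JSON payload.
    Adjust the lookups if the real payload nests them differently.
    """
    payload = json.loads(payload_text)
    missing = [key for key in ("objectStoreId", "sourceObjectId")
               if key not in payload]
    if missing:
        raise ValueError("webhook payload is missing: " + ", ".join(missing))
    return payload["objectStoreId"], payload["sourceObjectId"]
```

Failing loudly on a missing field keeps a malformed notification from turning into a GraphQL query with an empty filter.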

A finalized document can have multiple annotations in its collection. Generally, the most recently created annotation of type Document Processing Annotation contains the wanted information. Refer to Example 1a for a GraphQL query that retrieves the downloadUrl from a document's annotations collection, and to Example 1b for the corresponding response.

Example 1a: GraphQL Annotation Query Request

{
  repositoryObjects(
    repositoryIdentifier: "DevOS1"
    from: "DbaCaptureAnnotation"
    where: "AnnotatedObject = {1E3E075A-A8ED-4273-8A93-93C39B84199A}"
    orderBy: "DateCreated DESC"
    pageSize: 1) 
  {
    independentObjects {
      objectReference {
        repositoryIdentifier
        classIdentifier
        identifier
      }
      properties(includes: ["DateCreated", "Id", "contentElements"]) {
        label
        id
        value
        ... on ObjectListProperty {
          objectListValue {
            ... on DependentObject {
              ... on ContentTransfer {
                retrievalName
                downloadUrl
              }
            }
          }
        }
      }
    }
  }
}
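
A webhook receiver would normally fill in the repository and the annotated object ID from the notification instead of hard-coding them. The following Python sketch shows one way to build the request body for the query in Example 1a. The function name build_annotation_request and the {"query": ...} JSON envelope are assumptions based on common GraphQL-over-HTTP practice, and the sketch assumes the source object ID arrives in the braced GUID form used in the example.

```python
import json
import re

# Braced GUID, for example {1E3E075A-A8ED-4273-8A93-93C39B84199A}.
GUID_PATTERN = re.compile(
    r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}")

# Same query shape as Example 1a, with two placeholders.
QUERY_TEMPLATE = """{
  repositoryObjects(
    repositoryIdentifier: %(repo)s
    from: "DbaCaptureAnnotation"
    where: %(where)s
    orderBy: "DateCreated DESC"
    pageSize: 1)
  {
    independentObjects {
      properties(includes: ["DateCreated", "Id", "contentElements"]) {
        id
        value
        ... on ObjectListProperty {
          objectListValue {
            ... on DependentObject {
              ... on ContentTransfer {
                retrievalName
                downloadUrl
              }
            }
          }
        }
      }
    }
  }
}"""


def build_annotation_request(repository_identifier, source_object_id):
    """Return a JSON request body that asks for the newest annotation."""
    # Reject anything that is not a braced GUID so that untrusted input
    # cannot alter the where clause.
    if not GUID_PATTERN.fullmatch(source_object_id):
        raise ValueError("source_object_id must be a braced GUID")
    where = "AnnotatedObject = " + source_object_id
    # json.dumps yields a double-quoted, escaped string literal, which is
    # also valid GraphQL string syntax.
    query = QUERY_TEMPLATE % {
        "repo": json.dumps(repository_identifier),
        "where": json.dumps(where),
    }
    return json.dumps({"query": query})
```

The returned string can be sent as the body of an HTTP POST to the GraphQL endpoint of the deployment; the endpoint URL and authentication scheme depend on the installation and are not shown here.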

Example 1b: GraphQL Annotation Query Response

{
  "data": {
    "repositoryObjects": {
      "independentObjects": [
        {
          "objectReference": {
            "repositoryIdentifier": "DevOS1",
            "classIdentifier": "DbaCaptureAnnotation",
            "identifier": "{E0414476-0000-C518-93E8-F64DE1B871AC}"
          },
          "properties": [
            {
              "label": "Date Created",
              "id": "DateCreated",
              "value": "2020-12-08T21:29:02.607Z"
            },
            {
              "label": "ID",
              "id": "Id",
              "value": "{E0414476-0000-C518-93E8-F64DE1B871AC}"
            },
            {
              "label": "Content Elements",
              "id": "ContentElements",
              "value": null,
              "objectListValue": [
                {
                  "retrievalName": "content.json",
                  "downloadUrl": "/content?repositoryIdentifier=DevOS1&annotationId={E0414476-0000-C518-93E8-F64DE1B871AC}&elementSequenceNumber=0"
                }
              ]
            }
          ]
        }
      ]
    }
  }
}
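
To show how a receiver might consume a response shaped like Example 1b, the following Python sketch walks the parsed JSON and collects each content element's retrievalName and downloadUrl. The function name find_download_urls is hypothetical, and the sketch assumes the response has already been parsed into a dictionary and follows the structure shown above.

```python
def find_download_urls(response):
    """Return (retrieval_name, download_url) pairs from a parsed response.

    Assumption: the response follows the structure of Example 1b, with
    content elements listed under the ContentElements property.
    """
    results = []
    objects = response["data"]["repositoryObjects"]["independentObjects"]
    for obj in objects:
        for prop in obj.get("properties", []):
            if prop.get("id") != "ContentElements":
                continue
            # objectListValue may be absent or null when there is no content.
            for element in prop.get("objectListValue") or []:
                url = element.get("downloadUrl")
                if url:
                    results.append((element.get("retrievalName"), url))
    return results
```

In Example 1b the downloadUrl value is a relative path, so a receiver would need to resolve it against the base URL of its own deployment before downloading content.json.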