Document Processing event webhooks
A webhook is a way for IBM Automation Document Processing to provide near real-time information to other interested applications or services. When the subscribed event occurs, Document Processing makes an HTTP POST request to the URL that is configured for the webhook.
Overview
Webhooks are user-defined HTTP callbacks that are made with HTTP POST. They provide a loosely coupled means of integration between different services. Document Processing supports making such callbacks when the document processing completes.
For a webhook implementation to receive these event notifications, it must be registered with the Document Processing project that is the source of the intended events. Registration includes providing a callback URL, authentication information, and signature information. When the event happens, Document Processing makes an HTTP request to the webhook URL with the event payload as JSON and the security tokens. For example, when document processing completes, Document Processing can notify users through the webhook.
You register, update, or delete a webhook in the settings of your project in Document Processing Designer. For more information, see Configuring your project settings.
Payload
After the event occurs in Document Processing, the server looks at the current subscriptions. If there is any subscription for the event, the server serializes the event into JSON and sends it as an HTTP POST payload to the webhook receiver.
- eventId
- The ID of an event. The webhook can use this ID to check for duplicate receipts of the event payload.
- eventDateTime
- The date and time of the event, in Coordinated Universal Time (UTC).
- eventType
- The type of the webhook event.
- webhookId
- The unique ID of the webhook, which is generated when you register the webhook.
- sourceObjectId
- The event source object ID. Webhooks can use this ID to retrieve the object that caused this event.
- properties
- The detailed information, which might be different for each event type:
  - DocumentProcessingComplete content: { status: "success" or "fail", uniqueId: "xxx", analyzerId: "xxxxx" }
  - ClassificationModelTrainingComplete content: { status: "success" or "fail", trainingId: "xxxxx" }
  - ExtractionModelTrainingComplete content: { status: "success" or "fail", trainingId: "xxxxx" }
  - ProjectExportingComplete content: { status: "success" or "fail", actionId: "xxxxx" }
  - ProjectImportingComplete content: { status: "success" or "fail", actionId: "xxxxx" }
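A receiver might map the fields listed above onto a small typed structure before acting on them. The following is a minimal sketch, not a Document Processing API: the `WebhookEvent` class and the `parse_event` function are hypothetical names, and the sketch assumes that the payload keys match the field names in the list.

```python
import json
from dataclasses import dataclass, field


@dataclass
class WebhookEvent:
    """Illustrative container for the documented payload fields."""
    event_id: str
    event_date_time: str
    event_type: str
    webhook_id: str
    source_object_id: str
    properties: dict = field(default_factory=dict)


def parse_event(raw_body: bytes) -> WebhookEvent:
    """Deserialize the JSON body of a webhook POST request."""
    data = json.loads(raw_body)
    return WebhookEvent(
        event_id=data["eventId"],
        event_date_time=data["eventDateTime"],
        event_type=data["eventType"],
        webhook_id=data["webhookId"],
        source_object_id=data["sourceObjectId"],
        # The shape of properties differs for each event type,
        # so it is kept as a plain dictionary here.
        properties=data.get("properties", {}),
    )
```

Keeping `properties` untyped avoids guessing at the per-event structure; a receiver can branch on `event_type` and read only the keys that it needs.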
{
  eventId: "xxxxx",
  eventDateTime: "",
  eventType: "",
  webhookId: "",
  projectId: "xxx",
  sourceObjectId: "xxx",
  receiverRegistrationId: "xxxx",
  properties: {
    ...
  }
}
Security and validation
The webhook request and response happen asynchronously, and the webhook might wait a long time for the desired event. When Document Processing sends the event to the webhook, the server must secure the payload so that the integrity of the data is verifiable and the source of the data is confirmed.
- If the authorizationToken parameter is not blank, it is attached as the Authorization HTTP header.
- If the signature parameter is not blank, the signature of the request body is computed and attached as a Digest HTTP header, for example:
  Digest: SHA-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=
Document Processing sends the request body to the client along with a signature that is generated by using the body and the HMAC cryptography algorithm. Document Processing uses the Hash-based Message Authentication Code (HMAC) to provide both message integrity and authentication. HMAC is based on sharing a private secret key between the webhook and the Document Processing server. During the webhook registration, the credentials property is supplied with the JSON that contains the credential type and credentialSecret:
{
  CredentialType: "HMAC",
  CredentialSecret: "xxxxxxxx"
}
When Document Processing works on the event queue, it generates the HMAC code and adds the code to the HTTP header named HMAC. The payload for the webhook request is in plain text. Upon receiving the POST request from Document Processing, the webhook can read the HTTP header named HMAC and compute the HMAC from the JSON payload and the CredentialSecret that is shared with the Document Processing server.
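The receiver-side computation might look like the following sketch. The text does not state which hash function or encoding Document Processing uses for the HMAC header, so the sketch assumes HMAC-SHA256 with Base64 encoding; treat both as assumptions to confirm against your environment. The function names `compute_hmac` and `hmac_matches` are hypothetical.

```python
import base64
import hashlib
import hmac


def compute_hmac(body: bytes, secret: str) -> str:
    """Compute an HMAC over the raw request body with the shared secret.

    Assumption: HMAC-SHA256, Base64-encoded. The actual algorithm and
    encoding are not specified in this topic.
    """
    mac = hmac.new(secret.encode("utf-8"), body, hashlib.sha256)
    return base64.b64encode(mac.digest()).decode("ascii")


def hmac_matches(body: bytes, secret: str, header_value: str) -> bool:
    """Compare the computed HMAC with the value from the HMAC header."""
    expected = compute_hmac(body, secret)
    # compare_digest runs in constant time, which avoids leaking
    # information about the expected value through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"),
                               header_value.encode("utf-8"))
```

The HMAC is computed over the exact bytes of the request body as received. Parsing the JSON and serializing it again before hashing can change whitespace or key order and cause a mismatch.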
If the computed HMAC is different from the HMAC that was included in the header, the data might have been tampered with or compromised, and the webhook can discard the message and send the appropriate HTTP status back. The Document Processing server retries posting the event after 1 second until it exhausts the retry count or receives the 200 status code from the webhook.
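Because the server keeps retrying until it receives a 200 status code, a receiver might return 200 only for deliveries that it accepts and treat repeated deliveries of the same eventId as already handled. The following sketch illustrates that decision logic under stated assumptions: `handle_delivery` is a hypothetical function, `verify` is a caller-supplied signature check, and the 401 status is an illustrative choice because the text says only "the appropriate HTTP status".

```python
import json
from typing import Callable, Set


def handle_delivery(body: bytes,
                    header_hmac: str,
                    verify: Callable[[bytes, str], bool],
                    seen_event_ids: Set[str]) -> int:
    """Return the HTTP status code that the receiver would send back."""
    if not verify(body, header_hmac):
        # Assumption: 401 is used to reject the message. Any status other
        # than 200 causes the server to retry until the count is exhausted.
        return 401
    event = json.loads(body)
    event_id = event["eventId"]
    if event_id in seen_event_ids:
        # Duplicate delivery: acknowledge it without processing it again.
        return 200
    seen_event_ids.add(event_id)
    # Application-specific processing of the event would go here.
    return 200
```

In this sketch the set of seen IDs is held in memory by the caller; a receiver that runs as more than one process would need shared storage for the same check.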