Streaming record and entity data changes (IBM Match 360)
Configure a master data event stream to propagate changes in your record and entity data directly to downstream systems through a connected Apache Kafka server.
Stream record and entity data changes to ensure that your users and systems always have the most up-to-date master data. You can achieve near real-time data synchronization between IBM Match 360 and different endpoints by using the master data event streaming capability and an Apache Kafka connection. Typical target endpoints are downstream systems that need to synchronize the trusted golden view of mastered data for different analytical and business use cases.
To achieve event streaming, IBM Match 360 uses IBM Cloud Pak for Data connection services. Specifically, it uses the Apache Kafka connection asset to connect to external Kafka clusters and IBM Event Streams. IBM Match 360 supports all of the Kafka variants that the Apache Kafka connection supports. For more information about the Apache Kafka connection, see Apache Kafka connection.
IBM Match 360 supports streaming through Apache Kafka to your target endpoints. You cannot stream data from an external source system to IBM Match 360 through the same mechanism. However, you can achieve a custom inbound data streaming configuration by using DataStage, an Apache Kafka connection, and the IBM Match 360 ongoing synchronization API methods. For more information about this configuration, see the external blog post Real time data ingestion to IBM Match 360 using core platform capabilities.
In this topic:
- Types of master data events that can be streamed
- Configuring master data event streaming
- Enabling logging for master data event streaming
For more details about master data streaming events, such as message templates and examples, see Master data event streaming message template.
Types of master data events that can be streamed
Downstream systems often need to synchronize their data with the most current master data provided by IBM Match 360. By using the event streaming capability, you can subscribe to data change events at the entity or record level.
IBM Match 360 is a live system. Any time a record is added, updated, or deleted, a record change event is created. When the matching engine runs, record changes are included in the matching process and can affect entities. Each time matching runs, IBM Match 360 creates entity change events to capture newly created entities, changes to the membership of existing entities, or updates to an entity's composite attribute values.
- Changes to entity data that trigger streaming events

  The member records of a master data entity can change when:
  - Matching runs after a change in record data (add, update, or delete).
  - Matching runs after a data engineer updates the matching algorithm configuration.
  - A data steward manually links or unlinks records.

  The attribute values of a master data entity can change when:
  - A data steward manually updates an entity attribute value.
  - Changes to the attributes of the entity's member records cause different values to be selected by attribute composition rules. For more information about attribute composition rules, see Defining attribute composition rules.

  An entity can be deleted when it no longer has any member records.
- Changes to record data that trigger streaming events

  Record data changes when:
  - Records are added to IBM Match 360.
  - Records are updated.
  - Records are deleted.
If you create a streaming subscription for entity change events, the underlying record change events are also included in the stream. However, there are some scenarios where you might want to stream only record data, such as:
- If your custom data model includes record types that aren't associated with any entity types.
- If your entity streaming subscription includes a source level filter to include or exclude certain entity types.
- If you want to process record change events separately from entity change events by sending them to a different Kafka topic.
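Because record and entity events can flow to different topics, a downstream consumer typically routes the two event types to different handlers. The sketch below illustrates that dispatch pattern; the message field names (`event_type`, `record_id`, `entity_id`) are illustrative assumptions only, so check the actual schema in Master data event streaming message template before relying on them.

```python
import json

# Hypothetical event messages. The real message schema is described in
# "Master data event streaming message template"; these field names are
# assumptions for illustration only.
record_event = {"event_type": "record_change", "operation": "update", "record_id": "r-101"}
entity_event = {"event_type": "entity_change", "operation": "link", "entity_id": "e-7"}

def route_event(raw_message: bytes) -> str:
    """Dispatch a streamed event to the matching handler by event type."""
    event = json.loads(raw_message)
    if event["event_type"] == "entity_change":
        return f"entity handler: {event.get('entity_id')}"
    if event["event_type"] == "record_change":
        return f"record handler: {event.get('record_id')}"
    raise ValueError(f"unexpected event type: {event['event_type']}")

print(route_event(json.dumps(record_event).encode()))  # record handler: r-101
print(route_event(json.dumps(entity_event).encode()))  # entity handler: e-7
```

In a real consumer, `raw_message` would be the value of a message read from the subscription's Kafka topic.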
Configuring master data event streaming
Configure a new streaming subscription to start synchronizing your master data entities and records with downstream systems.
The master data event streaming capability supports the following connection types and security types.
Connection type | Security type |
---|---|
Apache Kafka and other vendor-specific variants of Kafka | None, SSL, SASL_SSL, SCRAM-256, SCRAM-512 |
IBM Event Streams | SASL_SSL, SCRAM-256, SCRAM-512 |
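On the consumer side, the security types in the table above translate into standard Kafka client settings. The following sketch maps each security type to illustrative connection properties, using kafka-python parameter names as an assumption; adapt the keys to whichever Kafka client library you use.

```python
def kafka_client_config(security_type: str, username: str = "", password: str = "") -> dict:
    """Build an illustrative Kafka client configuration for the security
    types supported by master data event streaming. Parameter names follow
    kafka-python conventions (an assumption; other clients differ)."""
    if security_type == "None":
        return {"security_protocol": "PLAINTEXT"}
    if security_type == "SSL":
        return {"security_protocol": "SSL"}
    if security_type in ("SASL_SSL", "SCRAM-256", "SCRAM-512"):
        # SCRAM variants are SASL mechanisms carried over SASL_SSL.
        mechanism = {
            "SASL_SSL": "PLAIN",
            "SCRAM-256": "SCRAM-SHA-256",
            "SCRAM-512": "SCRAM-SHA-512",
        }[security_type]
        return {
            "security_protocol": "SASL_SSL",
            "sasl_mechanism": mechanism,
            "sasl_plain_username": username,
            "sasl_plain_password": password,
        }
    raise ValueError(f"unsupported security type: {security_type}")

print(kafka_client_config("SCRAM-512", "user", "pw")["sasl_mechanism"])  # SCRAM-SHA-512
```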
To enable the IBM Match 360 master data event streaming capability:
1. Create a Cloud Pak for Data Apache Kafka connection asset in a project or catalog:
a. Create a project or catalog to handle all of your connection assets. For information about creating a project, see Creating a project. For information about creating a catalog, see Creating a catalog.
b. In your project or catalog, go to the Manage tab. In the General section, copy the project ID (for a project) or the catalog ID (for a catalog). You'll need this ID to use as the container ID when creating your streaming subscription.
c. Go to the Assets tab, and click New asset > Connection.
d. From the list of connectors, click Apache Kafka, then click Select.
e. Enter the Kafka connection information such as a name, description, and target Kafka server host name. Ensure that the following configuration items are set correctly:
- Set the Credentials option to Shared.
- Disable the Mask sensitive credentials retrieved through API calls option.
- Select a supported connection type and security type for your connection.
2. Get the Apache Kafka connection asset ID and your project or catalog ID (container ID). You need these IDs as input when creating a master data event streaming subscription. There are two ways to get these IDs:
- From the asset URL: Open the Apache Kafka connection asset that you created in your project or catalog. Copy the asset ID and container ID from the URL bar in your browser. Refer to the following example:
https://cpd-namespace.apps.samplecp4d.cp.example.com/connections/<ASSET ID>?project_id=<CONTAINER ID>&context=icp4data
- By using the connection API: Run the following curl command:

  curl --location --request GET 'https://api.dataplatform.cloud.ibm.com/v2/connections?limit=100&entity.datasource_type=f13bc9b7-4a46-48f4-99c3-01d943334ba7&project_id=xxx&userfs=false' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer xxx'
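The response of that API call is a JSON list of connection assets, from which you can pick out the asset ID. The small helper below shows one way to do that; the response shape used here (`resources[].metadata.asset_id` and `resources[].entity.name`) is an assumption based on typical v2 connections responses, so verify it against the actual output from your deployment.

```python
import json

# Sample response in the shape the /v2/connections endpoint is assumed
# to return; verify the real field names against your deployment.
sample_response = json.dumps({
    "resources": [
        {
            "metadata": {"asset_id": "a1b2c3d4-0000-1111-2222-333344445555"},
            "entity": {"name": "my-kafka-connection"},
        }
    ]
})

def find_connection_asset_id(response_text: str, name: str) -> str:
    """Return the asset ID of the connection asset with the given name."""
    for resource in json.loads(response_text).get("resources", []):
        if resource.get("entity", {}).get("name") == name:
            return resource["metadata"]["asset_id"]
    raise LookupError(f"no connection named {name!r}")

print(find_connection_asset_id(sample_response, "my-kafka-connection"))
```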
3. Use the IBM Match 360 API to create your master data streaming subscription. Use the following API methods in the model-ms microservice to create, update, or delete a subscription:
- Create method: GET /mdm/v1/event_subscription
- Update method: PUT /mdm/v1/event_subscription
- Delete method: DELETE /mdm/v1/event_subscription
For information about using the IBM Match 360 API, see IBM Match 360 API reference documentation.
Example event_subscription payload:
{
  "filter": ["person_entity"],
  "event_type": "entity_change",
  "created_user": "user123",
  "last_update_user": "user123",
  "stream_connection": {
    "stream_type": "Kafka",
    "asset_scope": "Project",
    "topic": "PersonEntityTopic",
    "asset_id": "<ASSET-ID>",
    "container_id": "<CONTAINER-ID>"
  },
  "subscription_description": "Create PersonEntityRecordSub event subscription SASL SSL EventStream",
  "subscription_name": "PersonEntityRecordSubSASLSSLEventStream",
  "active": true,
  "created_date": "1680297619428",
  "last_update_date": "1680297619428"
}
Parameter | Value |
---|---|
filter | The filter for the selected event type. This filters the valid entity types or record types, depending on the value of the event_type parameter. |
event_type | The type of event being streamed in this subscription. Supported values are record_change or entity_change . |
created_user | The user who created this subscription. |
last_update_user | The user who most recently updated this subscription. |
stream_connection.stream_type | The supported stream type. The valid value is Kafka . |
stream_connection.asset_scope | Defines whether the Kafka connection is scoped to a Project or Catalog . |
stream_connection.topic | The name of the Kafka topic to which the events are published. |
stream_connection.asset_id | The asset ID of the Apache Kafka connection asset. |
stream_connection.container_id | The container ID of the Apache Kafka connection asset. This is the project ID or catalog ID. |
subscription_description | A description of this subscription. |
subscription_name | The name of this subscription. |
active | The indicator of whether this subscription is active. If set to true , the events are streamed. If set to false , no events are streamed. |
created_date | The date this subscription was created. |
last_update_date | The date this subscription was most recently updated. |
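Before calling the subscription API, it can help to assemble and sanity-check the payload programmatically. The sketch below builds the same fields as the example payload above and validates the values that the parameter table constrains. The `build_subscription` helper is illustrative only and is not part of the IBM Match 360 API.

```python
import json
import time

VALID_EVENT_TYPES = {"record_change", "entity_change"}

def build_subscription(name: str, description: str, event_type: str,
                       filters: list, topic: str, asset_id: str,
                       container_id: str, user: str,
                       asset_scope: str = "Project") -> dict:
    """Assemble an event_subscription payload with the documented fields."""
    if event_type not in VALID_EVENT_TYPES:
        raise ValueError(f"event_type must be one of {sorted(VALID_EVENT_TYPES)}")
    if asset_scope not in ("Project", "Catalog"):
        raise ValueError("asset_scope must be 'Project' or 'Catalog'")
    now = str(int(time.time() * 1000))  # epoch milliseconds, as in the example
    return {
        "filter": filters,
        "event_type": event_type,
        "created_user": user,
        "last_update_user": user,
        "stream_connection": {
            "stream_type": "Kafka",
            "asset_scope": asset_scope,
            "topic": topic,
            "asset_id": asset_id,
            "container_id": container_id,
        },
        "subscription_description": description,
        "subscription_name": name,
        "active": True,
        "created_date": now,
        "last_update_date": now,
    }

payload = build_subscription(
    "PersonEntitySub", "Stream person entity changes", "entity_change",
    ["person_entity"], "PersonEntityTopic", "<ASSET-ID>", "<CONTAINER-ID>", "user123")
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the request body of the subscription API call.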
Enabling logging for master data event streaming
To enable WebSphere Liberty Profile logging for the IBM Match 360 event streaming capability:
1. From the OpenShift console, go to Administration > CustomResourceDefinitions. Search for and select the MasterDataManagement custom resource definition (CRD).
2. Select the Instances tab, then select mdm-cr and open the YAML tab.
3. To enable logging parameters, add the following lines to the mdm-cr YAML:

   wlp:
     logging:
       trace:
         specification: "com.ibm.mdmx.common.events.*=all:com.ibm.entity.matching.operational.core.streaming.*=all"
4. Click Save, then reload the page to ensure that the parameters were added correctly. After you enable logging, it takes some time for the mdm-cr to get resolved. Wait for the Cloud Pak for Data instance to show as enabled.
5. Review the logs to check whether IBM Match 360 successfully sends event messages to your Kafka endpoint.
Learn more
- Configuring event streaming in IBM Match 360
- IBM Match 360 API reference documentation
- Apache Kafka connection
- Adding data from a connection to a project
Parent topic: Configuring master data