Streaming record and entity data changes (IBM Match 360)
Configure a master data event stream to propagate changes in your record and entity data directly to downstream systems through a connected Apache Kafka server.
Stream record and entity data changes to ensure that your users and systems always have the most up-to-date master data. You can achieve near real-time data synchronization between IBM Match 360 and different endpoints by using the master data event streaming capability and an Apache Kafka connection. Typical target endpoints are downstream systems that need to synchronize the trusted golden view of mastered data for different analytical and business use cases.
To achieve event streaming, IBM Match 360 uses IBM Cloud Pak for Data connection services. Specifically, it uses the Apache Kafka connection asset to connect to external Kafka clusters and IBM Event Streams. IBM Match 360 supports all of the Kafka variants that the Apache Kafka connection supports. For more information about the Apache Kafka connection, see Apache Kafka connection.
IBM Match 360 supports streaming through Apache Kafka to your target endpoints. You cannot stream data from an external source system into IBM Match 360 through the same mechanism. However, you can achieve a custom inbound data streaming configuration by using DataStage, an Apache Kafka connection, and the IBM Match 360 ongoing synchronization API methods. For more information about this configuration, see the external blog post Real time data ingestion to IBM Match 360 using core platform capabilities.
For more details about master data streaming events, such as message templates and examples, see Master data event streaming message template.
Types of master data events that can be streamed
Downstream systems often need to synchronize their data with the most current master data provided by IBM Match 360. By using the event streaming capability, you can subscribe to data change events at the entity or record level.
IBM Match 360 is a live system. Any time a record is added, updated, or deleted, a record change event is created. When the matching engine runs, record changes are included in the matching process and can affect entities. Any time matching runs, IBM Match 360 creates entity change events to capture newly created entities, changes to the membership of existing entities, or updates to an entity's composite attribute values.
Changes to entity data that trigger entity streaming events
Entity data changes when:
- There are changes to the entity's member records' attributes that cause different values to be selected by attribute composition rules. For more information about attribute composition rules, see Defining attribute composition rules.
- A data steward manually adds an entity attribute to an entity.
- A data steward manually updates an entity attribute in an entity.
Remember: An entity attribute is an attribute whose values are stored directly in the entity, as opposed to being derived from its member records.
The member records of a master data entity can change when:
- Matching runs after a change in record data (add, update, or delete).
- Matching runs after a data engineer updates the matching algorithm configuration.
- A data steward manually links or unlinks records.
An entity can be deleted when it no longer has any member records.
Changes to record data that trigger record streaming events
Record data changes when:
- Records are added to IBM Match 360.
- Record values are updated.
- Record values are deleted.
If you create a streaming subscription for entity change events, the underlying record change events are also included in the stream. However, there are some scenarios where you might want to stream only record data, such as:
- If your custom data model includes record types that aren't associated with any entity types.
- If your entity streaming subscription includes a source level filter to include or exclude certain entity types.
- If you want to process record change events separately from entity change events by sending them to a different Kafka topic.
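For example, to stream only record changes to a dedicated topic, you can create a second subscription whose event_type is record_change and whose filter lists record types instead of entity types. A minimal sketch of the distinguishing fields, assuming a hypothetical person record type and topic name (the full payload format is shown in the setup section later in this topic):
{
  "event_type": "record_change",
  "filter": ["person"],
  "stream_connection": {
    "topic": "PersonRecordTopic"
  }
}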
Setting up master data event streaming
Configure a new streaming subscription to start synchronizing your master data entities and records with downstream systems.
The master data event streaming capability supports the following connection types and security types.
| Connection type | Security type |
|---|---|
| Apache Kafka and other vendor-specific variants of Kafka | None, SSL, SASL_SSL, SCRAM-256, SCRAM-512 |
| IBM Event Streams | SASL_SSL, SCRAM-256, SCRAM-512 |
To enable the IBM Match 360 master data event streaming capability:
1. Create a Cloud Pak for Data Apache Kafka connection asset in a project or catalog:
   a. Create a project or catalog to handle all of your connection assets. For information about creating a project, see Creating a project. For information about creating a catalog, see Creating a catalog.
   b. In your project or catalog, go to the Manage tab. In the General section, copy the project ID (for a project) or the catalog ID (for a catalog). You'll need this ID to use as the container ID when creating your streaming subscription.
   c. Go to the Assets tab, and click New asset > Connect to a data source.
   d. From the list of connectors, click Apache Kafka, then click Select.
   e. Enter the Kafka connection information, such as a name, description, and target Kafka server host name. Ensure that the following configuration items are set correctly:
      - Set the Credentials option to Shared.
      - Disable the Mask sensitive credentials retrieved through API calls option.
      - Select a supported connection type and security type for your connection.
2. Generate an API key to authenticate to Cloud Pak for Data APIs. For more information, see Generating an API authorization token.
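   For example, on Cloud Pak for Data the ZenApiKey authorization token that is used in the curl commands later in this procedure is the base64 encoding of your user name and API key, separated by a colon. A minimal sketch, assuming that CPD_USERNAME and CPD_APIKEY are placeholder environment variables that hold your own credentials:
   # Build the ZenApiKey authorization token from a Cloud Pak for Data user name and API key.
   # CPD_USERNAME and CPD_APIKEY are placeholder names for your own credentials.
   export TOKEN=$(echo -n "${CPD_USERNAME}:${CPD_APIKEY}" | base64)
   # Later requests then send the header: Authorization: ZenApiKey ${TOKEN}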
3. Get the Apache Kafka connection asset ID and your project or catalog ID (container ID). You need these IDs to use as input when creating a master data event streaming subscription. There are two ways you can get these IDs:
   - From the asset URL: Open the Apache Kafka connection asset that you created in your project or catalog. Copy the asset ID and container ID from the URL bar in your browser. Refer to the following example:
     https://cpd-namespace.apps.samplecp4d.cp.example.com/connections/<ASSET ID>?project_id=<CONTAINER ID>&context=icp4data
   - By using the connection API: Run the following curl command:
     curl --location --request GET 'https://api.dataplatform.cloud.ibm.com/v2/connections?limit=100&entity.datasource_type=f13bc9b7-4a46-48f4-99c3-01d943334ba7&project_id=xxx&userfs=false' -H 'Accept: application/json' -H "Authorization: ZenApiKey ${TOKEN}"
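     If you have jq installed, you can pull just the IDs and names out of the response. A minimal sketch, assuming that the list response returns the connection assets under resources, with the asset ID in metadata.asset_id and the name in entity.name:
     # List the Apache Kafka connection assets and print each one's asset ID and name.
     curl -s 'https://api.dataplatform.cloud.ibm.com/v2/connections?limit=100&entity.datasource_type=f13bc9b7-4a46-48f4-99c3-01d943334ba7&project_id=xxx&userfs=false' \
       -H 'Accept: application/json' -H "Authorization: ZenApiKey ${TOKEN}" \
       | jq '.resources[] | {asset_id: .metadata.asset_id, name: .entity.name}'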
4. Use the IBM Match 360 API to create your master data streaming subscription. Use the following API methods in the model-ms microservice to create, update, or delete a subscription:
   - Create method: POST /mdm/v1/event_subscription
   - Update method: PUT /mdm/v1/event_subscription
   - Delete method: DELETE /mdm/v1/event_subscription
For information about using the IBM Match 360 API, see IBM Match 360 API reference documentation.
Example event_subscription payload:
{
"filter": ["person_entity"],
"event_type": "entity_change",
"created_user": "user123",
"last_update_user": "user123",
"stream_connection": {
"stream_type": "Kafka",
"asset_scope": "Project",
"topic": "PersonEntityTopic",
"asset_id": "<ASSET-ID>",
"container_id": "<CONTAINER-ID>"
},
"subscription_description": "Create PersonEntityRecordSub event subscription SASL SSL EventStream",
"subscription_name": "PersonEntityRecordSubSASLSSLEventStream",
"active": true,
"created_date": "1680297619428",
"last_update_date": "1680297619428"
}
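For example, the following curl sketch submits this payload to the create method. The Cloud Pak for Data URL, the payload file name subscription.json, and the exact query parameters that identify your IBM Match 360 instance are assumptions here; see the IBM Match 360 API reference documentation for the full request requirements.
# Minimal sketch: create the event streaming subscription by sending the example payload above.
# <CPD_URL> is a placeholder for your Cloud Pak for Data URL, and subscription.json is assumed
# to contain the payload shown above. Add any instance identifier that your deployment
# requires (for example, a CRN query parameter) as described in the API reference.
curl --location --request POST 'https://<CPD_URL>/mdm/v1/event_subscription' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H "Authorization: ZenApiKey ${TOKEN}" \
  --data-binary @subscription.json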
| Parameter | Description |
|---|---|
| filter | The filter for the selected event type. This filters the valid entity types or record types, depending on the value of the event_type parameter. |
| event_type | The type of event being streamed in this subscription. Supported values are record_change or entity_change. |
| created_user | The user who created this subscription. |
| last_update_user | The user who most recently updated this subscription. |
| stream_connection.stream_type | The supported stream type. The valid value is Kafka. |
| stream_connection.asset_scope | Defines whether the Kafka connection is scoped to a Project or Catalog. |
| stream_connection.topic | The name of the Kafka topic to which the events are published. |
| stream_connection.asset_id | The asset ID of the Apache Kafka connection asset. |
| stream_connection.container_id | The container ID of the Apache Kafka connection asset. This is the project ID or catalog ID. |
| subscription_description | A description of this subscription. |
| subscription_name | The name of this subscription. |
| active | Indicates whether this subscription is active. If set to true, events are streamed. If set to false, no events are streamed. |
| created_date | The date this subscription was created. |
| last_update_date | The date this subscription was most recently updated. |
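After the subscription is active, change events are published to the configured Kafka topic. One way to verify that events are arriving is to read the topic with the standard Kafka console consumer. A minimal sketch, assuming the topic name PersonEntityTopic from the example payload and a bootstrap server address supplied by your Kafka administrator; a secured cluster also needs a client properties file passed with --consumer.config:
# Read the subscription's topic to confirm that change events are arriving.
# <KAFKA_BOOTSTRAP_SERVER> is a placeholder for your Kafka bootstrap server address.
kafka-console-consumer.sh \
  --bootstrap-server <KAFKA_BOOTSTRAP_SERVER> \
  --topic PersonEntityTopic \
  --from-beginning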
Learn more
- Configuring event streaming in IBM Match 360
- Event streaming message reconciliation
- Enabling logging for event streaming
- IBM Match 360 API reference documentation
- Apache Kafka connection
- Adding data from a connection to a project
Parent topic: Configuring master data