Data stream
You can use the Data stream feature in IBM Process Mining to add a stream source to an existing process. You can create a data stream connection by using any of the following:
- IBM Event Streams
- Kafka
- IBM MQ
To ensure secure connections to IBM Event Streams and Kafka, the security.protocol property of the applications is set to SASL_SSL, and the sasl.mechanism property is set to PLAIN.
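For reference, these two settings correspond to standard Kafka client security properties. The following sketch shows how they might appear in a client application configuration; it uses the confluent-kafka Python client, and the broker address, group ID, credentials, and topic name are placeholders, not values supplied by IBM Process Mining.

```python
from confluent_kafka import Consumer

# Minimal sketch of a secured Kafka client configuration.
# All values below are placeholders.
config = {
    "bootstrap.servers": "broker-0.example.com:9093",  # placeholder broker
    "group.id": "example-group",                       # placeholder group ID
    "security.protocol": "SASL_SSL",  # matches the Process Mining setting
    "sasl.mechanism": "PLAIN",        # matches the Process Mining setting
    "sasl.username": "example-user",            # placeholder credential
    "sasl.password": "example-password-or-key", # placeholder credential
}

consumer = Consumer(config)
consumer.subscribe(["example-topic"])  # placeholder topic name
```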
For more information, see the following links:
- Creating a data stream connection using IBM Event Streams
- Creating a data stream connection using Kafka
- Creating a data stream connection using IBM MQ
- Managing a data stream connection
Creating a data stream connection using IBM Event Streams
You must have an IBM Event Streams account in IBM Cloud before you can create a data stream connection by using IBM Event Streams. For more information, see IBM Event Streams.
Prerequisites for creating IBM Event Streams connection
- Create an Event Streams account in IBM Cloud. For more information, see Configure Event Streams.
- On the Topics page of the IBM Event Streams account, create a topic to store events in IBM Event Streams. For more information, see Creating a Kafka topic.
Figure 1. Topics for IBM Event Streams
- On the Home page, use the Connect to this service menu to get the configuration properties.
Figure 2. Configuration properties for IBM Event Streams
- Collect the API Key from the Service credentials page.
Figure 3. API key for IBM Event Streams
Procedure
Do the following steps to add a data stream connection by using IBM Event Streams:
- Go to Manage > Data stream.
- On the Data stream page, click Add data stream.
Figure 4. Add data stream
- On the Connection page of the Integrate data stream wizard, do the following steps:
a. Click IBM Event Streams.
b. In the Name field, type a name for the data stream.
c. Optional: In the Summary field, type a description for the connection.
d. In the Topic field, type the name of the topic that you use to receive the messages in IBM Event Streams. For more information, see Figure 1 in the Prerequisites for creating IBM Event Streams connection section.
e. Optional: In the Group ID field, type the username of the connection that you use in IBM Event Streams.
ⓘ Note: The Group ID determines the consumer group to which the consumer belongs. Although it is a mandatory field for creating a data stream connection, in IBM Process Mining you can leave the field blank. In that case, the application uses the name of the selected process as the Group ID.
f. In the IBM Event Streams configuration properties field, type the configuration properties of IBM Event Streams for a secure connection. For more information, see Figure 2 in the Prerequisites for creating IBM Event Streams connection section.
g. In the API key field, type the API key that is used in Service credentials in IBM Event Streams. For more information, see Figure 3 in the Prerequisites for creating IBM Event Streams connection section.
h. Click Verify connection to validate the accuracy of the information typed in all the fields, and then click Next.
Figure 5. Creating an IBM Event Streams connection
- On the Mapping page, do the following steps:
a. Optional: In the Paste your sample message field, type the sample JSON message that you receive from IBM Event Streams. This helps you to map the fields to the correct JSON selector.
b. In the Required mapped pairs section, map the data source columns to the corresponding JSON selector.
c. Optional: Click Add mapped pair to map other data source fields to the corresponding JSON selector.
d. Click Next.
Figure 6. Data mapping for IBM Event Streams connection
ⓘ Note: Including a sample JSON message has the following advantages:
- The JSON selectors that are entered in the Required mapped pairs section are validated against the sample JSON message.
- The extracted values for all fields are displayed automatically, and the application verifies that the mandatory fields are found and are not empty.
A minimal sketch of a sample message and its mapped pairs follows this procedure.
- On the Scheduling page, enter the timeframe for running the data stream, and then click Create.
Figure 7. Scheduling the connection timeframe
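The sample message and the mapped pairs work together: each required data source column is paired with a JSON selector that points at a field in the incoming message. The following sketch illustrates the idea with a hypothetical order-handling event; the field names, selectors, and column names are illustrative only and are not taken from a specific IBM Process Mining data source.

```python
import json

# Hypothetical event, similar to what you might paste into the
# "Paste your sample message" field.
sample_message = """
{
  "orderId": "ORD-1042",
  "activity": "Create Order",
  "timestamp": "2023-05-04T10:15:30Z",
  "agent": "jdoe"
}
"""

# Hypothetical mapped pairs: data source column -> field in the JSON message.
# In the wizard, the right-hand side is expressed as a JSON selector.
mapped_pairs = {
    "Process ID": "orderId",
    "Activity": "activity",
    "Start time": "timestamp",
    "Owner": "agent",  # optional additional pair
}

event = json.loads(sample_message)

# Mimic the check that the wizard performs: every mandatory field must be
# present in the sample message and must not be empty.
for column, selector in mapped_pairs.items():
    value = event.get(selector)
    if value in (None, ""):
        print(f"Column '{column}': selector '{selector}' not found or empty")
    else:
        print(f"Column '{column}': extracted value '{value}'")
```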
Creating a data stream connection using Kafka
Do the following steps to add a data stream connection using Kafka:
- Go to Manage > Data stream.
- On the Data stream page, click Add data stream.
Figure 8. Add a data stream
- On the Connection page of the Integrate data stream wizard, do the following steps:
a. Click Kafka.
b. In the Name field, type a name for the data stream.
c. Optional: In the Summary field, type a description for the connection.
d. In the Topic field, type the name of the Kafka topic that you use to receive the messages. For more information, see Figure 1 in the Prerequisites for creating IBM Event Streams connection section.
e. Optional: In the Group ID field, type a unique name. Note that the value in the Group ID field is a string that uniquely identifies the group of consumer processes to which the consumer belongs.
f. In the Host field, type the host and the port number.
g. In the Kafka properties field, type the configuration properties of Kafka for a secure connection.
h. In the Password field, type the password for Kafka.
i. Click Verify connection to validate the accuracy of the information typed in all the fields, and then click Next.
Figure 9. Creating a data stream connection by using Kafka
- On the Mapping page, do the following steps:
a. Optional: In the Paste your sample message field, type the sample JSON message that you receive from Kafka. This helps you to map the fields to the correct JSON selector.
b. In the Required mapped pairs section, map the data source columns to the corresponding JSON selector.
c. Optional: Click Add mapped pair to map other data source fields to the corresponding JSON selector.
d. Click Next.
Figure 10. Data mapping for Kafka connection
ⓘ Note: When you include a sample JSON message, the JSON selectors that are entered in the Required mapped pairs section are validated with the sample JSON message.
- On the Scheduling page, enter the timeframe for running the data stream, and then click Create.
Figure 11. Scheduling the timeframe for Kafka connection
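After the data stream is active, it consumes JSON messages from the topic that you specified in the preceding procedure. If you want to confirm end to end that messages reach IBM Process Mining, you can publish a test event to the same topic with any Kafka client. The following sketch uses the confluent-kafka Python client; the broker address, credentials, topic name, and message fields are placeholders.

```python
import json
from confluent_kafka import Producer

# Placeholder connection details; use the same host, topic, and credentials
# that you entered in the Integrate data stream wizard.
producer = Producer({
    "bootstrap.servers": "kafka.example.com:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "example-user",
    "sasl.password": "example-password",
})

# Hypothetical event whose fields match the JSON selectors that you mapped
# on the Mapping page.
event = {
    "orderId": "ORD-1043",
    "activity": "Approve Order",
    "timestamp": "2023-05-04T11:02:00Z",
}

# Publish the test event and wait until it is delivered.
producer.produce("example-topic", value=json.dumps(event).encode("utf-8"))
producer.flush()
```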
Creating a data stream connection using IBM MQ
You must have an IBM MQ account in IBM Cloud before you can create a data stream connection by using IBM MQ. For more information, see Getting started on IBM MQ.
Prerequisites for creating IBM MQ connection
- Follow the procedure in Getting started on IBM MQ to create an IBM MQ account in IBM Cloud.
- Download the connection information in JSON format from the IBM MQ account in IBM Cloud.
Figure 12. Connection information for IBM MQ
- Do the following steps to view the list of available queue names:
- On the Queue managers tab of IBM MQ, click the account for which you need to view the available list of queue names.
- On the Administration tab of the selected MQ account, click Launch MQ Console.
- On the MQ Console page, the Queues tab displays the available queues.
Figure 13. Available queues in the Queue manager
ⓘ Note: The queue name that is used in the Procedure section is DEVQUEUE_2.
- Collect the username and password for the MQ connection.
- To view the username for IBM MQ connection, navigate to the Application credentials tab.
Figure 14. Identifying the username for MQ connection
- To view the password for the IBM MQ connection, click the more options button for the selected username, and then click Add new API key.
Figure 15. Identifying the password for MQ connection
Procedure
Do the following steps to add a data stream connection using IBM MQ:
- Go to Manage > Data stream.
- On the Data stream page, click Add data stream.
- On the Connection page of the Integrate data stream wizard, do the following steps:
a. Click IBM MQ.
b. In the Name field, type a name for the data stream.
c. Optional: In the Summary field, type a description for the data stream.
d. Click Drag and drop file here or click to load connection info from json, go to the folder that contains the connection information file, select the file, and then click Open.
e. In the Queue name field, type the queue name for the MQ account. For more information about how to locate the queue name, see Figure 13 in the Prerequisites for creating IBM MQ connection section.
f. In the Username field, type the username of the MQ connection. For more information about how to locate the username, see Figure 14 in the Prerequisites for creating IBM MQ connection section.
g. In the Password field, type the API key for the MQ connection. For more information, see Figure 15 in the Prerequisites for creating IBM MQ connection section.
h. Click Verify connection to validate the connection and then click Next.
Figure 16. Creating a data stream connection using IBM MQ
- On the Mapping page, do the following steps:
a. Optional: In the Paste your sample message field, type the sample JSON message. This helps you to map the fields to the correct JSON selector.
b. In the Required mapped pairs section, map the data source columns to the corresponding JSON selector.
c. Optional: Click Add mapped pair to map other data source fields to the corresponding JSON selector.
d. Click Next.
Figure 17. Data mapping for IBM MQ connection
- On the Scheduling page, enter the timeframe for running the data stream, and then click Create.
Figure 18. Scheduling the timeframe for IBM MQ connection
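The data stream reads JSON messages from the queue that you specified in the Queue name field (DEVQUEUE_2 in this example). To confirm that messages arrive, you can put a test message on the queue with any MQ client. The following sketch uses the pymqi Python library; the queue manager name, channel, host, port, and credentials are placeholders that stand in for the values in the downloaded connection information, and a TLS-enabled channel may require additional SSL configuration that is not shown here.

```python
import json
import pymqi

# Placeholder connection details; substitute the values from the connection
# information JSON that you downloaded from IBM MQ in IBM Cloud.
queue_manager = "QM_EXAMPLE"
channel = "CLOUD.APP.SVRCONN"
conn_info = "example.mq.cloud.ibm.com(31920)"  # host(port)
user = "example-app-user"
password = "example-api-key"

# Hypothetical event whose fields match the JSON selectors that you mapped
# on the Mapping page.
event = {
    "orderId": "ORD-1044",
    "activity": "Ship Order",
    "timestamp": "2023-05-04T12:30:00Z",
}

# Connect, put one test message on the queue, and disconnect.
qmgr = pymqi.connect(queue_manager, channel, conn_info, user, password)
queue = pymqi.Queue(qmgr, "DEVQUEUE_2")  # queue name from the prerequisites
queue.put(json.dumps(event).encode("utf-8"))
queue.close()
qmgr.disconnect()
```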
Managing a data stream connection
Activating a data stream connection
After you create a data stream connection in IBM Process Mining, you must activate it to fetch data at the scheduled timeframe.
Use the following step to activate a data stream connection:
- On the Data stream page, set the toggle button to Active.
Figure 19. Setting the data stream connection to Activate
Fetching data from data stream
The data stream that is created in IBM Process Mining fetches data at the scheduled interval. Hence, it is important to set the minimum values on the Scheduling page of the Integrate data stream wizard.
You can also fetch data whenever it is required. To do so, click the Fetch data now button on the Data stream page.
Figure 20. Fetching the data through the created data stream connection
ⓘ Note: In the figure, imported indicates the total number of JSON messages that are fetched by the application and discarded indicates the total number of JSON messages that do not match the data source of the process.
Analyzing the data fetched from data stream
After the data is fetched, IBM Process Mining automatically analyzes it and indicates the status of the process. The following figure illustrates the differences after the fetched data is analyzed.
Figure 21. Analyzing the data stream connection
The data that is fetched through data streaming is stored as a .csv file in Data source.
Figure 22. Identifying the data in the Data source page
For more information on managing the Data Source page, see Data Source.
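If you have a copy of such an event-log .csv file outside the product, a quick way to inspect it is with pandas. The following sketch is illustrative only: the file name and column names are hypothetical and simply mirror the mapping example used earlier in this topic.

```python
import pandas as pd

# Hypothetical file name and columns; the actual file and columns depend on
# the data source of your process and the mapped pairs that you defined.
event_log = pd.read_csv("exported_data_stream.csv", parse_dates=["Start time"])

print(event_log.head())
print("Cases:", event_log["Process ID"].nunique())
print("Activities:", event_log["Activity"].unique())
```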
Editing and deleting data stream
You can edit or delete a data stream only if it is deactivated.
Use the following steps to edit a data stream:
- On the Data stream page, set the toggle button to Inactive.
- Click the more options button, and then click Edit.
- In the Edit data stream integration dialog, make the required changes, and then click Save.
Use the following steps to delete a data stream:
- On the Data stream page, set the toggle button to Inactive.
- Click the more options button, and then click Delete.
- In the message box, click Delete.