Data intelligence tutorial: Configure a 360-degree view
Take this tutorial to configure a 360-degree view of customers and explore those customers with the Data intelligence use case of the data fabric trial. The goal is to combine customer data with credit score data, resolve entities across the data, and create a consolidated 360-degree view of customers. You then use that view to identify the highest-value customers to target in campaigns and to determine the best rates to offer them.
The story for the tutorial is that Golden Bank wants to run a campaign to offer lower mortgage rates. As a data engineer, you must use IBM Match 360 to set up, map, and model your data for a 360-degree view of the customer.
The following animated image provides a quick preview of what you’ll accomplish by the end of this tutorial. You will set up and add assets to master data, map the data asset attributes, publish the data model and run matching, publish the matched data to a catalog, and then explore and visualize the matched data. Right-click the image and open it in a new tab to view a larger image.
Preview the tutorial
In this tutorial, you can complete the following tasks:
- Set up the prerequisites
- Task 1: Create a catalog for the matched data
- Task 2: Set up and add assets to master data
- Task 3: Map the data asset attributes
- Task 4: Publish the data model and run matching
- Task 5: Publish the matched data to a catalog
- Task 6: Preview your matched data
- Task 7: Tune matching algorithm and run matching
- Task 8: Gain insight on the matching results
- Task 9: Visualize records of entities
- Cleanup (Optional)
Watch this video to preview the steps in this tutorial. There might be slight differences in the user interface shown in the video. The video is intended to be a companion to the written tutorial.
Try the tutorial
Expand each section to complete the task.
Tips for completing this tutorial
Here are some tips for successfully completing this tutorial.
Get help in the community
If you need help with this tutorial, you can ask a question or find an answer in the Cloud Pak for Data Community discussion forum.
Set up your browser windows
For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.
Set up the prerequisites
Editions: Base, Premium, Standard. Unless otherwise noted, this information applies to all editions of IBM Knowledge Catalog.
The following prerequisites are required to complete this tutorial.
Access type | Description | Documentation |
---|---|---|
Services | IBM Match 360, IBM Knowledge Catalog | IBM Match 360, IBM Knowledge Catalog |
Match 360 access | Data Engineer role | Giving users access to IBM Match 360 |
Roles and Permissions | Data Engineer role, Access catalogs permission | Manage roles, Predefined roles and permissions |
Additional access | Editor access to Default Catalog (Optional) | Add collaborators |
Additional configuration | Disable Enforce the exclusive use of secrets | Require users to use secrets for credentials |
Follow these steps to verify your roles and permissions. If your Cloud Pak for Data account does not meet all of the prerequisites, contact your administrator.
-
Click your profile image in the toolbar.
-
Click Profile and settings.
-
Select the Roles tab.
The permissions that are associated with your role (or roles) are listed in the Enabled permissions column. If you are a member of any user groups, you inherit the roles that are assigned to those groups. These roles are also displayed on the Roles tab, and the group from which you inherit the role is specified in the User groups column. If the User groups column shows a dash, the role is assigned directly to you.
Create the sample project
Follow these steps to create the sample project for this tutorial:
-
Download the Master-Data-Management.zip file.
-
From the Navigation menu, choose Projects > All projects.
-
On the Projects page, click New project.
-
Select Local file.
-
Upload the previously downloaded ZIP file.
-
On the Create a project page, type the project name, Master Data Management, and an optional description for the project.
-
Click Create.
-
Click View new project to open the project.
-
Click the Assets tab to verify that the project and assets were created successfully.
Check your progress
The following image shows the sample project. You are now ready to start the tutorial.
Task 1: Create a catalog for the matched data
You need a catalog for the master data and for access to the matched data. You can either use an existing catalog, after verifying that you have the Editor role in it, or create a new catalog.
Option 1: Use the default catalog
Follow these steps to verify that you have the appropriate access to use the default catalog:
-
From the Navigation menu, choose Catalogs > All catalogs.
-
Open the catalog that you wish to use for this tutorial.
-
Click the Access control tab.
-
Verify that your account has the Editor role. If your access is Viewer, then contact your administrator to request Editor access.
Option 2: Create a new catalog
Otherwise, if you have the appropriate role and permissions to create a catalog, you can follow these steps to create a new catalog.
-
On the Catalogs page, click Create Catalog.
-
For the Name, copy and paste the catalog name exactly as shown with no leading or trailing spaces:
Mortgage Approval Catalog
-
Select Enforce data protection rules, confirm the selection, and accept the defaults for the other fields.
-
Click Create to use the default settings. Your new catalog opens.
Check your progress
The following image shows your catalog. Now that you have a catalog, you can set up master data and add the data assets.
Task 2: Set up and add assets to master data
You must add all of the data assets that you want to consolidate to master data. The data can come from sources such as your computer's hard disk or a data asset in a project or catalog.
-
From the Navigation menu, choose Data > Master data.
-
If you need to set up master data, click Set up master data and follow the steps to associate the required project and services with master data. Otherwise, click Go to configuration and continue to the next step.
-
Select your Master Data Management project, accept the default name for the configuration asset, and then click Next.
-
Select your existing catalog, check the Enforce data protection rules option, and then click Next.
-
Accept the default workflow configuration name, and click Finish.
-
Click Continue with configuration to complete the setup.
-
-
Click Add data assets.
-
Click Add data.
-
Insert all three of the data assets in the project:
-
Select the Project tab.
-
Select all three CSV files, Campaign Prospects.csv, Customers.csv, and Experiancc.csv, and then click the Insert Data icon.
-
Click Add data.
-
-
Assign the Person record type to your data assets. Record Type provides information about the type of data that an asset contains. Each asset needs to have an assigned record type so that IBM Match 360 can find the part of the model that best fits the data.
-
Select the checkbox for the three assets, Campaign Prospects.csv, Customers.csv, and Experiancc.csv, and click Set asset properties.
-
In the Select data asset type drop-down list, select the Person data asset type.
-
Click Save.
-
Check your progress
The following image shows the assets added to master data. Now that you set up master data and added the three data assets, you are ready to begin mapping the data asset attributes.
Task 3: Map the data asset attributes
For IBM Match 360 to match all of your data, you must specify which columns of each data set are mapped to specific attributes that are understood by IBM Match 360. Follow these steps to map the data asset attributes.
-
Click the Mapping tab, and then click Got it to begin mapping the columns of your data assets to the appropriate attributes.
-
In the Asset list panel, select Campaign Prospects.csv.
-
In the side panel, click Profile data. Profiling your data is a prerequisite to automatically mapping columns of your data to attributes of the IBM Match 360 data model. Profiling takes 2-5 minutes. A message displays Profiling complete when the data profiling is finished.
-
When profiling is complete, click Automap asset to map columns of your data automatically.
-
Refer to Table 1. Campaign Prospects.csv suggested mapping to manually map every column that has the status Not mapped or that was automapped to a different target than the table suggests. To map a column to an attribute, follow Example 1: Map an existing attribute. To exclude a column, follow Example 2: Exclude columns from mapping.
-
When all of the columns in the asset have a status of either Mapped, Automapped, or Excluded, you see the option to Map next data asset.
-
Repeat Steps 3-5 for the Customers.csv and Experiancc.csv assets. Use the respective tables to map the columns for these data assets to the IBM Match 360 data model as suggested in Table 2: Customers.csv suggested mapping and Table 3: Experiancc.csv suggested mapping. Refer to the examples that explain how to manually map individual attributes. You can either map a column to an existing attribute or exclude columns from mapping.
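The completion rule in the steps above (every column must be Mapped, Automapped, or Excluded before you move to the next asset) can be sketched as a small check. This is only an illustration of the rule, not an IBM Match 360 API: the function name, the dictionary layout, and the sample statuses are all hypothetical.

```python
# Statuses named in the tutorial that count as "done" for a column.
RESOLVED_STATUSES = {"Mapped", "Automapped", "Excluded"}


def unresolved_columns(column_statuses):
    """Return the columns whose status still needs attention, in input order."""
    return [
        column
        for column, status in column_statuses.items()
        if status not in RESOLVED_STATUSES
    ]


# Hypothetical snapshot of Campaign Prospects.csv partway through mapping.
statuses = {
    "Source": "Excluded",
    "ID": "Not mapped",
    "birth_date.value": "Automapped",
    "legal_name.full_name": "Mapped",
    "Lead Quality": "Not mapped",
}

print(unresolved_columns(statuses))
```

An empty list corresponds to the point at which the Map next data asset option appears.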
Example 1: Map an existing attribute
This example explains how to map the legal_name.full_name column in the Campaign Prospects.csv data asset to the existing attribute legal_name.full_name - Legal name - Full name. IBM Match 360 provides some attributes that are commonly associated with customer records that you can choose to map the columns in your data set to.
-
Click the column legal_name.full_name.
-
From the Mapping targets panel, in the search field, type Legal name - Full name.
-
Select Legal name - Full name from the list. The column displays as Mapped and Mapped to: Legal name - Full name.
You can repeat these steps to map other columns of your data assets to existing attributes, whether you created them previously or they are provided by IBM Match 360.
Example 2: Exclude columns from mapping
This example explains how to exclude a column from the data asset mapping. You can exclude columns from the mapping if they are not useful to IBM Match 360 during the matching process or if you do not want to include them in your matched data output.
-
Click the column that is named Source.
-
Toggle the checkbox Exclude column. The column displays as Excluded.
You can repeat these steps to exclude other columns of your data assets.
Table 1. Campaign Prospects.csv suggested mapping
Column | Target | Method |
---|---|---|
Source | Exclude this column from mapping | Exclude column from mapping |
ID | Exclude this column from mapping | Exclude column from mapping |
birth_date.value | Birth date | Map an existing attribute |
gender.value | Gender | Map an existing attribute |
legal_name.full_name | Legal name - Full name | Map an existing attribute |
mobile_telephone.phone_number | Mobile telephone - Phone number | Map an existing attribute |
personal_email.email_id | Personal email - Email address | Map an existing attribute |
Lead Quality | Exclude this column from mapping | Exclude column from mapping |
Table 2. Customers.csv suggested mapping
Column | Target | Method |
---|---|---|
Customer Number | Exclude this column from mapping | Exclude column from mapping |
NAME | Legal name - Full name | Map an existing attribute |
COUNTRY | Exclude this column from mapping | Exclude column from mapping |
STREET_ADDRESS | Primary residence - Address line 1 | Map an existing attribute |
CITY | Primary residence - City | Map an existing attribute |
STATE | Primary residence - State/Province value | Map an existing attribute |
ZIP_CODE | Primary residence - Postal code | Map an existing attribute |
EMAIL_ADDRESS | Personal email - Email address | Map an existing attribute |
PHONE_NUMBER | Home telephone - Phone number | Map an existing attribute |
GENDER | Gender | Map an existing attribute |
CREDITCARD_NUMBER | Exclude this column from mapping | Exclude column from mapping |
Table 3. Experiancc.csv suggested mapping
Column | Target | Method |
---|---|---|
source | Exclude this column from mapping | Exclude column from mapping |
Experian_ID | Exclude this column from mapping | Exclude column from mapping |
birth_date.value | Birth date | Map an existing attribute |
gender.value | Gender | Map an existing attribute |
home_telephone.phone_number | Home telephone - Phone number | Map an existing attribute |
legal_name.given_name | Legal name - Given name | Map an existing attribute |
legal_name.last_name | Legal name - Last name | Map an existing attribute |
mobile_telephone.phone_number | Mobile telephone - Phone number | Map an existing attribute |
personal_email.email_id | Personal email - Email address | Map an existing attribute |
primary_residence.address_line1 | Primary residence - Address line 1 | Map an existing attribute |
primary_residence.address_line2 | Primary residence - Address line 2 | Map an existing attribute |
primary_residence.city | Primary residence - City | Map an existing attribute |
primary_residence.province_state | Exclude this column from mapping | Exclude column from mapping |
primary_residence.zip_postal_code | Primary residence - Postal code | Map an existing attribute |
Credit score | Exclude this column from mapping | Exclude column from mapping |
CREDITCARD_NUMBER | Exclude this column from mapping | Exclude column from mapping |
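As a cross-check on the three tables, the suggested mappings can be written down as plain data and compared. The sketch below is an illustration only: the dictionary layout and function names are hypothetical, and None is used here to mark an excluded column. It lists the target attributes that all three assets map to, which are the attributes every source can contribute to matching.

```python
# Suggested mappings from Tables 1-3; None marks a column excluded from mapping.
MAPPINGS = {
    "Campaign Prospects.csv": {
        "Source": None,
        "ID": None,
        "birth_date.value": "Birth date",
        "gender.value": "Gender",
        "legal_name.full_name": "Legal name - Full name",
        "mobile_telephone.phone_number": "Mobile telephone - Phone number",
        "personal_email.email_id": "Personal email - Email address",
        "Lead Quality": None,
    },
    "Customers.csv": {
        "Customer Number": None,
        "NAME": "Legal name - Full name",
        "COUNTRY": None,
        "STREET_ADDRESS": "Primary residence - Address line 1",
        "CITY": "Primary residence - City",
        "STATE": "Primary residence - State/Province value",
        "ZIP_CODE": "Primary residence - Postal code",
        "EMAIL_ADDRESS": "Personal email - Email address",
        "PHONE_NUMBER": "Home telephone - Phone number",
        "GENDER": "Gender",
        "CREDITCARD_NUMBER": None,
    },
    "Experiancc.csv": {
        "source": None,
        "Experian_ID": None,
        "birth_date.value": "Birth date",
        "gender.value": "Gender",
        "home_telephone.phone_number": "Home telephone - Phone number",
        "legal_name.given_name": "Legal name - Given name",
        "legal_name.last_name": "Legal name - Last name",
        "mobile_telephone.phone_number": "Mobile telephone - Phone number",
        "personal_email.email_id": "Personal email - Email address",
        "primary_residence.address_line1": "Primary residence - Address line 1",
        "primary_residence.address_line2": "Primary residence - Address line 2",
        "primary_residence.city": "Primary residence - City",
        "primary_residence.province_state": None,
        "primary_residence.zip_postal_code": "Primary residence - Postal code",
        "Credit score": None,
        "CREDITCARD_NUMBER": None,
    },
}


def mapped_targets(asset):
    """Return the set of target attributes that an asset maps to."""
    return {target for target in MAPPINGS[asset].values() if target is not None}


def shared_targets():
    """Return the target attributes that every asset maps to."""
    return set.intersection(*(mapped_targets(asset) for asset in MAPPINGS))


print(sorted(shared_targets()))
```

With the suggested mappings, only Gender and Personal email - Email address are mapped in all three assets; the other attributes are shared by at most two sources.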
Check your progress
The following image shows all of the mapped data assets. Now that you mapped the attributes for all three data assets, you can publish the data model and run matching.
Task 4: Publish the data model and run matching
Task 4a: Publish the data model and all data
The data model is created after you map all of the columns from your data assets to attributes. Your published data model is used by IBM Match 360 to resolve single entities from all of your data sources. Follow these steps to publish the data model.
-
After you map the last column of the last data set, you are prompted with options. Click Publish model. Alternatively, you can publish the model later by using the Publish model icon. This option displays after you finish mapping all of the columns in your three data assets. Publishing your model takes up to 1 minute. You receive a notification when your data model is successfully published.
-
Click the Publish data icon, then click Publish data to load the mapped data assets into the IBM Match 360 data model based on the mapping. The statuses of the assets change from Publishing data to Ready to match. The data takes 5-10 minutes to load into the service.
Check your progress
The following image shows the data assets listed as loaded into service indicating that the data model was published successfully. Next, you can run matching.
Task 4b: Complete matching setup and run matching
IBM Match 360 uses your published data model to consolidate all of the records of your data sources into single entities to create a data asset with more complete records. Follow these steps to run matching:
-
From the Master Data menu, select Matching setup.
-
Select the Person entity type to customize how records get matched.
-
Click the Matching settings tab, and then click Got it on the Attribute selection screen. Review the settings on the Attribute selection, Record selection, Algorithm tuning, and Attribute composition pages. For this tutorial, you can accept the default attributes that are already selected. Here you can choose attributes that can help distinguish records from each other like birth dates, email addresses, or phone numbers to help the matching algorithm.
-
Click the Match results tab, and then click Run matching. You receive a notification when the matching process is complete and the matching results are displayed.
Check your progress
The following image shows the results after you ran matching. Now that you published the data model and ran matching, you are ready to publish the matched data to a catalog.
Task 5: Publish the matched data to a catalog
Task 5a: Create a connection asset for IBM Match 360
To access the matched data in a project, you need to create a connection asset to IBM Match 360. The IBM Match 360 connection asset connects data that is matched with the IBM Match 360 service to a connected data asset. Follow these steps to create the connection asset.
-
From the Navigation menu, choose Projects > All projects.
-
Choose your Master Data Management sample project.
-
On the Assets tab, click New asset > Connect to a data source.
-
Select the IBM Match 360 connector, and click Next.
-
Type the connection asset name, Match 360 Connection.
-
Paste your Cloud Pak for Data host name in the Route host field.
-
Locate your IBM Match 360 instance ID. Open Cloud Pak for Data in a new browser tab. From the Navigation menu, choose Services > Instances.
-
Click your Match 360 service instance name.
-
In the browser URL, copy the text after 'mdm-'.
-
Return to the Create connection page, and paste the text into the IBM Match 360 instance ID field.
-
-
Return to the Match 360 service instance page to complete the API key field.
-
Click Instance API key > Generate API key.
-
Click Generate.
-
Click Copy.
-
Click Cancel to return to your Match 360 service instance page.
-
Return to the Create connection page, and paste the text into the API key field.
-
-
Paste your Cloud Pak for Data username in the Username field.
-
Click Create.
-
If asked to confirm you want to create the connection without setting location and sovereignty, click Create.
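The instance ID step above (copy the text after 'mdm-' in the browser URL) can be sketched as a small helper. This is a minimal sketch under assumptions: the example URL is hypothetical, and the helper assumes that the ID ends at the next "/", "?", or "#", which the tutorial does not state. Check the result against what you see in your own browser before you paste it.

```python
import re


def instance_id_from_url(url):
    """Return the text after 'mdm-' in the URL, or None if the marker is absent.

    Assumption: the ID runs until the next '/', '?', or '#'.
    """
    match = re.search(r"mdm-([^/?#]+)", url)
    return match.group(1) if match else None


# Hypothetical URL; the real host name and path layout will differ.
example = "https://cpd.example.com/services/mdm-1234567890123456/details?tab=overview"
print(instance_id_from_url(example))
```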
Check your progress
The following image shows the Match 360 connection asset. Now you can create a connected data asset from this connection.
Task 5b: Import connected data asset
Now use the IBM Match 360 connection to create a new connected data asset of your consolidated data from IBM Match 360. Follow these steps to create a connected data asset.
-
Click Import assets.
-
On the Import assets page, select Connected data.
-
Select Match 360 connection > records > person > person_entity.
-
Click Import.
Check your progress
The following image shows the connected data asset. Now that you created the connected data asset for your consolidated, matched data, you can publish that asset to a catalog.
Task 5c: Publish the connected data asset to your catalog
Follow these steps to publish the consolidated, matched data to that catalog.
-
In your Master Data Management project, verify that you are on the Assets tab.
-
Click the Overflow menu for your connected data asset person_entity, and choose Publish to catalog.
-
Select the Default Catalog (or your catalog name) from the list, and click Next.
-
Optionally, select the option to Go to the catalog after publishing it, and click Next.
-
Review the assets, and click Publish.
-
-
View and update the asset in the catalog:
-
If you are not in the catalog, then from the Navigation menu, choose Catalogs > All catalogs, and click the catalog that you published your connected data asset to.
-
Click the person_entity connected data asset.
-
Click the Edit name icon, type the name for your connected data asset, Golden Bank 360 View, and click Apply.
-
Click the Asset tab to preview the data.
-
Check your progress
The following image shows the data asset in the catalog.
As a data engineer for Golden Bank, you successfully used IBM Match 360 to set up, map, and model your data for a 360-degree view of the customer. You then published the complete 360-degree view of your matched data to your catalog for others in your organization to access.
Task 6: Preview your matched data
Now that you published your model or data changes to IBM Match 360, set your matching parameters, and ran matching, you can use the master data explorer to query your matched data. The master data explorer lets you find, view, compare, and edit matching results. Now, as a data analyst for Golden Bank, you must analyze, explore, and validate IBM Match 360 results to identify and select the best qualifying customers to target for marketing campaign offers. Follow these steps to explore and tune your matched data.
-
From the Navigation menu, choose Data > Master data.
-
From the Master Data menu, select Search.
-
In the search bar, type Branden Banks, and press Enter to add Branden Banks as a search criterion. For this search query, 2 entities appear for Branden Banks. The number 2 in the first column indicates that two source records make up one entity, and the number 1 indicates that one source record makes up the other entity.
-
Expand both of the entities. You can see that these separate entities for Branden Banks are likely just one person. To join these entities into a single entity, you can tune the matching algorithm.
Check your progress
The following image shows the search results in Master data explorer. Next, you can tune the matching algorithm and run matching again.
Task 7: Tune matching algorithm and run matching
After exploring the matched data, it is sometimes necessary to fine-tune the matching algorithm and run matching again to obtain better results.
-
From the Master Data menu, select Matching setup.
-
Select the Person entity type to customize how records get matched.
-
Click the Matching settings tab, and then click Got it on the Attribute selection screen.
-
Click the Algorithm tuning page.
-
Click the Match results tab, and then click Run matching. You receive a notification when the matching process is complete and the matching results are displayed.
-
Click the Master data explorer drop-down, and select Matching setup from the menu.
-
Click the Matching Settings tab, and then select the Algorithm tuning page.
-
Toggle the Clerical range is enabled field.
-
In the Clerical review threshold field, type 10. Scores below this threshold do not result in a match.
-
In the Autolink threshold field, type 20. Reducing the threshold to 20 results in more overall matches between records across your sources. Scores between the clerical review and autolink thresholds generate a clerical review task.
-
Click Apply threshold > Next > Run matching to run matching with your tuned algorithm.
-
Click the Match results tab. The results are displayed when matching is finished.
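The effect of the two thresholds set above can be sketched as a three-way decision on a comparison score. This is a minimal sketch of the rule as the tutorial describes it, not the IBM Match 360 implementation. The function name is hypothetical, and the handling of a score that is exactly equal to a threshold is an assumption.

```python
def classify_pair(score, clerical_threshold=10, autolink_threshold=20):
    """Classify a comparison score against the two tutorial thresholds.

    Assumption: a score equal to a threshold falls into the higher band.
    """
    if clerical_threshold > autolink_threshold:
        raise ValueError("clerical threshold must not exceed autolink threshold")
    if score >= autolink_threshold:
        return "autolink"
    if score >= clerical_threshold:
        return "clerical review"
    return "no match"


for score in (5, 15, 25):
    print(score, classify_pair(score))
```

Lowering the autolink threshold widens the top band, which is why more record pairs end up linked after this task.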
Check your progress
The following image shows the results of matching setup. Next, you can view the matched data again to see how the fine tuning changed the results.
Task 8: Gain insight on the matching results
You can return to the master data explorer to see how algorithm tuning changed your match results.
-
From the Master Data menu, select Search.
-
In the search bar, type Branden Banks, and press Enter to add Branden Banks as a search criterion. The number 3 associated with the entity that is displayed means that three records now make up the entity Branden Banks, whereas before the records were split across separate entities.
-
Expand the row in the first column of the entity to view the records. You can see the three records that were matched to this entity.
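To see why a lower autolink threshold can turn a 2-record entity and a 1-record entity into one 3-record entity, consider the toy model below. It links any pair of records whose score reaches the threshold and then counts the connected groups with a union-find structure. Everything in it is hypothetical: the record names, the pair scores, and the earlier threshold of 30 are invented for illustration, the clerical review range is ignored, and the sketch is not a description of how IBM Match 360 forms entities internally.

```python
from collections import Counter


def entity_sizes(records, pair_scores, autolink_threshold):
    """Group records linked by scores at or above the threshold; return group sizes."""
    parent = {record: record for record in records}

    def find(record):
        # Walk up to the group's root, shortening the path along the way.
        while parent[record] != record:
            parent[record] = parent[parent[record]]
            record = parent[record]
        return record

    for (left, right), score in pair_scores.items():
        if score >= autolink_threshold:
            parent[find(left)] = find(right)

    sizes = Counter(find(record) for record in records)
    return sorted(sizes.values(), reverse=True)


# Hypothetical records and comparison scores for one person.
records = ["campaign-1", "customer-1", "experian-1"]
pair_scores = {
    ("campaign-1", "customer-1"): 40,
    ("customer-1", "experian-1"): 25,
}

print(entity_sizes(records, pair_scores, autolink_threshold=30))  # [2, 1]
print(entity_sizes(records, pair_scores, autolink_threshold=20))  # [3]
```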
Check your progress
The following image shows the search results in Master data explorer. Next, you can gain insight by visualizing the matching results.
Task 9: Visualize records of entities
You can also visualize your tuned matching results as nodes to gain insights.
-
Click Show graph to see which records are contributing to queried entities.
-
Click any of the nodes that are connected to the person entity to view the details associated with it. From here, you can visualize and manually modify which records are associated with each entity from your query to make corrections as needed.
Check your progress
The following image shows the search results as a graph.
As a data analyst, you analyzed, explored, and validated IBM Match 360 results to identify and select the best qualifying customers to target for marketing campaign offers.
Cleanup (Optional)
If you would like to retake the tutorial in this use case, delete the following artifacts.
Artifact | How to delete |
---|---|
Mortgage Approval Catalog | Delete a catalog |
Master Data Management sample project | Delete a project |
Next steps
-
Try these tutorials:
-
View another Data fabric use case.