Master Data Management tutorial: Configure a 360-degree view

Take this tutorial to configure a 360-degree view of customers and explore these customers with the Master Data Management use case of the data fabric trial. The goal is to combine customer data with credit score data, resolve entities across the data, and create a consolidated 360-degree view of customers. You can then identify the highest-value customers to target in campaigns and determine the best rates to offer them.

Tech preview: This is a technology preview and is not yet supported for use in production environments.

Quick start: If you did not already create the sample project for this tutorial, access the Master Data Management sample project in the Resource hub.

The story for the tutorial is that Golden Bank wants to run a campaign to offer lower mortgage rates. As a data engineer, you must use IBM Match 360 to set up, map, and model your data for a 360-degree view of the customer.

The following animated image provides a quick preview of what you’ll accomplish by the end of this tutorial. You will set up and add assets to master data, map the data asset attributes, publish the data model and run matching, publish the matched data to a catalog, and then explore and visualize the matched data. Click the image to view a larger image.

Animated image

Preview the tutorial

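In this tutorial, you can complete the following tasks:

  • Set up the prerequisites
  • Task 1: Create a catalog for the matched data
  • Task 2: Set up and add assets to master data
  • Task 3: Map the data asset attributes
  • Task 4: Publish the data model and run matching
  • Task 5: Publish the matched data to a catalog
  • Task 6: Preview your matched data
  • Task 7: Tune matching algorithm and run matching
  • Task 8: Gain insight on the matching results
  • Task 9: Visualize records of entities
  • Cleanup (Optional)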

Watch this video to preview the steps in this tutorial. There might be slight differences in the user interface shown in the video. The video is intended to be a companion to the written tutorial.

This video provides a visual method to learn the concepts and tasks in this documentation.





Tips for completing this tutorial
Here are some tips for successfully completing this tutorial.

Use the video picture-in-picture

Tip: Start the video. As you scroll through the tutorial, the video moves to picture-in-picture mode so that you can follow it while you complete the tasks. Close the video table of contents for the best experience with picture-in-picture. Click the timestamps for each task to follow along.

The following animated image shows how to use the video picture-in-picture and table of contents features:

How to use picture-in-picture and chapters

Get help in the community

If you need help with this tutorial, you can ask a question or find an answer in the Cloud Pak for Data Community discussion forum.

Set up your browser windows

For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.

Side-by-side tutorial and UI

Tip: If you encounter a guided tour while completing this tutorial in the user interface, click Maybe later.



Set up the prerequisites

Sign up for Cloud Pak for Data as a Service

You must sign up for Cloud Pak for Data as a Service and provision the necessary services for the Master Data Management use case.

  • If you have an existing Cloud Pak for Data as a Service account, then you can get started with this tutorial. If you have a Lite plan account, only one user per account can run this tutorial.
  • If you don't have a Cloud Pak for Data as a Service account yet, then sign up for a data fabric trial.

Watch the following video to learn about data fabric in Cloud Pak for Data.


Verify the necessary provisioned services

To preview this task, watch the video beginning at 00:50.

Important: The Match 360 service is available in the Dallas region only. If necessary, switch to the Dallas region before continuing.

Follow these steps to verify or provision the necessary services.

  1. In Cloud Pak for Data, verify that you are in the Dallas region. If not, click the region drop-down list, and then select Dallas.

    Change region

  2. From the Navigation menu, choose Services > Service instances.

  3. Use the Product drop-down box to determine whether an IBM Match 360 with Watson service instance exists.

  4. If you need to create an IBM Match 360 service instance, click Add service.

    1. Select IBM Match 360 with Watson.

    2. For the region, select Dallas.

    3. Select the Lite plan.

    4. Optional: Type a name for your IBM Match 360 with Watson service instance.

    5. Click Create.

  5. Repeat these steps to verify or provision the following services:

    • IBM Knowledge Catalog
    • Cloud Object Storage

Check your progress

The following image shows the provisioned service instances:

Provisioned services

Create the sample project

To preview this task, watch the video beginning at 01:29.

Follow these steps to create the sample project for this tutorial:

  1. Access the Master Data Management sample project in the Resource hub.

  2. Click Create project.

  3. If prompted to associate the project to a Cloud Object Storage instance, select a Cloud Object Storage instance from the list.

  4. Click Create.

  5. Wait for the project import to complete, and then click View new project to verify that the project and assets were created successfully.

    Note: If this is your first time accessing a project, you see a guided tour asking if you want a tour of projects. For now, click Maybe later.
  6. Click the Assets tab to view the project's assets.

Note: You might see a guided tour showing the tutorials that are included with this use case. The links in the guided tour will open these tutorial instructions.

Check your progress

The following image shows the sample project. You are now ready to start the tutorial.

Sample project




Task 1: Create a catalog for the matched data

To preview this task, watch the video beginning at 02:08.

You need a catalog for the master data and for access to the matched data. With the IBM Knowledge Catalog Lite plan, you can create two catalogs. If you already have two catalogs, use one of your existing catalogs, and verify that you have the Editor role in that catalog.

Option 1: Use the default catalog

Follow these steps to verify that you have the appropriate access to use the default catalog:

  1. From the Navigation menu, choose Catalogs > View all catalogs.

  2. Open the catalog that you wish to use for this tutorial.

  3. Click the Access control tab.

  4. Verify that your account has the Editor role. If your access is Viewer, then contact your administrator to request Editor access.

Option 2: Create a new catalog

Otherwise, follow these steps to create the catalog:

  1. On the Catalogs page, click Create Catalog.

  2. For the Name, copy and paste the catalog name exactly as shown with no leading or trailing spaces:

    Mortgage Approval Catalog
    
  3. Select Enforce data protection rules, confirm the selection, and accept the defaults for the other fields.

  4. Click Create to use the default settings. Your new catalog opens.

Check your progress

The following image shows your catalog. Now that you have a catalog, you can set up master data and add the data assets.

Catalog




Task 2: Set up and add assets to master data

To preview this task, watch the video beginning at 02:48.

You must add to master data all of the data assets that you want to consolidate. The data can come from sources such as your computer's hard disk or a data asset in a project or catalog.

  1. From the Navigation menu, choose Data > Master data.

  2. If you need to set up master data, click Set up master data and follow the steps to associate the required project and services with master data. Otherwise, click Go to configuration and continue to the next step.

    1. Select your Cloud Object Storage service, then click Next.

    2. Select your Master Data Management project, accept the default name for the configuration asset, and then click Next.

    3. Select your existing catalog, check the Enforce data protection rules option, and then click Next.

    4. Accept the default workflow configuration name, and click Finish.

    5. Click Continue with configuration to complete the setup.

  3. Click Start with data assets.

  4. Click Add data.

  5. Insert all three of the data assets in the project:

    1. Select the Project tab.

    2. Select all three CSV files, Campaign Prospects.csv, Customers.csv, and Experiancc.csv, and then click the Insert Data icon.

    3. Click Add data.

  6. Assign the Person record type to your data assets. Record Type provides information about the type of data that an asset contains. Each asset needs to have an assigned record type so that IBM Match 360 can find the part of the model that best fits the data.

    1. Select the checkboxes for the three assets, Campaign Prospects.csv, Customers.csv, and Experiancc.csv, and click Set asset properties.

    2. In the Select data asset type drop-down list, select the Person data asset type.

    3. Click Save.

Check your progress

The following image shows the assets added to master data. Now that you set up master data and added the three data assets, you are ready to begin mapping the data asset attributes.

Assets added to master data




Task 3: Map the data asset attributes

To preview this task, watch the video beginning at 03:22.

For IBM Match 360 to match all of your data, you must specify which columns of each data set are mapped to specific attributes that are understood by IBM Match 360. Follow these steps to map the data asset attributes.

  1. Click the Mapping tab, and then click Got it to begin mapping the columns of your data assets to the appropriate attributes.

  2. In the Asset list panel, select Campaign Prospects.csv.

  3. In the side panel, click Profile data. Profiling your data is a prerequisite to automatically mapping columns of your data to attributes of the IBM Match 360 data model. Profiling takes 2-5 minutes. The message Profiling complete is displayed when the data profiling is finished.

  4. When profiling is complete, click Automap asset to map columns of your data automatically.

  5. Refer to Table 1. Campaign Prospects.csv suggested mapping, and manually map every column that has the status Not mapped or that was not mapped as the table suggests. To map a column to an attribute, follow Example 1: Map an existing attribute. To exclude a column, follow Example 2: Exclude columns from mapping.

  6. When all of the columns in the asset have a status of either Mapped, Automapped, or Excluded, you see the option to Map next data asset.

  7. Repeat Steps 3-5 for the Customers.csv and Experiancc.csv assets. Map the columns of these data assets to the IBM Match 360 data model as suggested in Table 2. Customers.csv suggested mapping and Table 3. Experiancc.csv suggested mapping. As before, you can either map a column to an existing attribute (Example 1) or exclude the column from mapping (Example 2).
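
The completion rule in step 6 can be sketched in a few lines of Python. This is only an illustration of the rule, not how IBM Match 360 is implemented: the status labels come from this tutorial, while the `unmapped_columns` and `ready_for_next_asset` helpers and the sample statuses are hypothetical.

```python
# Status labels are taken from the tutorial; the helpers and sample data are hypothetical.
DONE_STATUSES = {"Mapped", "Automapped", "Excluded"}

def unmapped_columns(column_status):
    """Return the columns that still need attention, in input order."""
    return [col for col, status in column_status.items() if status not in DONE_STATUSES]

def ready_for_next_asset(column_status):
    """True when every column is Mapped, Automapped, or Excluded."""
    return not unmapped_columns(column_status)

# Made-up snapshot of an asset part-way through mapping.
campaign_prospects = {
    "Source": "Excluded",
    "ID": "Not mapped",
    "birth_date.value": "Automapped",
    "legal_name.full_name": "Mapped",
}

print(unmapped_columns(campaign_prospects))      # columns that still block the next step
print(ready_for_next_asset(campaign_prospects))  # False until ID is mapped or excluded
```

In this made-up snapshot, only the ID column still needs to be mapped or excluded before the option to map the next data asset would appear.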

Example 1: Map an existing attribute

To preview this task, watch the video beginning at 04:07.

This example explains how to map the legal_name.full_name column in the Campaign Prospects.csv data asset to the existing attribute legal_name.full_name - Legal name - Full name. IBM Match 360 provides some attributes that are commonly associated with customer records that you can choose to map the columns in your data set to.

  1. Click the column legal_name.full_name.

  2. From the Mapping targets panel, in the search field, type Legal name - Full name.

  3. Select Legal name - Full name from the list. The column displays as Mapped and Mapped to: Legal name - Full name.

You can repeat these steps to map other columns of your data assets to existing attributes, either attributes that you previously created or attributes that are provided by IBM Match 360.

Example 2: Exclude columns from mapping

To preview this task, watch the video beginning at 05:15.

This example explains how to exclude a column from the data asset mapping. You can exclude columns from the mapping if they are not useful to IBM Match 360 during the matching process or if you do not want to include them in your matched data output.

  1. Click the column that is named Source.

  2. Select the Exclude column checkbox. The column displays as Excluded.

You can repeat these steps to exclude other columns of your data assets.

Table 1. Campaign Prospects.csv suggested mapping

Column | Target | Method
Source | Exclude this column from mapping | Exclude column from mapping
ID | Exclude this column from mapping | Exclude column from mapping
birth_date.value | Birth date | Map an existing attribute
gender.value | Gender | Map an existing attribute
legal_name.full_name | Legal name - Full name | Map an existing attribute
mobile_telephone.phone_number | Mobile telephone - Phone number | Map an existing attribute
personal_email.email_id | Personal email - Email address | Map an existing attribute
Lead Quality | Exclude this column from mapping | Exclude column from mapping

Table 2. Customers.csv suggested mapping

Column | Target | Method
Customer Number | Exclude this column from mapping | Exclude column from mapping
NAME | Legal name - Full name | Map an existing attribute
COUNTRY | Exclude this column from mapping | Exclude column from mapping
STREET_ADDRESS | Primary residence - Address line 1 | Map an existing attribute
CITY | Primary residence - City | Map an existing attribute
STATE | Primary residence - State/Province value | Map an existing attribute
ZIP_CODE | Primary residence - Postal code | Map an existing attribute
EMAIL_ADDRESS | Personal email - Email address | Map an existing attribute
PHONE_NUMBER | Home telephone - Phone number | Map an existing attribute
GENDER | Gender | Map an existing attribute
CREDITCARD_NUMBER | Exclude this column from mapping | Exclude column from mapping

Table 3. Experiancc.csv suggested mapping

Column | Target | Method
source | Exclude this column from mapping | Exclude column from mapping
Experian_ID | Exclude this column from mapping | Exclude column from mapping
birth_date.value | Birth date | Map an existing attribute
gender.value | Gender | Map an existing attribute
home_telephone.phone_number | Home telephone - Phone number | Map an existing attribute
legal_name.given_name | Legal name - Given name | Map an existing attribute
legal_name.last_name | Legal name - Last name | Map an existing attribute
mobile_telephone.phone_number | Mobile telephone - Phone number | Map an existing attribute
personal_email.email_id | Personal email - Email address | Map an existing attribute
primary_residence.address_line1 | Primary residence - Address line 1 | Map an existing attribute
primary_residence.address_line2 | Primary residence - Address line 2 | Map an existing attribute
primary_residence.city | Primary residence - City | Map an existing attribute
primary_residence.province_state | Exclude this column from mapping | Exclude column from mapping
primary_residence.zip_postal_code | Primary residence - Postal code | Map an existing attribute
Credit score | Exclude this column from mapping | Exclude column from mapping
CREDITCARD_NUMBER | Exclude this column from mapping | Exclude column from mapping
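
To make the effect of a mapping table concrete, the following Python sketch expresses the Customers.csv suggestions from Table 2 as a dictionary and applies them to one row. It is a conceptual illustration only: the `apply_mapping` helper and the sample row are hypothetical, and IBM Match 360 performs the real mapping through the user interface steps described above.

```python
import csv
import io

# Table 2 (Customers.csv) expressed as data; None marks a column excluded from mapping.
CUSTOMERS_MAPPING = {
    "Customer Number": None,
    "NAME": "Legal name - Full name",
    "COUNTRY": None,
    "STREET_ADDRESS": "Primary residence - Address line 1",
    "CITY": "Primary residence - City",
    "STATE": "Primary residence - State/Province value",
    "ZIP_CODE": "Primary residence - Postal code",
    "EMAIL_ADDRESS": "Personal email - Email address",
    "PHONE_NUMBER": "Home telephone - Phone number",
    "GENDER": "Gender",
    "CREDITCARD_NUMBER": None,
}

def apply_mapping(row, mapping):
    """Rename mapped columns to their target attribute and drop excluded or unknown ones."""
    return {mapping[col]: value for col, value in row.items() if mapping.get(col)}

# Made-up sample row; it is not taken from the tutorial data.
sample = io.StringIO(
    "Customer Number,NAME,COUNTRY,CITY,GENDER\n"
    "1001,Jane Example,US,Austin,F\n"
)
for row in csv.DictReader(sample):
    print(apply_mapping(row, CUSTOMERS_MAPPING))
```

Excluded columns such as Customer Number and COUNTRY disappear from the output, while mapped columns are carried over under the attribute names that the matching step works with.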

Check your progress

The following image shows all of the mapped data assets. Now that you mapped the attributes for all three data assets, you can publish the data model and run matching.

Master data with all assets mapped




Task 4: Publish the data model and run matching

Task 4a: Publish the data model and all data

To preview this task, watch the video beginning at 05:51.

The data model is created after you map all of the columns from your data assets to attributes. Your published data model is used by IBM Match 360 to resolve single entities from all of your data sources. Follow these steps to publish the data model.

  1. After you map the last column of the last data set, you are prompted with options. Click Publish model. Alternatively, you can publish the model later by clicking the Publish model icon, which is displayed after you finish mapping all of the columns in your three data assets. Publishing your model takes up to 1 minute. You receive a notification when your data model is successfully published.

  2. Click the Publish data icon, and then click Publish data to load the mapped data assets into the IBM Match 360 data model based on the mapping. The statuses of the assets change from Publishing data to Ready to match. The data takes 5-10 minutes to load into the service.

Check your progress

The following image shows the data assets listed as loaded into service indicating that the data model was published successfully. Next, you can run matching.

Published data model

Task 4b: Complete matching setup and run matching

To preview this task, watch the video beginning at 06:23.

IBM Match 360 uses your published data model to consolidate all of the records of your data sources into single entities to create a data asset with more complete records. Follow these steps to run matching:

  1. From the Master Data menu, select Matching setup.

  2. Select the Person entity type to customize how records get matched.

  3. Click the Matching settings tab, and then click Got it on the Attribute selection screen. Review the settings on the Attribute selection, Record selection, Algorithm tuning, and Attribute composition pages. On these pages, you can choose attributes that help the matching algorithm distinguish records from each other, such as birth dates, email addresses, or phone numbers. For this tutorial, you can accept the default attributes that are already selected.

  4. Click the Match results tab, and then click Run matching. You receive a notification when the matching process is complete and the matching results are displayed.
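
To see why distinguishing attributes such as birth dates, email addresses, and phone numbers help matching, consider the following toy comparison in Python. It is a deliberately simplified sketch and not the IBM Match 360 matching algorithm: the weights, the `normalize` and `comparison_score` helpers, and the sample records are all made up for illustration.

```python
def normalize(value):
    """Lowercase and strip non-alphanumerics so formatting differences do not matter."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())

# Illustrative weights only; they are not IBM Match 360 settings.
WEIGHTS = {"birth_date": 3, "email": 4, "phone": 3, "name": 2}

def comparison_score(a, b):
    """Sum the weights of attributes that are present in both records and agree."""
    score = 0
    for attr, weight in WEIGHTS.items():
        if a.get(attr) and b.get(attr) and normalize(a[attr]) == normalize(b[attr]):
            score += weight
    return score

# Made-up records; only the name appears in the tutorial.
rec1 = {"name": "Branden Banks", "email": "B.Banks@example.com", "phone": "555-0100"}
rec2 = {"name": "branden banks", "email": "b.banks@example.com", "birth_date": "1980-01-01"}
rec3 = {"name": "Branden Banks", "phone": "555-0199"}

print(comparison_score(rec1, rec2))  # name and email agree
print(comparison_score(rec1, rec3))  # only the name agrees
```

Records that agree on a selective attribute such as an email address score higher than records that share only a name, which is the intuition behind choosing distinguishing attributes.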

Check your progress

The following image shows the results after you ran matching. Now that you published the data model and ran matching, you are ready to publish the matched data to a catalog.

Match results




Task 5: Publish the matched data to a catalog

To preview this task, watch the video beginning at 06:54.

Task 5a: Create a connection asset for IBM Match 360

To access the matched data in a project, you need to create a connection asset to IBM Match 360. The IBM Match 360 connection asset connects data that is matched with the IBM Match 360 service to a connected data asset. Follow these steps to create the connection asset.

  1. From the Navigation menu, choose Projects > View all projects.

  2. Choose your Master Data Management sample project.

  3. On the Assets tab, click New asset > Connect to a data source.

  4. Select the IBM Match 360 connector, and click Next.

  5. Type the connection asset name, Match 360 Connection.

  6. Retrieve the CRN of your IBM Match 360 with Watson service instance:

    1. From the IBM Cloud console resource list page, click Analytics to expand the list of your service instances.

    2. In the Product column, click IBM Match 360 with Watson.

    3. In the details panel that opens, click the Copy to clipboard icon for the CRN of your selected IBM Match 360 with Watson service.

  7. In the Connection details, paste the CRN that corresponds with your IBM Match 360 with Watson service instance.

  8. Create an IBM Match 360 API key:

    1. From the IBM Cloud console, click Manage > Access (IAM).

    2. Click the API keys page.

    3. Click Create an IBM Cloud API key. If you have any existing API keys, the button may be labelled Create.

    4. Type a name and description.

    5. Click Create.

    6. Copy the API key.

    7. Download the API key for future use.

  9. Complete the API key field with the API key that you created.

  10. Click Create.

  11. If asked to confirm you want to create the connection without setting location and sovereignty, click Create.
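
Before you paste a CRN into the connection details, you might want to confirm that it is well formed and that it points to the Dallas region. The following Python sketch assumes the general IBM Cloud CRN layout of `crn` followed by nine colon-separated segments, and assumes that Dallas corresponds to the location value `us-south`. The `Crn` type, the `parse_crn` helper, and the sample CRN (including its service name and IDs) are hypothetical placeholders.

```python
from typing import NamedTuple

class Crn(NamedTuple):
    """Segments of a CRN, assuming the general IBM Cloud CRN layout."""
    version: str
    cname: str
    ctype: str
    service_name: str
    location: str
    scope: str
    service_instance: str
    resource_type: str
    resource: str

def parse_crn(text):
    """Split a CRN into its colon-separated segments, rejecting malformed input."""
    parts = text.strip().split(":")
    if len(parts) != 10 or parts[0] != "crn":
        raise ValueError("expected 'crn' followed by nine colon-separated segments")
    return Crn(*parts[1:])

# Made-up CRN with a placeholder service name, account ID, and instance ID.
pasted = " crn:v1:bluemix:public:example-service:us-south:a/0123456789abcdef:11111111-2222-3333-4444-555555555555:: "
crn = parse_crn(pasted)
print(crn.location)                # region segment of the CRN
print(crn.location == "us-south")  # True when the instance is in Dallas (assumed mapping)
```

Stripping whitespace before parsing guards against stray spaces picked up when copying from the console. Never print or log the API key itself.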

Check your progress

The following image shows the Match 360 connection asset. Now you can create a connected data asset from this connection.

Connection asset

Task 5b: Import connected data asset

To preview this task, watch the video beginning at 08:32.

Now use the IBM Match 360 connection to create a new connected data asset of your consolidated data from IBM Match 360. Follow these steps to create a connected data asset.

  1. Click Import assets.

  2. On the Import assets page, select Connected data.

  3. Select Match 360 connection > records > person > person_entity.

  4. Click Import.

Check your progress

The following image shows the connected data asset. Now that you created the connected data asset for your consolidated, matched data, you can publish that asset to a catalog.

Connected data asset

Task 5c: Publish the connected data asset to your catalog

To preview this task, watch the video beginning at 08:55.

Follow these steps to publish the consolidated, matched data to your catalog.

  1. In your Master Data Management project, verify that you are on the Assets tab.

  2. Click the Overflow menu for your connected data asset person_entity, and choose Publish to catalog.

    1. Select the Mortgage Approval Catalog (or your catalog name) from the list, and click Next.

    2. Optionally, select the option to Go to the catalog after publishing it, and click Next.

    3. Review the assets, and click Publish.

  3. View and update the asset in the catalog:

    1. If you are not in the catalog, then from the Navigation menu, choose Catalogs > View all catalogs, and click the catalog that you published your connected data asset to.

    2. Click the person_entity connected data asset.

    3. Click the Edit name icon, type the name for your connected data asset, Golden Bank 360 View, and click Apply.

    4. Click the Asset tab to preview the data.

Check your progress

The following image shows the data asset in the catalog.

Asset in catalog

As a data engineer for Golden Bank, you successfully used IBM Match 360 to set up, map, and model your data for a 360-degree view of the customer. You then published the complete 360-degree view of your matched data to your catalog for others in your organization to access.




Task 6: Preview your matched data

To preview this task, watch the video beginning at 09:28.

Now that you published your model and data changes to IBM Match 360, set your matching parameters, and ran matching, you can use the master data explorer to query your matched data. The master data explorer lets you find, view, compare, and edit matching results. Now, as a data analyst for Golden Bank, you must analyze, explore, and validate IBM Match 360 results to identify and select the best qualifying customers to target for marketing campaign offers. Follow these steps to explore and tune your matched data.

  1. From the Navigation menu, choose Data > Master data.

  2. From the Master Data menu, select Search.

  3. In the search bar, type Branden Banks, and press Enter to add Branden Banks as a search criterion. For this search query, two entities appear for Branden Banks. The number 2 in the first column indicates that two source records make up one entity, and the number 1 indicates that one source record makes up the other entity.

  4. Expand both of the entities. You can see that these separate entities for Branden Banks are likely just one person. To join these entities into a single entity, you can tune the matching algorithm.

Check your progress

The following image shows the search results in Master data explorer. Next, you can tune the matching algorithm and run matching again.

Explore master data




Task 7: Tune matching algorithm and run matching

To preview this task, watch the video beginning at 10:09.

After exploring the matched data, it is sometimes necessary to fine-tune the matching algorithm and run matching again to obtain better results.

  1. From the Master Data menu, select Matching setup.

  2. Select the Person entity type to customize how records get matched.

  3. Click the Matching settings tab, and then click Got it on the Attribute selection screen.

  4. Click the Algorithm tuning page.

  5. Click the Match results tab, and then click Run matching. You receive a notification when the matching process is complete and the matching results are displayed.

  6. Click the Master data explorer drop-down, and select Matching setup from the menu.

  7. Click the Matching settings tab, and then select the Algorithm tuning page.

  8. Toggle the Clerical range is enabled field.

  9. In the Clerical review threshold field, type 10. Scores below this threshold do not result in a match.

  10. In the Autolink threshold field, type 20. Reducing the threshold to 20 results in more overall matches between records across your sources. Scores between the clerical review and autolink thresholds generate a clerical review task.

  11. Click Apply threshold > Next > Run matching to run matching with your tuned algorithm.

  12. Click the Match results tab. The results are displayed when matching is finished.
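
The way the two thresholds partition comparison scores can be sketched as follows. The threshold values 10 and 20 come from the steps above, but the `classify` function is a hypothetical illustration, and treating a score exactly equal to a threshold as belonging to the higher bucket is an assumption rather than documented IBM Match 360 behavior.

```python
CLERICAL_REVIEW_THRESHOLD = 10  # value typed in the Clerical review threshold field
AUTOLINK_THRESHOLD = 20         # value typed in the Autolink threshold field

def classify(score, clerical=CLERICAL_REVIEW_THRESHOLD, autolink=AUTOLINK_THRESHOLD):
    """Bucket a comparison score; inclusive boundaries are an assumption."""
    if score >= autolink:
        return "autolink"         # records are matched automatically
    if score >= clerical:
        return "clerical review"  # a clerical review task is generated
    return "no match"             # score is below the clerical review threshold

for score in (5, 10, 15, 20, 42):
    print(score, classify(score))
```

Lowering the autolink threshold widens the top bucket, which is why more records are linked automatically after tuning.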

Check your progress

The following image shows the results of matching setup. Next, you can view the matched data again to see how the fine tuning changed the results.

Matching setup tab




Task 8: Gain insight on the matching results

To preview this task, watch the video beginning at 10:45.

You can return to the master data explorer to see how algorithm tuning changed your match results.

  1. From the Master Data menu, select Search.

  2. In the search bar, type Branden Banks, and press Enter to add Branden Banks as a search criterion. The number 3 that is associated with the displayed entity means that three records now make up the entity Branden Banks, whereas before the records were split across separate entities.

  3. Expand the row in the first column of the entity to view the records. You can see the three records that were matched to this entity.
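
The change from two entities to one can be reproduced conceptually with a small union-find sketch in Python. This is not the IBM Match 360 implementation: the record IDs, the pairwise scores, and the earlier threshold of 50 are made-up values chosen only to show how lowering the autolink threshold to 20 lets a third record join the entity.

```python
def build_entities(record_ids, pair_scores, autolink):
    """Group records whose pairwise score reaches the autolink threshold (union-find)."""
    parent = {r: r for r in record_ids}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving keeps the trees shallow
            r = parent[r]
        return r

    for (a, b), score in pair_scores.items():
        if score >= autolink:
            parent[find(a)] = find(b)

    groups = {}
    for r in record_ids:
        groups.setdefault(find(r), []).append(r)
    return sorted(sorted(g) for g in groups.values())

# Made-up record IDs and scores for illustration only.
records = ["campaign-1", "customer-7", "experian-3"]
scores = {
    ("campaign-1", "customer-7"): 55,
    ("customer-7", "experian-3"): 25,
    ("campaign-1", "experian-3"): 18,
}

print(build_entities(records, scores, autolink=50))  # two entities: two records plus one record
print(build_entities(records, scores, autolink=20))  # one entity with all three records
```

Note that the third record joins through its link to only one of the other two records. Linking is transitive in this sketch, which is one reason to review results after lowering a threshold.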

Check your progress

The following image shows the search results in Master data explorer. Next, you can gain insight by visualizing the matching results.

Explore master data




Task 9: Visualize records of entities

To preview this task, watch the video beginning at 11:11.

You can also visualize your tuned matching results as nodes to gain insights.

  1. Click Show graph to see which records are contributing to queried entities.

  2. Click any of the nodes that are connected to the person entity to view the details associated with it. From here, you can visualize and manually modify which records are associated with each entity from your query to make corrections as needed.
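
Conceptually, the graph view shows entities and records as nodes, with a link from each entity to every record that contributes to it. The following Python sketch models that structure and a manual correction. It is a hypothetical data model for illustration: the entity and record IDs and the `edges` and `relink` helpers are made up and do not reflect how IBM Match 360 stores its data.

```python
# Entity -> contributing records; all IDs are made up for illustration.
graph = {
    "entity-A": {"campaign-1", "customer-7"},
    "entity-B": {"experian-3"},
}

def edges(graph):
    """List (entity, record) pairs, the links that a graph view would draw."""
    return sorted((entity, record) for entity, records in graph.items() for record in records)

def relink(graph, record, target):
    """Return a new graph with the record moved to the target entity.

    Entities that are left without records are dropped.
    """
    updated = {entity: set(records) - {record} for entity, records in graph.items()}
    updated.setdefault(target, set()).add(record)
    return {entity: records for entity, records in updated.items() if records}

print(edges(graph))
corrected = relink(graph, "experian-3", "entity-A")
print(edges(corrected))
```

Because `relink` returns a new dictionary instead of modifying its input, the original grouping stays available for comparison with the corrected one.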

Check your progress

The following image shows the search results as a graph.

Explore graph



As a data analyst, you analyzed, explored, and validated IBM Match 360 results to identify and select the best qualifying customers to target for marketing campaign offers.

Cleanup (Optional)

If you would like to retake the tutorials in the Master Data Management use case, delete the following artifacts.

Artifact | How to delete
Mortgage Approval Catalog | Delete a catalog
Master Data Management sample project | Delete a project
