Analyzing and comparing master data
Data stewards can analyze and compare entities and records in the Analysis workspace. You can open entities in a split view to directly compare their attribute values and, if necessary, update the linkages between records and entities to correct entity membership. You can also see key details such as similarity scores between records.
Data stewards need clarity about how entities are formed and how member records are selected. While investigating entity composition, data stewards often ask the following questions:
- Why is this record part of this entity?
- Why is this other record excluded from this entity?
- How similar are an entity's member records?
- How did this record become linked to this entity?
The Analysis tab in the master data workspace provides data stewards with all the tools necessary to investigate the composition of an entity so that they can answer these questions and ensure that master data entities are accurate.
Tip: You can control how entities and records are displayed in the Analysis tab by customizing your workspace's view settings. For information about configuring workspace view settings, see Defining how records, entities, and attributes are displayed.
Analyzing entities in the Analysis tab
Before you can begin analyzing entities and their member records, you must add one or more entities to the analysis queue.
To analyze entities in the analysis queue:
- From the Master data navigation menu, click Search to open the master data search page.
- Search for the entity that you want to work with. For more information, see Searching for master data. The results are shown in a table view by default.
- Select one or more entities from the search results and click Add to analysis.
  You can add as many entities as necessary to the analysis queue.
- Click the Analysis tab.
  Confirm that the entity or entities that you selected are added to the Analysis queue panel. There might be additional entities in the queue if you have previously added them. Remove an entity from the queue at any time by clicking the Remove (X) icon.
  The format of the display name of each item in the analysis queue is determined by the workspace's view settings. For information about configuring workspace view settings, see Defining how records, entities, and attributes are displayed.
- Select an entity from the analysis queue to view its details, including its attribute values and member records.
  The list of member records always includes one center record. Center records are the basis of the entity and cannot be unlinked or moved to a different entity. The oldest record in an entity is designated as the center record. Center records are indicated by a dot beside their row in the member records table.
- Select a member record from the table to see its details, including its attribute values and an analysis of its similarity to other member records.
  The similarity analysis shows similarity scores, as percentages, between the selected record and the other member records in the entity. This page also shows how each member record was linked to the entity.
- To see a more detailed breakdown of the record-to-record similarity score, click the View similarity score details icon. The similarity score details page provides key information about how the similarity score percentage was calculated, including which matching rules were considered and how strongly each rule was weighted in the matching decision.
  The default similarity view is a graph that indicates positive (in green) and negative (in red) comparisons between the two records for each applicable matching rule. Vertical dotted lines on the graph indicate the minimum weight, maximum weight, and overall score.
  To switch from the graph view to a table view, click the table view icon. The table view shows key information about how the similarity score percentage was calculated, including which matching rules were considered and how strongly each rule was weighted in the matching decision.
  To return to the graph view, click the graph view icon.
- To view the history of manual join or unlink actions completed for the selected entity, click Manual linkage history. Click View details to see information about who completed each manual linkage action and when it was done.
- To copy the entity IDs of the entities in your analysis queue so that you can open them in the main master data workspace Search tab:
  - Ensure that your analysis queue has the entities you want to copy and view elsewhere.
  - Click the Copy to clipboard icon.
  - Open the Search tab and paste the entity IDs into the search bar. Press Enter.
  - Explore and work with the entities as needed.
Understanding similarity scores
Similarity scores are calculated by using the IBM Master Data Management matching algorithm to determine how records should be matched into entities.
The IBM Master Data Management matching algorithm compares records by using various comparison functions for attributes such as names, addresses, birth dates, gender, and other commonly used identifying details. Every comparison function results in an absolute comparison score and a distance measure, which is a number between 0 and 10. Similarity is the inverse of the distance measure: a lower distance score corresponds to higher similarity and, therefore, a higher match likelihood. For more information about how IBM Master Data Management matches records, see Matching algorithms.
IBM Master Data Management matching operations match records when the records reach a certain threshold of similarity. Similarity scores are not perfect, so some records can be erroneously matched, which results in inaccurate entities.
Additionally, some records might be linked to an entity manually by a data steward. In these cases, the similarity scores might not have been considered or might have been overridden.
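The relationship between distance, similarity, and the matching threshold described in this section can be sketched in a few lines of Python. This is an illustration only, not the IBM Master Data Management algorithm: the linear inversion, the function names, and the threshold value of 0.8 are all assumptions made for the example. Only the 0 to 10 distance range comes from the text above.

```python
MAX_DISTANCE = 10  # the text states that distance measures fall between 0 and 10


def similarity_from_distance(distance: float) -> float:
    """Map a distance in [0, 10] to a similarity in [0.0, 1.0].

    Assumption: a simple linear inversion, chosen for illustration only.
    """
    if not 0 <= distance <= MAX_DISTANCE:
        raise ValueError("distance must be between 0 and 10")
    return (MAX_DISTANCE - distance) / MAX_DISTANCE


def is_match(distance: float, threshold: float = 0.8) -> bool:
    """Return True when the similarity reaches the (hypothetical) threshold."""
    return similarity_from_distance(distance) >= threshold
```

With these assumptions, a distance of 1 counts as a match and a distance of 5 does not. A manually linked record bypasses this kind of check entirely, which is why its similarity score can fall below the threshold.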
Comparing entities side-by-side
You can open two entities in a side-by-side split view to compare their attribute values and member records. A split view shows only two entities at a time.
To open entities in a split view:
- Add both entities to the analysis queue, then click the Analysis tab of the master data workspace.
- In the Analysis queue panel, click the first entity you want to compare. The entity details panel opens.
- Hover over the second entity for this comparison, then click the split view icon beside the entity in the analysis queue.
- Compare the attribute values and member records of both entities to identify differences and similarities.
Comparing record similarity
When analyzing the composition of an entity, you can compare two records to view details about their similarity scores.
See the detailed similarity scores for each member record to understand why it was matched into its entity, either automatically by the IBM Master Data Management matching algorithm or manually by a data steward.
You can also compare two records from different entities to determine if the entities should be joined or if a record was improperly matched.
You can compare the similarity of only two records at a time.
To compare the similarity of two records that are members of the same entity:
- Add the entity to the analysis queue, then click the Analysis tab of the master data workspace.
- Click the entity you want to investigate. The entity details panel opens.
- Select exactly two member records that you want to compare, then click Compare.
- Review the details of the similarity score calculation between the selected records.
To compare the similarity of two records that are members of different entities:
- Add both entities to the analysis queue, then click the Analysis tab of the master data workspace.
- In the Analysis queue panel, click the first entity you want to compare. The entity details panel opens.
- Hover over the second entity for this comparison, then click the split view icon beside the entity in the analysis queue.
- Select exactly two member records across the two entities that you want to compare, then click Compare.
- Review the details of the similarity score calculation between the selected records.
Editing record linkages
If you determine that one or more records should no longer be members of a given entity, you can change the makeup of the entity by editing linkages between the records and the entity. You can move the records to another existing entity, to a newly created entity, or to separate singleton entities.
You cannot edit the linkage of an entity's center record. The center record is the basis of its entity. When you select the center record, the Edit linkage button is disabled.
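The center record rule can be sketched as follows. This is a minimal illustration, not product code: the `MemberRecord` class, its `created_at` field, and both function names are hypothetical, and the sketch assumes that "oldest" means the earliest creation timestamp.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MemberRecord:
    record_id: str        # hypothetical field name
    created_at: datetime  # hypothetical: used here as the record's age


def center_record(members: list[MemberRecord]) -> MemberRecord:
    """Return the oldest member record, which the text designates as the center."""
    if not members:
        raise ValueError("an entity has at least one member record")
    return min(members, key=lambda record: record.created_at)


def can_edit_linkage(selected: list[MemberRecord], members: list[MemberRecord]) -> bool:
    """Mirror the disabled Edit linkage button: False if the center record is selected."""
    return bool(selected) and center_record(members) not in selected
```

For an entity whose members were created in 2020 and 2021, `center_record` returns the 2020 record, and `can_edit_linkage` returns `False` for any selection that includes it.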
To edit record linkages:
- Add the entity to the analysis queue, then click the Analysis tab of the master data workspace.
- In the Analysis queue panel, click the entity you want to edit. The entity details panel opens.
- Select one or more member records that you want to edit the linkage for (not including the center record), then click Edit linkage.
- Verify that you selected the correct records.
- Select the type of linkage change you want to make for the selected records:
  - Move to an existing entity: Separate the records from the current entity and add them to another existing entity. By default, up to 10 entities from the analysis queue are shown as options to move the records to.
  - Create a new entity: Separate the records from the current entity and add them to a newly created entity.
  - Create singleton entity for each record: Separate the records from the current entity and add them each individually to their own singleton entities. Singleton entities contain only one record.
- If you chose Move to an existing entity, search for and select the entity that you want to move the records into.
- Click Next.
- Review the proposed linkage action. Click Preview to see what the data will look like after the linkage action is completed.
- Add a note describing the reason for this linkage change. The note is optional, but a detailed note helps you and others later understand why the change was needed.
- Click Update linkage to complete the change.
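The three linkage changes in the procedure above differ only in where the separated records end up. The following Python sketch illustrates that difference on a hypothetical in-memory model. It is not the product's API: the `Entities` mapping, the function names, and the convention that the first record in each list is the center record are assumptions made for the example.

```python
# Hypothetical in-memory model: entity ID -> member record IDs.
# Assumption: the first record ID in each list is the entity's center record.
Entities = dict[str, list[str]]


def _detach(entities: Entities, source: str, records: list[str]) -> None:
    """Remove records from the source entity, refusing to touch its center record."""
    members = entities[source]
    if members[0] in records:
        raise ValueError("the center record cannot be unlinked or moved")
    missing = [r for r in records if r not in members]
    if missing:
        raise ValueError(f"not members of {source}: {missing}")
    entities[source] = [r for r in members if r not in records]


def move_to_existing(entities: Entities, source: str, target: str, records: list[str]) -> None:
    """Move to an existing entity."""
    if target not in entities or target == source:
        raise ValueError("target must be a different, existing entity")
    _detach(entities, source, records)
    entities[target].extend(records)


def create_new_entity(entities: Entities, source: str, new_id: str, records: list[str]) -> None:
    """Create a new entity that holds all of the selected records."""
    if new_id in entities:
        raise ValueError(f"{new_id} already exists")
    _detach(entities, source, records)
    entities[new_id] = list(records)


def create_singletons(entities: Entities, source: str, new_ids: list[str], records: list[str]) -> None:
    """Create one singleton entity per selected record."""
    if len(new_ids) != len(records) or any(i in entities for i in new_ids):
        raise ValueError("need one unused entity ID per record")
    _detach(entities, source, records)
    for new_id, record in zip(new_ids, records):
        entities[new_id] = [record]
```

Each function validates its inputs before it detaches anything, so a rejected request leaves the model unchanged. This loosely parallels reviewing the proposed action before you click Update linkage.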