Running an overlap analysis
To identify overlapping or redundant data in columns, run an overlap analysis from a workspace. An overlap analysis compares the values in pairs of columns, either within one table or across tables.
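To make the idea of comparing column values concrete, the following is a minimal conceptual sketch. It is not the product's implementation. The function name `column_overlap`, the sample columns, and the choice to compare distinct non-null values are all assumptions made for illustration.

```python
def column_overlap(left, right):
    """Compare the distinct non-null values of two columns.

    Hypothetical helper for illustration only. It returns the shared
    values and, for each column, the percentage of its distinct values
    that also occur in the other column.
    """
    left_values = {v for v in left if v is not None}
    right_values = {v for v in right if v is not None}
    shared = left_values & right_values

    # Guard against empty columns to avoid division by zero.
    left_pct = 100.0 * len(shared) / len(left_values) if left_values else 0.0
    right_pct = 100.0 * len(shared) / len(right_values) if right_values else 0.0
    return {"shared": shared, "left_pct": left_pct, "right_pct": right_pct}


# Made-up sample data: a customer ID column and an order table's customer ID column.
customers = [101, 102, 103, 104, None]
orders = [103, 104, 104, 105]

result = column_overlap(customers, orders)
print(sorted(result["shared"]))
print(round(result["left_pct"], 1), round(result["right_pct"], 1))
```

In this sketch, a high percentage on both sides would suggest that the two columns hold largely the same data. A high percentage on one side only would suggest that one column's values are a subset of the other's.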
About this task
You do not have to run a column analysis or data quality analysis on the data sets before you run an overlap analysis. However, the analysis runs faster on data sets that have been previously analyzed.
If you want to run an overlap analysis on all data sets in your workspace, the list view is the easiest way to select them.
Procedure
- Open the workspace that contains the data sets that you want to identify overlaps for, select one or more data sets, and click Run relationship analysis.
- Select Overlap analysis, and then click Analyze. Note: You can run a key relationship analysis at the same time that you run an overlap analysis. To check the status of the analysis, go to the Activity tile in your workspace and select Overlap analysis.
- Select Relationships at the top of the workspace. Then, select the data sets that you want to display.
- Select and review the relationships in the entity-relationship diagram or the grid.
What to do next
All overlaps that are marked as selected can be published by using InfoSphere® Information Analyzer.