Testing passes for a match specification in DataStage®

You can test match passes to identify how effectively they meet your matching goals and make adjustments as necessary.

Testing applications

After you define one or more match passes and configure the test environment, you run test passes. Test results are displayed as statistics, data grids, charts, and other analytic tools. Evaluate these test results to decide whether you need to modify cutoffs, blocking columns, or match criteria. Also, you might determine that additional match passes are needed.

For example, your plan might be to include a match pass that matches on national identification number, followed by a match pass that matches on date of birth. When you review the results of the first match pass, you might determine that your matching goals are achieved and that a second match pass is not needed. Alternatively, the results might indicate that you need to match on a different column for the second pass. Testing each match pass helps you choose the appropriate sequence of passes.
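The two-pass plan in this example can be sketched outside the product to show why reviewing the first pass matters. The following Python sketch is an illustration only, not DataStage code: the `run_pass` helper, the record layout, and the column names `national_id` and `birth_date` are hypothetical, and it uses exact equality on one column instead of weighted matching. Records that the first pass leaves unmatched become the input of the second pass.

```python
from collections import defaultdict


def run_pass(records, key):
    """Group records that share a non-empty value for `key`.

    Hypothetical helper: returns (matched_groups, leftover_records).
    """
    buckets = defaultdict(list)
    for rec in records:
        value = rec.get(key)
        if value:  # records with a missing value cannot match on this column
            buckets[value].append(rec)
    matched = [group for group in buckets.values() if len(group) > 1]
    matched_ids = {rec["id"] for group in matched for rec in group}
    leftovers = [rec for rec in records if rec["id"] not in matched_ids]
    return matched, leftovers


# Made-up sample data for illustration.
records = [
    {"id": 1, "national_id": "A100", "birth_date": "1980-01-01"},
    {"id": 2, "national_id": "A100", "birth_date": "1980-01-01"},
    {"id": 3, "national_id": None, "birth_date": "1975-05-20"},
    {"id": 4, "national_id": "B200", "birth_date": "1975-05-20"},
]

# Pass 1 matches on national identification number.
pass1_groups, residuals = run_pass(records, "national_id")

# Pass 2 runs only on the records that pass 1 left unmatched.
pass2_groups, unmatched = run_pass(residuals, "birth_date")
```

If `residuals` were empty after the first pass, the second pass would have nothing to do, which corresponds to the case where a second match pass is not needed.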

Testing

  1. Click Test and select Test specification to test the entire specification or Test selected pass to test only the selected pass. Passes calculate weights for each record based on your matching criteria and output matches based on your cutoff values. Adjust the cutoff values in the pass settings to determine how much weight qualifies a record as a match.
  2. You can size, sort, select, and group the test results by clicking the output table. Select multiple records in the run output and click Compare weights to view match commands and record agreement with chosen values. If your match comparison type is one-source, more options for grouping and displaying results are available. Click the settings adjust icon (two horizontal lines) to view these options.
  3. To view statistics from testing a pass, hover over the pass, click the three-dot icon, and select View pass statistics. This option opens a graphical illustration of the pass statistics. Specify the display options to select the chart style and the data that is displayed. The results of the three most recent runs are saved and can be selected under Baseline run. To keep the statistics of the current pass permanently available as a baseline run, click the save button.
  4. To view combined statistics from all the passes that you have tested so far, click View total statistics and specify display options.
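
The relationship between weights and cutoffs that step 1 describes can be sketched as follows. This is a simplified illustration, not the product's scoring algorithm: it assumes that each column contributes a fixed agreement or disagreement weight, that a pair's weight is the sum of those contributions, and that there are two cutoffs (called `match_cutoff` and `clerical_cutoff` here). All function names, column names, and weight values are hypothetical.

```python
def composite_weight(rec_a, rec_b, column_weights):
    """Sum per-column weights: agreement weight if values are equal,
    disagreement weight otherwise (assumed scoring rule)."""
    total = 0.0
    for column, (agree_w, disagree_w) in column_weights.items():
        if rec_a.get(column) == rec_b.get(column):
            total += agree_w
        else:
            total += disagree_w
    return total


def classify(weight, match_cutoff, clerical_cutoff):
    """Compare a composite weight with the two assumed cutoffs."""
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"


# Hypothetical (agreement, disagreement) weights per column.
column_weights = {
    "name": (4.0, -2.0),
    "birth_date": (6.0, -3.0),
    "city": (2.0, -1.0),
}

a = {"name": "Ann Lee", "birth_date": "1980-01-01", "city": "Oslo"}
b = {"name": "Ann Lee", "birth_date": "1980-01-01", "city": "Oslo"}
c = {"name": "Ann Lee", "birth_date": "1980-01-01", "city": "Bergen"}
d = {"name": "Bob Ray", "birth_date": "1975-05-20", "city": "Oslo"}

results = {
    label: classify(composite_weight(a, other, column_weights),
                    match_cutoff=10.0, clerical_cutoff=5.0)
    for label, other in (("a-b", b), ("a-c", c), ("a-d", d))
}
```

In this sketch, raising `match_cutoff` moves borderline pairs out of the match category, which is the kind of effect you look for when you adjust cutoff values and rerun the test.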