MULT_UNCERT comparison

Compares all words in one column of a record with all words in the same column of a second record by using a string comparison algorithm based on information theory principles.

Required Columns

The following data source and reference source columns are required:

  • Data. The character string from the data source.
  • Reference. The character string from the reference source.

Required Parameter

The following parameter is required:

Param 1. The cutoff threshold, which is a number between 0 and 900. Comparison scores on this scale can be interpreted as follows:
  • 900. The two strings are identical.
  • 850. The two strings can be safely considered to be the same.
  • 800. The two strings are probably the same.
  • 750. The two strings are probably different.
  • 700. The two strings are almost certainly different.

A higher Param 1 value causes the match to tolerate fewer differences than it would with a lower Param 1 value.

Example

The assigned weight is prorated linearly between the agreement and disagreement weights. For example, if you specify 700 and the score is 700 or less, the full disagreement weight is assigned. If the strings agree exactly, the full agreement weight is assigned. A score between the cutoff threshold and 900 receives a proportionally intermediate weight.

As another example, suppose you specify 850 for Param 1, which means that the tolerance is relatively low. A score of 800 gets the full disagreement weight because it is lower than the value that you specified. Even though a score of 800 means that the strings are probably the same, the low tolerance that you chose treats them as a disagreement.
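
The weight assignment in the two examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name assigned_weight, the sample weights of 10.0 and -10.0, and the exact interpolation formula are hypothetical and are not taken from the product. The text states only that the weight is proportioned linearly between the two weights.

```python
MAX_SCORE = 900  # score for two identical strings, per the list above


def assigned_weight(score, cutoff, agreement_weight, disagreement_weight):
    """Return a weight for a comparison score (assumed linear formula)."""
    if not 0 <= cutoff <= MAX_SCORE:
        raise ValueError("cutoff must be between 0 and 900")
    # Exact agreement always earns the full agreement weight.
    if score >= MAX_SCORE:
        return agreement_weight
    # At or below the cutoff, the full disagreement weight applies.
    if score <= cutoff:
        return disagreement_weight
    # Otherwise interpolate linearly between the two weights (assumption).
    fraction = (score - cutoff) / (MAX_SCORE - cutoff)
    return disagreement_weight + fraction * (agreement_weight - disagreement_weight)


# Hypothetical weights, chosen only for illustration.
print(assigned_weight(700, 700, 10.0, -10.0))  # cutoff 700, score 700
print(assigned_weight(800, 850, 10.0, -10.0))  # cutoff 850, score 800
print(assigned_weight(800, 700, 10.0, -10.0))  # cutoff 700, score 800
```

The first two calls correspond to the cases described above and return the full disagreement weight. The third call falls halfway between the cutoff and 900, so under the assumed formula it returns a weight halfway between the two weights.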

Example

The MULT_UNCERT comparison is the best choice to match addresses like the following pair, in which the same words appear in a different order:

Building 5 Apartment 4B
Apartment 4-B Building 5
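
The following Python sketch illustrates why comparing every word in one string with every word in the other makes word order irrelevant for a pair like this. It is not the MULT_UNCERT algorithm: the information-theoretic string comparison is replaced by difflib.SequenceMatcher from the Python standard library as a stand-in, and the function names word_score and multi_word_score, the best-match-per-word averaging, and the scaling to 900 are all assumptions made for illustration. The scores it produces therefore cannot be read against the thresholds listed for Param 1.

```python
from difflib import SequenceMatcher

MAX_SCORE = 900  # top of the scale used for illustration


def word_score(a, b):
    """Stand-in word similarity; NOT the product's information-theoretic algorithm."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() * MAX_SCORE


def multi_word_score(data, reference):
    """Average, over the data words, of the best score against any reference word."""
    data_words = data.split()
    reference_words = reference.split()
    if not data_words or not reference_words:
        return 0.0
    # Every data word is compared with every reference word, so order is ignored.
    best = [max(word_score(d, r) for r in reference_words) for d in data_words]
    return sum(best) / len(best)


print(multi_word_score("Building 5 Apartment 4B", "Apartment 4-B Building 5"))
print(multi_word_score("Building 5 Apartment 4B", "Apartment 4B Building 5"))
```

In this sketch a pure reordering of the same words scores the maximum, and the hyphen in 4-B lowers the score only through the one word that differs.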