Entity resolution

Entity resolution is the process of determining which records refer to the same real-world entity. The IBM FCI for Insurance Entity Resolution feature searches through party data to identify possible matches despite variations in spelling and other properties. For example, James Smith and James Smythe might be the same party whose name is spelled differently in different records.

You can schedule the entity resolution batch job to run routinely to identify disparate records that represent the same party. Resolved entities are stored in the CFFACT.RESOLVED_ENTITY_REF table.

Determining whether entities match depends on the matching methods that you use and the thresholds that you set in the Entity Resolution system properties.

You can use one or more of the following methods to generate the matching score:

Levenshtein Distance
This method calculates the edit distance between two field values, that is, the minimum number of insertions, substitutions, and deletions that are required to turn the source string into the target string.
SOUNDEX
This method uses the DB2® SOUNDEX function to identify strings for which the sound is known but the precise spelling is not. SOUNDEX makes assumptions about the way that letters and combinations of letters sound, which can help to find words with similar sounds.
Vowel Manipulation
This method removes vowels and substitutes like-sounding vowels to determine whether the fields match.
Hash
This method compares the hash values of fields. It is platform specific.
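
As an illustration of the first and third methods, the following is a minimal Python sketch, not the product's implementation. The `levenshtein` function is the standard dynamic-programming edit distance. The `strip_vowels` function is a hypothetical simplification of vowel manipulation that only removes vowels (treating `y` as a vowel is an assumption made here) and does not model the substitution of like-sounding vowels.

```python
def levenshtein(source: str, target: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn source into target."""
    # previous[j] is the distance between the source prefix processed
    # so far and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i source characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def strip_vowels(value: str) -> str:
    """Hypothetical normalization: lowercase, then drop a, e, i, o, u, y."""
    return "".join(ch for ch in value.lower() if ch not in "aeiouy")
```

With these definitions, `levenshtein("Smith", "Smythe")` returns 2 (one substitution and one insertion), and `strip_vowels` reduces both names to `smth`, so either approach would flag the two spellings as close.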

The matching score is the number of fields that matched (Fields Matched) divided by the minimum number of fields that might match (Fields Checked), expressed as a percentage. For example, suppose that Party A has one name field and Party B has five name fields. Only one field can possibly match, so Fields Checked is 1.

Matching score = (Fields Matched / Fields Checked) * 100

This score is then compared to the threshold that is set in the CF.MATCH.ScoreThreshold system property to determine whether the items match.
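
The score calculation and threshold check can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the threshold value of 75 in the usage note is made up, and treating a score equal to the threshold as a match is an assumption, because the comparison operator is not stated here.

```python
def fields_checked(field_count_a: int, field_count_b: int) -> int:
    """Minimum number of fields that might match between two parties."""
    return min(field_count_a, field_count_b)


def matching_score(fields_matched: int, checked: int) -> float:
    """Fields Matched / Fields Checked * 100."""
    if checked <= 0:
        raise ValueError("at least one field must be checked")
    return fields_matched / checked * 100


def is_match(score: float, threshold: float) -> bool:
    # Assumption: a score equal to the threshold counts as a match.
    return score >= threshold
```

For the Party A and Party B example, `fields_checked(1, 5)` is 1, so a single matching name field gives `matching_score(1, 1)`, which is 100.0. With a hypothetical threshold of 75, `is_match(100.0, 75)` is true, whereas a score of 50.0 would not qualify.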

You can also run the entity resolution batch job, fci_batch_entity_resolution_job, manually to identify disparate records that represent the same party. Each time that the batch job runs, only new party data entries are compared. That is, new data is compared to existing data and to other new data. Existing resolved entities remain in the CFFACT.RESOLVED_ENTITIES table and are not compared against other existing data again.
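
The incremental comparison rule can be pictured with a short Python sketch. It only enumerates which record pairs a run would consider under the rule described above; the function name and record identifiers are hypothetical, and how the batch job actually selects and stores records is not shown.

```python
from itertools import combinations, product


def comparison_pairs(new_records, existing_records):
    """Return the pairs to compare in one run: every new record against
    every existing record, plus every pair of distinct new records.
    Pairs of two existing records are never produced."""
    new_vs_existing = list(product(new_records, existing_records))
    new_vs_new = list(combinations(new_records, 2))
    return new_vs_existing + new_vs_new
```

For two new records and two existing records, this yields five pairs: four new-to-existing pairs and one new-to-new pair. The existing-to-existing pair is skipped because it was handled in an earlier run.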

The IBM FCI for Insurance Entity Resolution feature is just one method for resolving identities. If this feature does not meet your business needs, you can use InfoSphere Identity Insight (ISII) to create a custom entity resolution solution.