Topic
  • 4 replies
  • Latest Post - ‏2008-11-21T15:07:05Z by SystemAdmin
SystemAdmin
SystemAdmin
533 Posts

Pinned topic Question about creating a match specification for Unduplicate Match

‏2008-11-15T18:10:14Z |
Hi All,

I want to identify matched, clerical, duplicate, and nonmatched records within a table based on LAST_NAME, FIRST_NAME, MIDDLE_INITIAL, GENDER, DOB, PHONENO, and ADDRESS.

I have used the Standardize and Match Frequency stages, and I pass the Standardize output and the Match Frequency output to the Unduplicate Match stage.

Standardize rule sets:
USNAME.SET: LAST_NAME, FIRST_NAME, MIDDLE_NAME, GENDER
VDATE.SET: BIRTH_DATE
VPHONE.SET: PHONE_NO
USADDR.SET: ADDRESS

I have to use a match specification for the Unduplicate Match stage. In the new match specification I selected MatchType = Unduplicate.

How do I add details to the match specification, where do I add the columns above so that they are checked for matches, and how do I use the Unduplicate Match stage?

My requirement is: if two records are identical on the 7 columns above, the output should show 1 matched record and 1 duplicate record. If only 1 letter is changed or misspelled in the first name, it should show 1 matched record and 1 clerical record.

Please advise me on how to finish the job.

Raja
Updated on 2008-11-21T15:07:05Z at 2008-11-21T15:07:05Z by SystemAdmin
  • SystemAdmin
    SystemAdmin
    533 Posts

    Re: Question about to create a match specification for Unduplicate Match

    ‏2008-11-17T15:30:53Z  
    Raja,
    Your questions are very broad.

    Please review the documentation for assistance.

    The documentation can be found on
    http://www-01.ibm.com/support/docview.wss?uid=swg27009462&rs=14

    On this page, scroll down to QualityStage, then select the link next to "WebSphere QualityStage User Guide". Focus on chapters 10 and 11, which cover the matching concepts.

    These documents also reside in the documentation folder for Information Server.
  • SystemAdmin
    SystemAdmin
    533 Posts

    Re: Question about to create a match specification

    ‏2008-11-20T15:00:02Z  
    Hi,

    Okay - I will ask a small question.

    I have two records in which all column values are identical except the first names (ADAMS, ADAM).

    I have used FIRSTNAME in both the blocking and the matching sections.

    The output I get is 1 matched record and 1 duplicate record, but I want 1 matched record and 1 clerical record.

    Do you have any idea how to get this output, or how to use the FIRSTNAME column in the match section? Do I need to set any cutoff values to get this output?

    Thanks,
  • SystemAdmin
    SystemAdmin
    533 Posts

    Re: Question about to create a match specification

    ‏2008-11-20T15:00:47Z  
    Ray - please give me a reply to this if you have time.

    Thanks,
  • SystemAdmin
    SystemAdmin
    533 Posts

    Re: Question about to create a match specification

    ‏2008-11-21T15:07:05Z  
    Hi,
    You will have to set the match cutoff (the threshold for matched/duplicate records) above the match weight of the differing pair, and the clerical cutoff below that weight.
    This is easy for a single pair, but it is advisable to use more example records to calibrate the thresholds to your needs. You will have to recalibrate whenever you change the blocking or matching parameters.
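To make the cutoff logic concrete, here is a minimal Python sketch of how a pair's composite weight could be compared against the two cutoffs. It is not QualityStage code: the function name `classify_pair`, the weight 18.5, and the cutoff values are all made-up illustrations, and the exact boundary behaviour (>= versus >) is an assumption.

```python
def classify_pair(weight, match_cutoff, clerical_cutoff):
    """Classify a record pair by its composite match weight.

    Assumed rule (illustrative only): weight at or above the match
    cutoff is a match; at or above the clerical cutoff but below the
    match cutoff is clerical; anything lower is a nonmatch.
    """
    if clerical_cutoff > match_cutoff:
        raise ValueError("clerical cutoff must not exceed match cutoff")
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"


# Hypothetical composite weight for the ADAM / ADAMS pair.
adam_pair_weight = 18.5

# Match cutoff below the pair's weight: the pair comes out as a duplicate.
before = classify_pair(adam_pair_weight, match_cutoff=15.0, clerical_cutoff=12.0)

# Match cutoff raised above the weight, clerical cutoff kept below it:
# the same pair now lands in the clerical range.
after = classify_pair(adam_pair_weight, match_cutoff=20.0, clerical_cutoff=12.0)

print(before, after)
```

In practice you would read the real weight of the pair from the match output and pick cutoffs that bracket it, then check them against further example pairs.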

    By the way: if you used FIRSTNAME in the blocking section, you should never get a match on records with differing first names. Blocking should always give you sets of records with identical blocking attributes, and columns like FIRSTNAME can easily contain misspellings. QualityStage should place 'ADAM' and the misspelled version 'ADAMS' in different blocks, so the two records should come out not as a duplicate or clerical pair but as residuals!

    I prefer to block on columns such as NYSIIS codes derived from the names.
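The effect of the blocking column can be sketched in a few lines of Python. This is only an illustration of the general blocking idea, not of QualityStage internals: the helper `candidate_pairs`, the sample records, and the last name are invented, and the "first three letters" key is a crude stand-in for a phonetic code, not an implementation of NYSIIS.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, block_key):
    """Group records by block_key and return the id pairs that share a block.

    Only records inside the same block are ever compared with each other.
    """
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(ids, 2))
    return pairs


# Invented sample data: identical except for the first name.
records = [
    {"id": 1, "first_name": "ADAM", "last_name": "SMITH"},
    {"id": 2, "first_name": "ADAMS", "last_name": "SMITH"},
]

# Blocking on the exact first name: the two records fall into different
# blocks, so no pair is formed and both would end up as residuals.
exact_pairs = candidate_pairs(records, lambda r: r["first_name"])

# Blocking on a coarser key (stand-in for a phonetic code): both records
# share a block, so the pair is compared and can be scored.
coarse_pairs = candidate_pairs(records, lambda r: r["first_name"][:3])

print(exact_pairs, coarse_pairs)
```

With the coarser key the pair reaches the matching step, where the first-name comparison can then lower its weight into the clerical range.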

    regards

    Roland