andrelie
40 Posts

Pinned topic CXNM Comparison Function

‏2015-09-14T04:51:55Z | mdm-migration

Hi

 

I am using CXNM for business name comparison. In wgtsval there are entries from _NORM_ADJWGT_0 up to _NORM_ADJWGT_15. The weight for _NORM_ADJWGT_1 is 183, while _NORM_ADJWGT_15 is 44.

I wonder why _NORM_ADJWGT_15 has the lowest weight, since this means an exact match gets a lower score than a partial or different match. Should I edit these values so that _NORM_ADJWGT_1 has the lowest weight instead?
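
One quick way to confirm the ordering really is reversed is to check whether the weight ever drops as the index goes up. A minimal Python sketch, assuming the _NORM_ADJWGT_n values have been copied by hand into a dict keyed by index (find_inversions is a made-up helper, not part of MDM, and only the two values quoted above are used):

```python
def find_inversions(weights):
    """Return (lower_index, higher_index) pairs where the weight drops as the index rises."""
    ordered = sorted(weights.items())  # sort by index
    return [
        (i, j)
        for (i, wi), (j, wj) in zip(ordered, ordered[1:])
        if wj < wi
    ]


# Only the two values quoted in the post; the other indexes are not given.
quoted = {1: 183, 15: 44}
print(find_inversions(quoted))
```

An empty list would mean the weights never decrease with the index, which is the ordering expected if a higher index means a closer match.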

 

Thanks

Andre

  • YinleZhou
    1 Post

    Re: CXNM Comparison Function

    ‏2015-09-14T20:33:27Z  

    Hi Andre,

    Yes, the higher the _NORM_ADJWGT_ number, the closer it should be to an exact match. We need to figure out what caused this reversed behavior. Have you seen this before? Is it only CXNM that looks odd? What attributes are in your algorithm?

     

    Thanks,

    Yinle

  • andrelie
    40 Posts

    Re: CXNM Comparison Function

    ‏2015-09-15T02:28:12Z  

    Hi Yinle

     

    This happens in the ORG algorithm. I have not seen it elsewhere. I use Org Legal Name, Org Business Address, Org Business Phone, Org Corp Tax ID and Org Established Date, among other attributes. Is it fine to hand-edit the values, i.e. use the _NORM_ADJWGT_1 value for _NORM_ADJWGT_15 instead?

    Regards

  • KaranBal
    227 Posts

    Re: CXNM Comparison Function

    ‏2015-10-02T11:52:35Z  

    Weights affect the quality of matching. You can edit them, but it will have an impact on your matching. You can open a PMR for review if you like, but hand-editing is fine as long as the values are edited correctly and in the way you want.

  • spiggle
    7 Posts

    Re: CXNM Comparison Function

    ‏2016-09-08T07:47:37Z  

    You are correct that 15 should be the best match score, and so on down. Generated weights for CXNM are often fickle.

    I find that the CXNM weights often lack any granularity, but yours being in reverse seems a worst-case scenario.

    Weight generation is based on how the data is distributed and on how important each attribute is to matching. Gender is a great example: agreement scores very low because there are so few distinct values, yet disagreement carries a relatively large penalty, because when weightgen does its sampling it sees that nearly all matched pairs also match on gender. That is, two people both being male is 'not a big deal'; two people having different genders is a 'big deal'.
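
    To put rough numbers on the gender example, here is a small Python sketch using the textbook log-likelihood-ratio form of a match weight (Fellegi-Sunter style). This is an assumption for illustration and may not be exactly what weightgen computes; the two probabilities are invented, not taken from any real data set.

```python
import math


def agreement_weights(m_agree, u_agree):
    """Textbook log10 likelihood-ratio weights for one attribute.

    m_agree: probability the attribute agrees among true matches
    u_agree: probability the attribute agrees among non-matches (chance agreement)
    """
    agree = math.log10(m_agree / u_agree)
    disagree = math.log10((1 - m_agree) / (1 - u_agree))
    return agree, disagree


# Invented figures: gender agrees on 98% of true matches,
# and on about 50% of random pairs because there are so few values.
agree, disagree = agreement_weights(m_agree=0.98, u_agree=0.5)
print(f"agree: {agree:+.2f}, disagree: {disagree:+.2f}")
```

    With these made-up inputs the agreement weight comes out small and positive while the disagreement weight is a much larger negative number, which is the 'not a big deal' versus 'big deal' asymmetry described above.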

    The problem with organisation names is that small variations can make a big difference. John Smith compared to John Paul Smith would be a good match for an individual, but CitiGroup Bank compared to CitiGroup Institutional Bank would not be a great match for an organisation, and certainly not as good as the John Smith match. An additional word missing from one organisation name can make a significant difference in reality, and in the algorithm.
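
    The two name pairs above look identical to a naive token comparison, which is part of why the same amount of overlap has to be weighted differently for people and organisations. A minimal Python sketch using plain Jaccard overlap on words; this is only an illustration and is not how CXNM itself compares names (token_jaccard is a made-up helper):

```python
def token_jaccard(a, b):
    """Jaccard overlap of the lower-cased word sets of two names."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0  # avoid dividing by zero for two empty names
    return len(ta & tb) / len(ta | tb)


person = token_jaccard("John Smith", "John Paul Smith")
org = token_jaccard("CitiGroup Bank", "CitiGroup Institutional Bank")
print(person, org)
```

    Both pairs share two of three distinct words, so the raw overlap is the same even though one pair is a plausible match and the other is not.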

    If your organisation names contain a lot of 'noise', then pairs that would look like a good match to you and me are often a bad match from the algorithm's perspective, unless you've anonymised all the 'noise', which is not always easy to do.

    As a result, pairs that score well overall often score badly on CXNM, which in turn leads weightgen to conclude that CXNM is not important to the overall match. That confuses the final weights, as you've seen.

    All of the above is conjecture on my part :)

    What I've done in the past is remove everything from the algorithm except the name comparison, then generate the weights and save them.

    Then restore the full algorithm, generate the full set of weights, and hand-edit your saved CXNM weights back in.
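
    The 'edit the saved weights back in' step can be sketched in Python as a simple merge. This assumes both sets of weights have been exported into name-to-value dicts by some means of your own; it does not reflect the real wgtsval layout. restore_saved_weights and OTHER_ATTR_WGT are made-up names, and the name-only numbers are placeholders rather than real weightgen output.

```python
def restore_saved_weights(full, saved, prefix="_NORM_ADJWGT_"):
    """Return a copy of full with entries starting with prefix replaced from saved."""
    merged = dict(full)  # leave the original full-run weights untouched
    for name, value in saved.items():
        if name.startswith(prefix):
            merged[name] = value
    return merged


# Full run: the two _NORM_ADJWGT_ values are the ones quoted earlier in the
# thread; OTHER_ATTR_WGT is a made-up stand-in for every other weight.
full_run = {"_NORM_ADJWGT_1": 183, "_NORM_ADJWGT_15": 44, "OTHER_ATTR_WGT": 250}

# Name-only run: placeholder numbers, not real output.
name_only_run = {"_NORM_ADJWGT_1": 40, "_NORM_ADJWGT_15": 190}

merged = restore_saved_weights(full_run, name_only_run)
print(merged)
```

    Only the name-comparison entries are overwritten; every other weight from the full run is kept as generated.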