The GNRMETA bucket generation function can be used for data sets that contain names.
If your data set contains multicultural names, this function can be especially beneficial. In the InfoSphere® MDM Workbench algorithm editor, GNRMETA is identified as GNR & Phonetic.
GNRMETA calls out to IBM® InfoSphere Global Name Recognition (GNR) for name variants and the corresponding percentages produced by the GNR analyze() method. Each percentage is the frequency of a particular variant relative to the other variants. The variants are then filtered by a percentage threshold setting: only variants whose percentage is greater than or equal to the threshold are used in bucketing. Anonymous value (ANON) handling is done before the input values are sent to GNR.
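The threshold filtering described above can be sketched in a few lines of Python. This is only an illustration of the "greater than or equal to" rule, not the product's implementation: the function name `filter_variants`, the sample names, and the percentages are all hypothetical, and no real GNR call is made.

```python
def filter_variants(variants, threshold):
    """Keep the variants whose percentage is >= threshold.

    `variants` is a list of (name, percentage) pairs, standing in for
    what a variant lookup might return. Hypothetical data shape.
    """
    return [name for name, pct in variants if pct >= threshold]


# Hypothetical variants and percentages, for illustration only.
variants = [
    ("KATHERINE", 55),
    ("CATHERINE", 30),
    ("KATHRYN", 10),
    ("KATARINA", 5),
]

# With a threshold of 10, the 5% variant is dropped and the 10% variant is kept.
kept = filter_variants(variants, 10)
print(kept)
```

Because the comparison is inclusive, a variant that sits exactly at the threshold still contributes to bucketing.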
GNRMETA is similar to using EQMETA with an equivalency string code (equistrcode) of NICKNAME. EQMETA, with NICKNAME, looks up the various nickname forms of a token and then passes them through the META function. With GNRMETA, the lookup is done with GNR instead of a NICKNAME table.
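The shared "look up the variants, then apply a phonetic function" shape can be sketched as follows. Everything here is an assumption made for illustration: `toy_phonetic` is a deliberately simple stand-in and is not the META function, the dictionary stands in for either a NICKNAME table or a GNR lookup, and including the original token among the forms is a choice of this sketch rather than documented behavior.

```python
def toy_phonetic(token):
    """Stand-in phonetic code: keep the first letter, drop later vowels.

    This is NOT the META function; it only makes the pipeline visible.
    """
    token = token.upper()
    return token[:1] + "".join(c for c in token[1:] if c not in "AEIOU")


def bucket_values(token, lookup, phonetic):
    """Look up the variant forms of a token, then encode each form."""
    forms = [token] + lookup.get(token, [])  # original token included (assumption)
    return sorted({phonetic(form) for form in forms})


# Hypothetical lookup table; with GNRMETA the variants would come from GNR.
variant_table = {"WILLIAM": ["BILL", "WILL"]}

buckets = bucket_values("WILLIAM", variant_table, toy_phonetic)
print(buckets)
```

Only the source of the variants differs between the two approaches; the phonetic step that follows the lookup has the same role in both.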
Two derivation arguments (dvdArgs) are used with GNRMETA. The first is the phonetic function. The second is the percentage threshold value, which is specified as percent=value. The value must be an integer.
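A minimal sketch of how an argument in the percent=value form could be checked for the integer requirement. The helper name `parse_percent_arg` and the sample strings are hypothetical; this does not reproduce how the product parses its derivation arguments, and it does not assume any particular valid range for the value.

```python
def parse_percent_arg(arg):
    """Parse a 'percent=value' string and return the value as an int.

    Raises ValueError if the key is not 'percent' or the value is not
    a plain non-negative integer. Hypothetical helper for illustration.
    """
    key, sep, value = arg.partition("=")
    if sep != "=" or key.strip() != "percent":
        raise ValueError(f"expected 'percent=value', got {arg!r}")
    value = value.strip()
    if not (value.isascii() and value.isdigit()):
        raise ValueError(f"threshold must be an integer, got {value!r}")
    return int(value)


# Illustrative inputs only.
threshold = parse_percent_arg("percent=10")
print(threshold)

try:
    parse_percent_arg("percent=12.5")
except ValueError as err:
    print("rejected:", err)
```

Rejecting non-integer input up front keeps the later "greater than or equal to" comparison unambiguous.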
GNRMETA can be used with existing InfoSphere MDM standardization, comparison, and phonetic functions. You must use either the PXNM or the BXNM bucketing function.
If you use GNRMETA in your algorithm, you must run the Enable/Disable GNRMETA job, with the Enabled radio button selected, before you deploy your configuration. After the job runs, you must restart the InfoSphere MDM operational server. You can restart your instance by using the Suspend Server and Resume Server jobs.