The optimization process generates weights, calculates
the matching thresholds, and creates sample pairs. Optimization is
performed on all data in the MDM database. This option is typically
used for Probabilistic Matching Engine (PME) configurations or for
configurations that are managed through the Master Data Management
configuration editor.
About this task
You can run optimization on a local data file, an MDM
database, or a PME database. If you optimize a local data file, the
data file and the format file must conform to the mpxdata input
specification and use the pipe character (|) as the delimiter. MDM
and PME databases must already contain stored derived data.
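Before you select a local data file, a quick check that every row is consistently pipe-delimited can catch obvious formatting problems. The following Python sketch is illustrative only: the function name and the sample record layout are hypothetical, and the check covers only the field count per row, not the full mpxdata input specification.

```python
def check_pipe_delimited(lines, delimiter="|"):
    """Return (line_number, field_count) for every row whose number of
    fields differs from the first non-blank row.

    This is only a consistency check on the delimiter; it does not
    validate the data against the mpxdata input specification.
    """
    expected = None
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        count = len(line.split(delimiter))
        if expected is None:
            expected = count  # the first row sets the expected width
        elif count != expected:
            problems.append((number, count))
    return problems


# Hypothetical sample rows; the real field layout depends on your format file.
sample = [
    "1001|SMITH|JOHN|1970-01-01\n",
    "1002|DOE|JANE\n",
    "1003|LEE|ANN|1982-05-17\n",
]
print(check_pipe_delimited(sample))  # prints [(2, 3)]
```

To check a real file, pass an open file object, for example `check_pipe_delimited(open(path, encoding="utf-8"))`, where `path` is the location of your data file.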
Algorithms that use the False Positive Filter
(FPF) are not supported for optimization. If your custom configuration
includes the FPF, you must remove the filter through the algorithm
editor before optimizing.
Attention: If you are optimizing
a Probabilistic Matching Engine (PME) configuration, you must first export
the configuration archive and deploy it to the MDM operational server.
After you deploy your PME configuration, load the source data into
the MDM database and perform the evergreening process if necessary.
Evergreening must take place before you can run the optimization process.
Attention: If a patch release of the MDM software
includes changes to any algorithm functions, you must run the optimization
process to implement the changes to your matching algorithm.
Procedure
- Select .
- On the Project and Data Selection window,
select a Project and Optimization
type from the lists.
- If you selected local data file as your optimization type,
provide the following information.
  - Select your Data file. The data
    file must comply with the mpxdata input specification.
  - Select your Config file. This
    configuration file must comply with the mpxdata input specification.
- If you selected Probabilistic matching engine or Master
data service as your optimization type, select the Server where
the PME or MDM operational server is running. Click Next.
- On the Optimization Customization window,
provide information for the following fields or accept the defaults.
  - False positive rate - specify the false
    positive rate that is used to compute the Clerical Review and Auto-Link
    thresholds.
  - False negative rate - specify the false
    negative rate that is used to compute the Clerical Review and Auto-Link
    thresholds.
  - Minimum distributed sample count - enter
    the minimum distributed sample count to include in the optimization.
  - Minimum matched sample count - enter the
    minimum matched sample count to include in the optimization.
  - Minimum sample pair score - specify the
    minimum comparison score to use for the sample pairs. The minimum
    score must be between 0.0 and 99.9 and must be less than or equal to
    the maximum score.
  - Maximum sample pair score - specify the
    maximum comparison score to use for the sample pairs. The maximum
    score must be between 0.0 and 99.9 and must be greater than or equal
    to the minimum score.
  - Maximum sample pair count - specify the
    maximum number of pairs for each one-tenth score range in the final
    sample pair list.
- Click Finish.
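The constraints on the Minimum sample pair score and Maximum sample pair score fields can be expressed as a small check. The following Python sketch is illustrative only: the function name is hypothetical and it is not part of the product. It encodes only the two rules stated for those fields (each score between 0.0 and 99.9, and the minimum not greater than the maximum).

```python
def validate_score_range(min_score, max_score):
    """Return a list of problems with the sample pair score settings.

    An empty list means that both values satisfy the documented rules.
    """
    errors = []
    for label, value in (("Minimum", min_score), ("Maximum", max_score)):
        if not 0.0 <= value <= 99.9:
            errors.append(
                f"{label} sample pair score {value} is outside 0.0 to 99.9"
            )
    if min_score > max_score:
        errors.append("Minimum sample pair score is greater than the maximum")
    return errors


print(validate_score_range(2.0, 12.0))   # prints []
print(validate_score_range(12.0, 2.0))   # one problem: minimum exceeds maximum
```

Equal values pass the check because each field allows the minimum and maximum scores to be the same.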
What to do next
Log in to the server where your project is located and
review your validation samples. After the process is complete, download
your sample pair file. The sample pair file shows the member score
and raw comparison data. Use this file to validate the accuracy of
your matching algorithm configuration.
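When you review the sample pairs, it can help to recall how the Maximum sample pair count setting shapes the list: it limits the number of pairs in each one-tenth score range. The following Python sketch illustrates that idea under stated assumptions. The function name and the pair identifiers are hypothetical, score ranges are assumed to be intervals such as 2.3 up to (but not including) 2.4, and the first pairs encountered in each range are the ones kept. The product's actual selection method is not described here and might differ.

```python
import math
from collections import defaultdict


def cap_pairs_per_tenth(scored_pairs, max_per_range):
    """Keep at most max_per_range pairs in each 0.1-wide score range.

    scored_pairs is an iterable of (pair_id, score) tuples. Pairs are
    kept in input order; this ordering is an assumption of the sketch.
    """
    counts = defaultdict(int)
    kept = []
    for pair_id, score in scored_pairs:
        # Scale to tenths; round first to absorb floating-point noise
        # (for example, 2.3 * 10 evaluating to 22.999999999999996).
        bucket = math.floor(round(score * 10, 6))
        if counts[bucket] < max_per_range:
            counts[bucket] += 1
            kept.append((pair_id, score))
    return kept


# Hypothetical pairs: three fall in the 2.3 range, so one is dropped.
pairs = [("a", 2.31), ("b", 2.35), ("c", 2.39), ("d", 2.41), ("e", 7.02)]
print(cap_pairs_per_tenth(pairs, 2))
# prints [('a', 2.31), ('b', 2.35), ('d', 2.41), ('e', 7.02)]
```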
If you optimized a PME
algorithm, export your configuration to the MDM operational server
after optimization.