mpxfsdvd utility
The mpxfsdvd utility enables the creation of bulk cross match (BXM) files from .unl files.
This utility is a data derivation method that uses pre-existing member unload files to extract and create comparison strings, bucket hashes, and binaries. It is most commonly used when you have changed your algorithm but the data itself has not changed. You can run this utility from a command line or from the InfoSphere® MDM Workbench.
- All options and flags are case independent; option values are not.
- Both -unlInpDir and -unlInpSegs are required.
- Either one or both of the following must be specified: -unlOutDir with -unlOutSegs, or -bxmOutDir.
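The option rules above can be checked before the utility is launched. The following Python sketch is illustrative only: `build_mpxfsdvd_args` is a hypothetical helper, not part of the product, and it assumes that -unlOutDir and -unlOutSegs are treated as a pair. The directory names and the segment list value are placeholders.

```python
import shlex

# Option names that the rules above mark as required (compared in lower case).
REQUIRED = ("-unlinpdir", "-unlinpsegs")


def build_mpxfsdvd_args(options):
    """Validate the documented option rules and return an argument list.

    `options` maps option names (any letter case) to string values.
    This helper is hypothetical and only mirrors the rules stated above.
    """
    # Option names are case independent, so compare them in lower case.
    # Option values are case sensitive and are passed through unchanged.
    names = {name.lower() for name in options}

    missing = [name for name in REQUIRED if name not in names]
    if missing:
        raise ValueError("missing required option(s): " + ", ".join(missing))

    # Assumption: -unlOutDir and -unlOutSegs count only when given together.
    has_unl_out = {"-unloutdir", "-unloutsegs"} <= names
    has_bxm_out = "-bxmoutdir" in names
    if not (has_unl_out or has_bxm_out):
        raise ValueError(
            "specify -unlOutDir with -unlOutSegs, -bxmOutDir, or both"
        )

    argv = ["mpxfsdvd"]
    for name, value in options.items():
        argv += [name, value]
    return argv


# Placeholder values; substitute your own directories and segment list.
example = build_mpxfsdvd_args({
    "-UNLINPDIR": "UNL_INPUT_DIR",
    "-unlInpSegs": "SEG_LIST_PLACEHOLDER",
    "-bxmOutDir": "BXM_OUTPUT_DIR",
})
print(shlex.join(example))
```

Checking the combination up front gives a clear error message instead of a failed run on the hub.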
The -ixmmode option, set when running the mpxfsdvd utility, affects downstream processing. If -ixmmode is not specified, the downstream mpxlink utility process starts with the current entity set. Setting -ixmmode causes the mpxlink utility to re-evaluate all entity sets, preserving the previous Entity ID when all previous members are present in the new entity set.

Before you run a utility, make sure that you have set the necessary operational server environment variables. For information about the variables, see the operational server environment variables topic.
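The Entity ID rule described for -ixmmode amounts to a subset test: the previous Entity ID is kept when every previous member is present in the new entity set. The sketch below only restates that rule in Python; the function name and the member identifiers are hypothetical, and it does not represent how the mpxlink utility is implemented.

```python
def keeps_previous_entity_id(previous_members, new_members):
    """Return True when every previous member is present in the new entity set."""
    return set(previous_members) <= set(new_members)


# Hypothetical member identifiers, for illustration only.
previous = ["M1", "M2"]
print(keeps_previous_entity_id(previous, ["M1", "M2", "M3"]))  # all present
print(keeps_previous_entity_id(previous, ["M1", "M3"]))        # M2 is absent
```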
Option | Type | Description | Default |
---|---|---|---|
-unlInpDir | dirName | Location of the .unl files. The mpxfsdvd utility reads the member attribute data from the .unl files in the directory specified here. This directory is relative to the project work directory on the hub: WAS_PROFILE_HOME\installedApps\YOUR_CELL_NAME\MDM-native-IDENTIFIER.ear\native.war\work\project_name\work\UNL_INPUT_DIR | NONE |
-unlInpSegs | segList | List of segments contained in the .unl files. | NONE |
-unlOutDir | dirName | .unl file output directory. The output of the mpxfsdvd utility is the derived data segments (comparison, bucket, and, optionally, query data), which have their own .unl files (mpi_memcmpd, mpi_membktd, and mpi_memqryd) written to the directory specified here. This directory is relative to the project work directory on the hub: WAS_PROFILE_HOME\installedApps\YOUR_CELL_NAME\MDM-native-IDENTIFIER.ear\native.war\work\project_name\work\UNL_OUTPUT_DIR Used with the | NONE |
-unlOutSegs | segList | Attribute segments to output. Used with the -unlOutDir option; indicates whether mpxfsdvd generates .unl files during processing. Instructs mpxfsdvd to create .unl files containing bucket data or comparison data, or to generate query .unl files during processing (the query files are used by the relationship linker). | NONE |
-encoding | | Encoding of .unl files; options are LATIN1, UTF8, or UTF16. | LATIN1 |
-bxmOutDir | dirName | .bin output directory. Indicates where you want the BXM output files to be located. This directory is relative to the project work directory on the hub: WAS_PROFILE_HOME\installedApps\YOUR_CELL_NAME\MDM-native-IDENTIFIER.ear\native.war\work\project_name\work\BXM_OUTPUT_DIR | NONE |
-{no}bxmBktd | | Generate MEMBKTD output. | -bxmBktd |
-{no}bxmCmpd | | Generate MEMCMPD output. | -bxmCmpd |
-{no}bxmQryd | | Generate MEMQRYD output. This option is for use with the relationship linker and instructs mpxfsdvd to create BXM files containing query data. The relationship types, attributes, and rules should already be defined so that mpxfsdvd knows what data to include in the BXM file. | -bxmQryd |
-nMemParts | N | Number of member partitions. The appropriate value depends on the size of your data set, your algorithms, and how much memory you have access to on the hub. The utility that consumes the mpxfsdvd output (such as mpxfreq) must use a matching “memparts” value. Leave this option at the default unless you need the memory. The more member partitions you use, the slower your mpxcomp process, because the hub must do more duplicate comparisons. | 1 |
-nBktParts | N | Number of bucket partitions. The appropriate value depends on the size of your data set, your algorithms, and how much memory you have access to on the hub. Leave this setting at the default unless you need the memory. | 1 |
-minBktTag | N | Minimum bucket tag to use (0=any). The lowest bucketing role designation used in the algorithm to include in the process. | 0 |
-maxBktTag | N | Maximum bucket tag to use (0=any). The highest bucketing role designation used in the algorithm to include in the process. | 0 |
-nQryParts | N | Number of query partitions. The appropriate value depends on the size of your data set, your algorithms, and how much memory you have access to on the hub. Leave this setting at the default unless you need the memory. This option is enabled only when the option to Generate query BXM is also enabled. | 1 |
-minQryRole | N | Minimum query role to use (0=all). The lowest query role designation used in the algorithm to include in the process. This option is enabled only when the option to Generate query BXM is also enabled. | 0 |
-buffSize | N | Size of each file input and output (I/O) buffer. | 65536 |
-memType | memName | Member type name. If you have multiple member types in the hub database and need to derive data for only one of those member types, select the member type here; otherwise, select ALL. | NONE |
-entType | entName | Entity type name. | NONE |
-skipRecs | N | Number of member records to skip before re-deriving members from the specified input files. Processing begins with the next member read from MEMHEAD after skipping this number of records. When used with the | 0 |
-maxRecs | N | Maximum number of member records to re-derive from the specified input files. When you use this parameter along with skipping member records, this number includes the number skipped. When used with the | Unlimited |
-maxErrs | N | Maximum errors before halting processing. This option sets a threshold for errors in the data. Once the threshold is reached, the mpxfsdvd utility stops. The intent of this option is to allow you to process a set of input .unl files with tolerance for data issues. For example, if the .unl file has an incorrect number of fields, the member record is rejected and re-derivation does not complete for that member. The mpxfsdvd utility writes detailed information into the log file, including the line number, input file, and reason for the rejection. | 100 |
-{no}HeadSql | flag | Generates SQL output. Instructs the mpxfsdvd utility to generate an SQL file in the specified .unl output directory. If a .unl output directory is not specified, the output is written to the BXM output directory. This SQL file contains a query against the mpi_memhead table for members that were identified as missing. These members are identified when there is an attribute row that does not have a corresponding head row. | -noHeadSql |
-ixmmode | | This true or false option sets the IXM mode. In IXM mode, a subset of members is compared rather than the entire member set. If running a BXM, use the default of false. If running an IXM, set this option to true. | FALSE |
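The interaction of -skipRecs, -maxRecs, and -maxErrs described in the table can be pictured with a small Python sketch. It is a simplified model, not the utility's implementation: `rederive` is a hypothetical function, each input item stands for one member record, and the only data error modeled is an incorrect number of fields.

```python
def rederive(records, expected_fields, skip_recs=0, max_recs=None, max_errs=100):
    """Model the -skipRecs / -maxRecs window and the -maxErrs threshold.

    `records` is an iterable of (input_file, line_number, fields) tuples.
    Returns (accepted, rejections, halted).
    """
    accepted, rejections = [], []
    for index, (input_file, line_number, fields) in enumerate(records):
        if index < skip_recs:
            continue  # records inside the skip count are not processed
        if max_recs is not None and index >= max_recs:
            break  # max_recs includes the skipped records
        if len(fields) != expected_fields:
            # Record the same details the table says are written to the log.
            reason = f"expected {expected_fields} fields, found {len(fields)}"
            rejections.append((input_file, line_number, reason))
            if len(rejections) >= max_errs:
                return accepted, rejections, True  # threshold reached, stop
            continue
        accepted.append(fields)
    return accepted, rejections, False


# Ten well-formed records, skip 2, maximum 6: records at positions 2 to 5 remain.
sample = [("example.unl", n + 1, ["a", "b"]) for n in range(10)]
accepted, rejections, halted = rederive(sample, 2, skip_recs=2, max_recs=6)
print(len(accepted), len(rejections), halted)
```

Because the maximum includes the skipped records, the number of records actually re-derived in this model is the maximum minus the skip count.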