MPXDATA configuration file
The mpxdata configuration file is used as a map for reading the customer extract file and as a legend for converting the data to the InfoSphere® MDM database table layout. The configuration file is used by the Derive Data and Create UNLs (mpxdata) job, the mpxdata.exe utility, and when you use InfoSphere DataStage® jobs in MpxData mode.
The configuration file tells the mpxdata utility where each field is located in the extract and how to migrate it into the proper InfoSphere MDM data format. The mpxdata utility then processes member puts (inserts into the database) or member computes (parses out data into segment specific unload, or UNL files).
MDM Workbench provides a method for creating a configuration file or importing an existing configuration file by using the Derive Data and Create UNLs (mpxdata) job set user interface.
When you create a configuration file, remember the following items.
- Some data elements require more than one piece of information to constitute a complete MDM record ready for insertion into the database. For example, Social Security Number requires both the actual ID number and the ID Issuer. In this example, 111223333 is the ID and SSA is the ID Issuer.
- Some data elements can be inserted by using one or more data elements to make up a complete record. For example, the Name field (LGLNAME) can have up to seven elements that can be used to make up a single name record. If used, each one of these elements must be mapped from the source file to a database field inside an InfoSphere MDM database record.
- Some fields, like date fields, can be represented in different ways (such as MMDDYYYY or YYYYMMDD), and the Derive Data and Create UNLs (mpxdata) job must be told how the field is formatted inside the customer extract.
A configuration file is written for each unique data extract file. Therefore, the configuration files are typically designed by using the Data Extract Guide that is provided to the customer. Configuration files define several elements, including the following.
- Attributes to field number in the extract file
- String cook options that remove leading and trailing data
- Methods that set the data type, like String or Date
- Field to field mapping for attribute parts, like Last Name
Typically, the configuration file is named after the data extract file and is in the InfoSphere MDM Workbench project directory.
These two examples illustrate the mapping between a pipe-delimited data extract file that contains actual data to be inserted into the operational server database, and the configuration file, which defines how the data elements for each record are mapped.
CountyHospitalPatients.txt
A|684645|F|LOCKART|ELDORA|C|1953-08-16|6231835023|6233545816|214232158
B|685244|M|TOUGAS|CYRUS|U|1897-09-08|4804104130|4804224618|226241361
A|684673|F|POLHEMUS|IVANA|L|1918-07-17|6231931923|6231826817|523831068
A|684827|M|STILLS|JUNIOR||1945-07-03|4801176228|4802517203|265662872
A|684350|F|PEPPERS|MARYETTA|P|1982-04-04|4809668324|4809041224|265171522
CountyHospitalPatients.cfg
MEMHEAD 1 1 NA SetString srcCode
MEMHEAD 1 2 BL SetString memidnum
SEX 1 3 BL SetString attrval
LGLNAME 1 4 BL SetString onmlast
LGLNAME 1 5 BL SetString onmfirst
LGLNAME 1 6 BL SetString onmmiddle
BIRTHDT 1 7 BL SetDate_Y4MD dateval
HOMEPHON 1 8 ZL SetString phnumber
WORKPHON 1 9 ZL SetString phnumber
SSN 1 10 NA SetString idissuer SSA
SSN 1 11 BL SetString idnumber
Attribute | # | Position | Transform | Assignment | Constant | Comment |
---|---|---|---|---|---|---|
## Member system information | ||||||
MEMHEAD | 1 | 1 | TR | SetString srcCode | ||
MEMHEAD | 1 | 6 | Z1 | SetString memIdnum | ||
FACILITY | 1 | 1 | TR | SetString attrVal | ||
PTENTRID | 1 | 1 | TR | SetString idIssue | ||
PTENTRID | 1 | 5 | ZR | SetString idNumber | ||
PTCHRTID | 1 | 1 | TR | SetString idIssuer | ## Patient Chart ID Issuer | |
PTCHRTID | 1 | 6 | ZR | SetString idNumber | ## Patient Chart ID Number | |
## Patient Information | ||||||
LGLNAME | 1 | 7 | TR | SetString onmLast | ## Patient Name | |
LGLNAME | 1 | 8 | TR | SetString onmFirst | ## Patient Name | |
LGLNAME | 1 | 9 | TR | SetString onmMiddle | ## Patient Name | |
LGLNAME | 1 | 10 | TR | SetString onmSuffix | ## Patient Name | |
BIRTHDT | 1 | 11 | NA | SetDate_ Y4MD dateVal | ## Patient Birthdate | |
SEX | 1 | 12 | TR | SetString attrVal | ## Patient Sex | |
RACE | 1 | 13 | TR | SetString attrVal | ## Patient Race | |
SSN | 1 | 0 | NA | SetString idIssuer | SSA | ## Patient SSN |
SSN | 1 | 14 | ZR | SetString idNumber | ## Patient SSN | |
HOMEADDR | 1 | 15 | TR | SetString stLine1 | ## Patient Home Address | |
HOMEADDR | 1 | 16 | TR | SetString city | ## Patient Home Address | |
HOMEADDR | 1 | 17 | TR | SetString state | ## Patient Home Address | |
HOMEADDR | 1 | 18 | TR | SetString zipcode | ## Patient Home Address | |
HOMEPHON | 1 | 19 | TR | SetString phNumber | ## Patient Phone Number | |
LANGUAGE | 1 | 20 | TR | SetString attrVal | ## Language | |
RELIGION | 1 | 21 | TR | SetString attrVal | ## Religion | |
GTNAME | 1 | 22 | NA | SetString onmLast | ## Guarantor Name(for last) | |
GTNAME | 1 | 23 | NA | SetString onmFirst | ## Guarantor Name(for first middle) | |
GTDOB | 1 | 24 | NA | SetDate_ Y4MD dateVal | ## Guarantor Birthdate | |
GTSSN | 1 | 0 | NA | SetString idIssuer | SSA | ## Guarantor SSN |
GTSSN | 1 | 25 | ZR | SetString idNumber | ## Guarantor SSN | |
GTHADDR | 1 | 26 | NA | SetString stLine1 | ## Guarantor Home Address | |
GTHADDR | 1 | 27 | NA | SetString city | ## Guarantor Home Address | |
GTHADDR | 1 | 28 | NA | SetString state | ## Guarantor Home Address | |
GTHADDR | 1 | 29 | NA | SetString zipcode | ## Guarantor Home Address | |
GTINSNM | 1 | 30 | TR | SetString attrVal | ## Guarantor Ins Name | |
GTINSPOL | 1 | 31 | TR | SetString attrVal | ## Guarantor Ins Policy Number | |
GTINSCOV | 1 | 32 | TR | SetString attrVal | ## Guarantor Ins Coverage | |
GTEMPNM | 1 | 33 | TR | SetString attrVal | ## Guarantor Employer Name | |
GTOCC | 1 | 34 | TR | SetString attrVal | ## Guarantor Occupation |
- Attribute
- This column is used to identify which data element inside the database you are going to populate. This name must match the attribute ATTRCODE inside the mpi_segattr table.
- # (ivar - instance variable number)
- This column defines how many of the same named data elements are in this customer record. For example, if the extract contains three phone numbers and inside the mpi_segattr table you have only one attribute that is defined as PHONE (rather than WRKPH, HOMEPH, CELLPH), you must increment the ivar column for each PHONE entry you have in the configuration file. Create three separate attributes inside mpi_segattr to handle the three phone types mentioned listed here.
- Position (offset)
- This column identifies the position of the field within the customer extract to insert into the MDM database record. An offset of 0 indicates that a constant value is inserted into this field, and not pulling the value from the extract. The example that is shown is inserting a constant value of SSA as the ID Issuer of the SSN.
- length
- If the Input File Format for the job is set to Fixed, then the
length in bytes of the field must be specified in the fourth column
position. If the Input File Format is set to Delimited, then remove
the length column from the configuration file. The following table
shows an example of a configuration file whose Input File Format is
fixed:
attrCode ivar offset Length strcook setData constant Comment ## Member system information MEM HEAD 1 0 0 NA SetString srcCode RMC MEM HEAD 1 1 6 Z1 SetString memIdnum LGL NAME 1 7 25 TR SetString onmLast LGL NAME 1 32 25 TR SetString onmFirst HOME ADDR 1 57 20 ZR SetString stLine1 HOME ADDR 1 77 20 TR SetString city HOME ADDR 1 97 2 TR SetString state HOME ADDR 1 99 5 TR SetString zipCode HOME ADDR 1 104 3 TR SetString phArea HOME ADDR 1 107 8 TR SetString phNumber - Transform (strcook)
- This column is the optional edit method or cook method that you
can apply to the customer data element before it is inserted into
the MDM database. These methods are basic, and can be used to remove
blanks or zeros from the right or left side of the data element if
necessary. If you run mpxdata from a command line, the strcook and
setData methods can be displayed by using the -methods option.
The following list describes the available transform options.
- NA - do nothing
- TR - trim leading and trailing blanks, allow empty string
- BL - trim blanks from the left, allow empty string
- ZL - trim zeros from the left, allow empty string
- B1 - trim blanks from the left, leave at most 1
- Z1 - trim zeros from the left, leave at most 1
- BR - trim blanks from the right, allow empty string
- ZR - trim zeros from the right, allow empty string
- ZX - trim only if all zeros, creating empty string
- Assignment (setData)
- This column is the database method that is used to populate the MDM record with the customer data. Each data type has its own set of setData methods. You must take care to select the correct one for the data element that you are inserting. If you run mpxdata from a command line, the cook and setData methods can be displayed by using the -methods option. The following lists the available setData methods.
- SetString
- SetNumber
- SetDate_MDY4
- SetDate_MDY2
- SetDate_MDCY2
- SetDate_Y4MD
- SetDate_Y2MD
- SetDate_PMDY
- SetDate_PYMD
- SetDate_DMY4
The SetDate* format methods work on date input data, but the destination in the database is expected to be a String (character field).
The setData method is followed by the column name into which you want to insert the data.
- Constant
- This field is optional. If present, this setting represents a constant value that is inserted into the MDM database for the column indicated with setData. It does not physically appear in the record, but behaves as though it did. If used, set the offset (and length, if applicable) to 0. For example:
SSN 1 0 NA SetString idIssuer SSA ## Patient SSN
The string SSA is written to the idIssuer column of the SSN attribute. You can also use the constant to set the asaIdxNo field. You can also use the constant to set the asaIdxNo field, as shown in this example.
attrCode ivar offset strcook setData constant Comment NAME 1 0 NA SetString asaIdxNo 1 NAME 1 5 TR SetString onmFirst NAME 1 6 TR SetString onmMiddle NAME 1 7 TR SetString onmLast NAME 2 0 NA SetString asaIdxNo 2 NAME 2 8 TR SetString onmFirst NAME 2 9 TR SetString onmMiddle NAME 2 10 TR SetString onmLast - Comments
- If you want to add comments to your configuration file, open and
edit the file in a text editor. Place any comment that you want to
make about a field at the end of the row and precede the comment text
by a hash (pound) symbol. This comment is not saved in the MDM database.
# Patient Information SSN 1 0 NA SetString idNumber ## Patient SSN