Lookup privacy providers

The lookup privacy providers mask data by using values that are selected from a "lookup" table. The lookup privacy provider and hash lookup privacy provider select values that are based on the source value. The random lookup privacy provider selects random values without regard to the source value.

Certain type of data such as names, addresses, etc., cannot be generated by using arithmetical logic that is used by many of the privacy providers. When this type of data needs to be masked, a similar set of replacement data, such as a lookup table, is required.

Lookup table data is normally stored as a set of rows in a table with a key column. The lookup table fields with replacement data must have names that match the source table fields that receive the replacement data. The lookup table field and the corresponding source table field must also have similar data types.

The providers can mask data in multiple fields. The providers can also keep null, space, and zero-length values in selected source tables, instead of using the lookup value.

Lookup enhancement to support DEST parameter

Starting from IBM InfoSphere Optim Version 11.3.0.4 and beyond, the presence of DEST parameter enables the new behavior and its absence uses old behavior.

Old behavior:
  1. Parameters SEARCH, REPLACE, PRExxx, FLDDEFn take lookup table field names.

LOOKUP: REPLACE, SEARCH, PRExxx

HASH_LOOKUP: REPLACE, PRExxx

RANDOM_LOOKUP: REPLACE, PRExxx

FLDDEFn are required for column names for both source and lookup tables.

New behavior:
  1. SOURCE, DEST, PRExxx and FLDDEFn take field names from the source table.
  2. SEARCH and REPLACE take field names from the lookup table.
  3. There is a 1-1 mapping between the field names in DEST and REPLACE.
  4. For plain Lookup, there is a 1-1 mapping between the field names in SOURCE and SEARCH. SOURCE must be used along with DEST. If DEST is not specified SOURCE and SEARCH must not be specified together.
  5. LUA optimmask() data arguments count must match the number of FLDDEFn (in other words, this is the count of SOURCE and DEST fields, irrespective of whether PRExxx is specified or not).

Hash Lookup

Lookup Table

Lookup table
Lookup table

Single source field, multiple destination fields

Table 1. Source
CUSTID FIRSTNAME LASTNAME
10005 Tim Richards
10003 <null> Bond
10002 Jenny <10 Spaces>
10004 Sarah Reeves
10001 Veronica Parker

Masked with preserve

Old Syntax:
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",REPLACE="L_FNAME,L_LNAME",id="LKP_CUSTOMERS_1",PRENULL="L_FNAME",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="CUSTID",DT=char),FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
New Syntax:
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",DEST="FIRSTNAME,LASTNAME",REPLACE="L_FNAME,L_LNAME",
id="LKP_CUSTOMERS_1",PRENULL="FIRSTNAME",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="CUSTID",DT=char),FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),
FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
Lua:
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_fname_value, mask_lname_value = optimmask(custid_val, firstname_val, lastname_val,<param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTNAME”, mask_lname_value)
Table 2. Masked with Preserve
CUSTID FIRSTNAME LASTNAME
10005 Robert Olson
10003 <null> Dunn
10002 Michael York
10004 Allen Perl
10001 Joe Jusino

Masked without preserve

Old Syntax:
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",REPLACE="L_FNAME,
L_LNAME",id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,FLDDEF1=(NAME="CUSTID",DT=char),
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
New Syntax:
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",DEST="FIRSTNAME,LASTNAME",REPLACE="L_FNAME,L_LNAME",
id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="CUSTID",DT=char),
FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),
FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
Lua:
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_fname_value, mask_lname_value = optimmask(custid_val,firstname_val, lastname_val,<param_string>) 
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTNAME”, mask_lname_value)
Table 3. Masked without preserve
CUSTID FIRSTNAME LASTNAME
10005 Robert Olson
10003 Rene Dunn
10002 Michael York
10004 Allen Perl
10001 Joe Jusino

Hash Source and destination field is the same (Example: FIRSTNAME)

Masked without preserve

Old Syntax (2 separate FLDDEFn are required):
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",
REPLACE="L_FNAME",id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz), 
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz)
New Syntax:
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",DEST="FIRSTNAME",REPLACE="L_FNAME",
id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz)
Lua:
firstname_val = source.column.getvalue(“FIRSTNAME”)
mask_fname_value = optimmask(firstname_val, <param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
Table 4. Source
FIRSTNAME
Tim
<null>
Jenny
Sarah
Veronica
Table 5. Masked
FIRSTNAME
Joe
r_null
Rene
Joe
Allen

Masked with preserve

Old Syntax (2 separate FLDDEFn are required):
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",REPLACE="L_FNAME", 
PRENULL="L_FNAME",id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz), FLDDEF2=(NAME="L_FNAME",DT=varchar_sz)
New Syntax:
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",DEST="FIRSTNAME",
REPLACE="L_FNAME",PRENULL="FIRSTNAME",id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,
pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz)
Lua:
firstname_val = source.column.getvalue(“FIRSTNAME”)
mask_fname_value = optimmask(firstname_val, <param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
Table 6. Source
FIRSTNAME
Tim
<null>
Jenny
Sarah
Veronica
Table 7. Masked
FIRSTNAME
Joe
<null>
Rene
Joe
Allen

Plain Lookup

Lookup Table

Lookup table - Plain lookup

Multiple search fields, single destination field

Source

Table 8. Source
CUSTID FIRSTNAME LASTNAME
10005 Joe Richards
10003 Allen Bond
10002 Rene <10 spaces >
10004 Robert Reeves
10001 Michael Parker

Masked with preserve

Old Syntax:
PRO=LOOKUP,SEARCH="L_CUSTID,L_FNAME",REPLACE="L_LNAME",
OPERATOR=AND,PRESPACES="L_LNAME",
id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="L_CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
New Syntax:
PRO=LOOKUP,SOURCE="CUSTID,FIRSTNAME",SEARCH="L_CUSTID,L_FNAME",DEST="LASTNAME",REPLACE="L_LNAME",
OPERATOR=AND,PRESPACES="LASTNAME",id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),
FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
Lua:
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_lname_value = optimmask(custid_val,firstname_val, lastname_val,<param_string>)
target.column.setvalue(“LASTNAME”, mask_lname_value) 
Table 9. Masked with Preserve
CUSTID FIRSTNAME LASTNAME
10005 Joe Jusino
10003 Allen Perl
10002 Rene <10 spaces>
10004 Robert Olson
10001 Michael York

Masked without preserve

Old Syntax:

PRO=LOOKUP,SEARCH="L_CUSTID,L_FNAME",REPLACE="L_LNAME",
OPERATOR=AND, id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="L_CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
New Syntax:
PRO=LOOKUP,SOURCE="CUSTID,FIRSTNAME",SEARCH="L_CUSTID,L_FNAME",DEST="LASTNAME",
REPLACE="L_LNAME",OPERATOR=AND,id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
Lua:
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_lname_value = optimmask(custid_val,firstname_val, lastname_val,<param_string>) 
target.column.setvalue(“LASTNAME”, mask_lname_value)
Table 10. Masked without Preserve
CUSTID FIRSTNAME LASTNAME
1005 Joe Jusino
1003 Allen Perl
1002 Rene Dunn
1004 Robert Olson
10001 Michael York

Random Lookup

Lookup Table

Lookup table - Random lookup

Multiple destination fields

Source

Table 11. Source
FIRSTNAME LASTNAME
Tim Richards
<null> Bond
Jenny <10 spaces>
Sarah Reeves
Veronica Parker

Masked with preserve

Old Syntax:
PRO=RANDOM_LOOKUP,REPLACE="L_FNAME,
L_LNAME",id="LKP_CUSTOMERS_1",PRENULL="L_FNAME",
lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF2=(NAME="L_LNAME",DT=varchar_sz)
New Syntax:
PRO=RANDOM_LOOKUP,DEST="FIRSTNAME,LASTNAME",REPLACE="L_FNAME,L_LNAME",id="LKP_CUSTOMERS_1",
PRENULL="FIRSTNAME",lib=ORACLE,user=noel,
pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz),FLDDEF2=(NAME="LASTNAME",DT=varchar_sz)
Lua:
firstname_val = source.column.getvalue(“FIRSTNAME”)
laststname_val = source.column.getvalue(“LASTNAME”)
mask_fname_value, 
mask_lname_value = optimmask(firstname_val, 
lastname_val,<param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTSTNAME”, mask_lname_value)

The output changes on each run:

Source

Table 12. Source
FIRSTNAME LASTNAME
Tim Richards
<null> Bond
Jenny <10 Spaces>
Sarah Reeves
Veronica Parker
Table 13. Masked
FIRSTNAME LASTNAME
Joe Jusino
<null> York
Rene Dunn
r_zerolen r_zerolen
Allen Perl

Masked without preserve

Old Syntax:
PRO=RANDOM_LOOKUP,REPLACE="L_FNAME,L_LNAME",
id="LKP_CUSTOMERS_1",PRENULL="L_FNAME",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF2=(NAME="L_LNAME",DT=varchar_sz)
New Syntax:
PRO=RANDOM_LOOKUP,DEST="FIRSTNAME,LASTNAME",
REPLACE="L_FNAME,L_LNAME",id="LKP_CUSTOMERS_1",
PRENULL="FIRSTNAME",lib=ORACLE,user=noel,
pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",
DT=varchar_sz),FLDDEF2=(NAME="LASTNAME",DT=varchar_sz)

Lua:

firstname_val = source.column.getvalue(&ldquo;FIRSTNAME&rdquo;)
laststname_val = source.column.getvalue(&ldquo;LASTNAME&rdquo;)
mask_fname_value, mask_lname_value = optimmask(firstname_val,lastname_val,<param_string>) 
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTSTNAME”, mask_lname_value)

The output changes on each run:

Table 14. Source
FIRSTNAME LASTNAME
Tim Richards
<null> Bond
Jenny <10 Spaces>
Sarah Reeves
Veronica Parker
Table 15. Masked
FIRSTNAME LASTNAME
Michael York
Rene Dunn
Robert Olson
Rene Dunn
r_null r_null

Appendix

When ODPP is used in Lua, PRESERVE options are easily implemented by Lua script even with the old syntax. See the following sample:
function is_spaces(str)
    str1 = string.gsub(str, " ", "")
    if #str1 == 0 then
        return true
    else
        return false
    end
end

function cm_transform()
    -- optimmask() returns mask values without PRExxxx options.
    value1 = source.column.getvalue("COL_CH1")
    mask2, mask2 = optimmask(value1, 'PRO=HASH_LOOKUP, FLDDEF1=(NAME="COL_CH1",DT=WCHAR),
                                      SRC="COL_CH1",
                                      HASHFLD="SEQ",
                                      REPLACE="COL_CH2, COL_VC3",
                                      ...')
    -- If COL_CH2 is NULL or SPACES, copy it to mask2.
    if value2 == nil then
        mask2 = value2
    elseif is_spaces(value2) == true then
        mask2 = value2
    end

    -- If COL_VC3 is NULL or ZEROLENGTH, copy it to mask3.
    if value3 == nil then
        mask3 = value3
    elseif #value3 == 0 then
        mask3 = value3
    end
 
    target.column.setvalue("COL_CH2", mask2)
    target.column.setvalue("COL_CH3", mask3)
end