Lookup privacy providers
The lookup privacy providers mask data by using values that are selected from a "lookup" table. The lookup privacy provider and hash lookup privacy provider select values that are based on the source value. The random lookup privacy provider selects random values without regard to the source value.
Certain type of data such as names, addresses, etc., cannot be generated by using arithmetical logic that is used by many of the privacy providers. When this type of data needs to be masked, a similar set of replacement data, such as a lookup table, is required.
Lookup table data is normally stored as a set of rows in a table with a key column. The lookup table fields with replacement data must have names that match the source table fields that receive the replacement data. The lookup table field and the corresponding source table field must also have similar data types.
The providers can mask data in multiple fields. The providers can also keep null, space, and zero-length values in selected source tables, instead of using the lookup value.
Lookup enhancement to support DEST parameter
Starting from IBM InfoSphere Optim Version 11.3.0.4 and beyond, the presence of DEST parameter enables the new behavior and its absence uses old behavior.
- Parameters SEARCH, REPLACE, PRExxx, FLDDEFn take lookup table field names.
LOOKUP: REPLACE, SEARCH, PRExxx
HASH_LOOKUP: REPLACE, PRExxx
RANDOM_LOOKUP: REPLACE, PRExxx
FLDDEFn are required for column names for both source and lookup tables.
- SOURCE, DEST, PRExxx and FLDDEFn take field names from the source table.
- SEARCH and REPLACE take field names from the lookup table.
- There is a 1-1 mapping between the field names in DEST and REPLACE.
- For plain Lookup, there is a 1-1 mapping between the field names in SOURCE and SEARCH. SOURCE must be used along with DEST. If DEST is not specified SOURCE and SEARCH must not be specified together.
- LUA optimmask() data arguments count must match the number of FLDDEFn (in other words, this is the count of SOURCE and DEST fields, irrespective of whether PRExxx is specified or not).
Hash Lookup
Lookup Table
Single source field, multiple destination fields
CUSTID | FIRSTNAME | LASTNAME |
---|---|---|
10005 | Tim | Richards |
10003 | <null> | Bond |
10002 | Jenny | <10 Spaces> |
10004 | Sarah | Reeves |
10001 | Veronica | Parker |
Masked with preserve
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",REPLACE="L_FNAME,L_LNAME",id="LKP_CUSTOMERS_1",PRENULL="L_FNAME",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="CUSTID",DT=char),FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",DEST="FIRSTNAME,LASTNAME",REPLACE="L_FNAME,L_LNAME",
id="LKP_CUSTOMERS_1",PRENULL="FIRSTNAME",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="CUSTID",DT=char),FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),
FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_fname_value, mask_lname_value = optimmask(custid_val, firstname_val, lastname_val,<param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTNAME”, mask_lname_value)
CUSTID | FIRSTNAME | LASTNAME |
---|---|---|
10005 | Robert | Olson |
10003 | <null> | Dunn |
10002 | Michael | York |
10004 | Allen | Perl |
10001 | Joe | Jusino |
Masked without preserve
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",REPLACE="L_FNAME,
L_LNAME",id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,FLDDEF1=(NAME="CUSTID",DT=char),
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="CUSTID",DEST="FIRSTNAME,LASTNAME",REPLACE="L_FNAME,L_LNAME",
id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="CUSTID",DT=char),
FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),
FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_fname_value, mask_lname_value = optimmask(custid_val,firstname_val, lastname_val,<param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTNAME”, mask_lname_value)
CUSTID | FIRSTNAME | LASTNAME |
---|---|---|
10005 | Robert | Olson |
10003 | Rene | Dunn |
10002 | Michael | York |
10004 | Allen | Perl |
10001 | Joe | Jusino |
Hash Source and destination field is the same (Example: FIRSTNAME)
Masked without preserve
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",
REPLACE="L_FNAME",id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz),
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz)
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",DEST="FIRSTNAME",REPLACE="L_FNAME",
id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz)
firstname_val = source.column.getvalue(“FIRSTNAME”)
mask_fname_value = optimmask(firstname_val, <param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
FIRSTNAME |
---|
Tim |
<null> |
Jenny |
Sarah |
Veronica |
FIRSTNAME |
---|
Joe |
r_null |
Rene |
Joe |
Allen |
Masked with preserve
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",REPLACE="L_FNAME",
PRENULL="L_FNAME",id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz), FLDDEF2=(NAME="L_FNAME",DT=varchar_sz)
PRO=HASH_LOOKUP,HASHFLD="L_SEQ",SOURCE="FIRSTNAME",DEST="FIRSTNAME",
REPLACE="L_FNAME",PRENULL="FIRSTNAME",id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,
pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz)
firstname_val = source.column.getvalue(“FIRSTNAME”)
mask_fname_value = optimmask(firstname_val, <param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
FIRSTNAME |
---|
Tim |
<null> |
Jenny |
Sarah |
Veronica |
FIRSTNAME |
---|
Joe |
<null> |
Rene |
Joe |
Allen |
Plain Lookup
Lookup Table
Multiple search fields, single destination field
Source
CUSTID | FIRSTNAME | LASTNAME |
---|---|---|
10005 | Joe | Richards |
10003 | Allen | Bond |
10002 | Rene | <10 spaces > |
10004 | Robert | Reeves |
10001 | Michael | Parker |
Masked with preserve
PRO=LOOKUP,SEARCH="L_CUSTID,L_FNAME",REPLACE="L_LNAME",
OPERATOR=AND,PRESPACES="L_LNAME",
id="LKP_CUSTOMERS_1",lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="L_CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
PRO=LOOKUP,SOURCE="CUSTID,FIRSTNAME",SEARCH="L_CUSTID,L_FNAME",DEST="LASTNAME",REPLACE="L_LNAME",
OPERATOR=AND,PRESPACES="LASTNAME",id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),
FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_lname_value = optimmask(custid_val,firstname_val, lastname_val,<param_string>)
target.column.setvalue(“LASTNAME”, mask_lname_value)
CUSTID | FIRSTNAME | LASTNAME |
---|---|---|
10005 | Joe | Jusino |
10003 | Allen | Perl |
10002 | Rene | <10 spaces> |
10004 | Robert | Olson |
10001 | Michael | York |
Masked without preserve
Old Syntax:
PRO=LOOKUP,SEARCH="L_CUSTID,L_FNAME",REPLACE="L_LNAME",
OPERATOR=AND, id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="L_CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF3=(NAME="L_LNAME",DT=varchar_sz)
PRO=LOOKUP,SOURCE="CUSTID,FIRSTNAME",SEARCH="L_CUSTID,L_FNAME",DEST="LASTNAME",
REPLACE="L_LNAME",OPERATOR=AND,id="LKP_CUSTOMERS_1",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="CUSTID",DT=varchar_sz),
FLDDEF2=(NAME="FIRSTNAME",DT=varchar_sz),FLDDEF3=(NAME="LASTNAME",DT=varchar_sz)
custid_val = source.column.getvalue(“CUSTID”)
firstname_val = source.column.getvalue(“FIRSTNAME”)
lastname_val = source.column.getvalue(“LASTNAME”)
mask_lname_value = optimmask(custid_val,firstname_val, lastname_val,<param_string>)
target.column.setvalue(“LASTNAME”, mask_lname_value)
CUSTID | FIRSTNAME | LASTNAME |
---|---|---|
1005 | Joe | Jusino |
1003 | Allen | Perl |
1002 | Rene | Dunn |
1004 | Robert | Olson |
10001 | Michael | York |
Random Lookup
Lookup Table
Multiple destination fields
Source
FIRSTNAME | LASTNAME |
---|---|
Tim | Richards |
<null> | Bond |
Jenny | <10 spaces> |
Sarah | Reeves |
Veronica | Parker |
Masked with preserve
PRO=RANDOM_LOOKUP,REPLACE="L_FNAME,
L_LNAME",id="LKP_CUSTOMERS_1",PRENULL="L_FNAME",
lib=ORACLE,user=noel,pass=noel,
conn=ORCL,FLDDEF1=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF2=(NAME="L_LNAME",DT=varchar_sz)
PRO=RANDOM_LOOKUP,DEST="FIRSTNAME,LASTNAME",REPLACE="L_FNAME,L_LNAME",id="LKP_CUSTOMERS_1",
PRENULL="FIRSTNAME",lib=ORACLE,user=noel,
pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",DT=varchar_sz),FLDDEF2=(NAME="LASTNAME",DT=varchar_sz)
firstname_val = source.column.getvalue(“FIRSTNAME”)
laststname_val = source.column.getvalue(“LASTNAME”)
mask_fname_value,
mask_lname_value = optimmask(firstname_val,
lastname_val,<param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTSTNAME”, mask_lname_value)
The output changes on each run:
Source
FIRSTNAME | LASTNAME |
---|---|
Tim | Richards |
<null> | Bond |
Jenny | <10 Spaces> |
Sarah | Reeves |
Veronica | Parker |
FIRSTNAME | LASTNAME |
---|---|
Joe | Jusino |
<null> | York |
Rene | Dunn |
r_zerolen | r_zerolen |
Allen | Perl |
Masked without preserve
PRO=RANDOM_LOOKUP,REPLACE="L_FNAME,L_LNAME",
id="LKP_CUSTOMERS_1",PRENULL="L_FNAME",
lib=ORACLE,user=noel,pass=noel,conn=ORCL,
FLDDEF1=(NAME="L_FNAME",DT=varchar_sz),
FLDDEF2=(NAME="L_LNAME",DT=varchar_sz)
PRO=RANDOM_LOOKUP,DEST="FIRSTNAME,LASTNAME",
REPLACE="L_FNAME,L_LNAME",id="LKP_CUSTOMERS_1",
PRENULL="FIRSTNAME",lib=ORACLE,user=noel,
pass=noel,conn=ORCL,FLDDEF1=(NAME="FIRSTNAME",
DT=varchar_sz),FLDDEF2=(NAME="LASTNAME",DT=varchar_sz)
Lua:
firstname_val = source.column.getvalue(“FIRSTNAME”)
laststname_val = source.column.getvalue(“LASTNAME”)
mask_fname_value, mask_lname_value = optimmask(firstname_val,lastname_val,<param_string>)
target.column.setvalue(“FIRSTNAME”, mask_fname_value)
target.column.setvalue(“LASTSTNAME”, mask_lname_value)
The output changes on each run:
FIRSTNAME | LASTNAME |
---|---|
Tim | Richards |
<null> | Bond |
Jenny | <10 Spaces> |
Sarah | Reeves |
Veronica | Parker |
FIRSTNAME | LASTNAME |
---|---|
Michael | York |
Rene | Dunn |
Robert | Olson |
Rene | Dunn |
r_null | r_null |
Appendix
function is_spaces(str)
str1 = string.gsub(str, " ", "")
if #str1 == 0 then
return true
else
return false
end
end
function cm_transform()
-- optimmask() returns mask values without PRExxxx options.
value1 = source.column.getvalue("COL_CH1")
mask2, mask2 = optimmask(value1, 'PRO=HASH_LOOKUP, FLDDEF1=(NAME="COL_CH1",DT=WCHAR),
SRC="COL_CH1",
HASHFLD="SEQ",
REPLACE="COL_CH2, COL_VC3",
...')
-- If COL_CH2 is NULL or SPACES, copy it to mask2.
if value2 == nil then
mask2 = value2
elseif is_spaces(value2) == true then
mask2 = value2
end
-- If COL_VC3 is NULL or ZEROLENGTH, copy it to mask3.
if value3 == nil then
mask3 = value3
elseif #value3 == 0 then
mask3 = value3
end
target.column.setvalue("COL_CH2", mask2)
target.column.setvalue("COL_CH3", mask3)
end