Lookup privacy provider
Use the lookup privacy provider to obtain replacement values from one or more lookup table fields, according to the value in one or more source table fields.
The lookup privacy provider can select lookup table values based on values from single or multiple source table fields. The single field lookup searches a lookup table for a source field value, and if the value is found, the provider inserts a corresponding lookup table value into the destination field. The multiple field lookup searches rows in the lookup table for values from multiple source fields, and if the values are found, the provider inserts values from the lookup table into destination fields. Multiple field lookup can use an AND or OR logical operator to match source values, allowing the provider to use lookup values if all source values match (AND operator) or use lookup values if any source values match (OR operator).
If a matching lookup value is not available, the provider can preserve the source value or generate an error.
The provider can maintain a cache of lookup values, which can improve performance but requires extra memory or disk space.
Examples
- Single source values
The following example searches the empnonum field in the lookup table for a row with a value that matches the empnonum source field. If a row with matching data is found, the provider uses replacement values from the hiredate and salary fields in the emp_lkp lookup table to mask the hiredate and salary fields in the source data.
pro=lkp, srch="empnonum”, rep="hiredate, salary", lib=db2luw, id="emp_lkp", conn=odpplkpu, user=testuser, pass=testpass, flddef1=(name=empnonum, dt=int), flddef2=(name=hiredate, dt=date), flddef3=(name=salary, dt=char)
This example uses the following parameters:
- SRCH="empnonum"
- This parameter specifies the source table field (empnonum) to use for search values. If the empnonum field in the lookup table has a matching value, the provider uses replacement values from the lookup table row with the matching value.
- REP=" hiredate, salary"
- This parameter specifies the lookup table fields (hiredate and salary) that provides replacement values for the hiredate and salary fields in the source table.
- ID="emp_lkp"
- This parameter specifies the name of the lookup table, emp_lkp.
- LIB="db2luw"
- This parameter specifies the DBMS of the lookup table, DB2® for Linux®, UNIX, and Windows.
- Multiple source values
The following example searches the empnonum and firstname fields in the lookup table for a row with values that match the empnonum and firstname source fields. If a row with matching data is found, the provider uses replacement values from the lastname, hiredate, and salary fields in the emp_lkp lookup table to mask the lastname, hiredate, and salary fields in the source data.
pro=lkp, srch="empnonum, firstname”, rep="lastname, hiredate, salary", lib=db2luw, id="emp_lkp", conn=odpplkpu, user=testuser, pass=testpass, flddef1=(name=empnonum, dt=int), flddef2=(name=firstname, dt=char), flddef3=(name=lastname, dt=char), flddef4=(name=hiredate, dt=date), flddef5=(name=salary, dt=char)
This example uses the following parameters:
- SRCH="empnonum, firstname"
- This parameter specifies the source table fields (empnonum and firstname) to use for search values. If the empnonum and firstname fields in the lookup table have matching values, the provider uses replacement values from the lookup table row with the matching values.
- REP="lastname, hiredate, salary"
- This parameter specifies the lookup table fields (lastname, hiredate, and salary) that provides replacement values for the lastname, hiredate, and salary fields in the source table.
- ID="emp_lkp"
- This parameter specifies the name of the lookup table, emp_lkp.
- LIB="db2luw"
- This parameter specifies the DBMS of the lookup table, DB2 for Linux, UNIX, and Windows.
Syntax
The lookup privacy provider uses the following syntax:
- Masking parameters
PROVIDER = LOOKUP ,
- Source parameters
SEARCH = “srchfld1-name , srchfldn-name“ , [ SRC_DATE_FORMAT = “src-date-format-expression“ ] , [ PRESERVENULL = “fld1-name , fldn-name“ ] , [ PRESERVESPACES = “fld1-name , fldn-name“ ] , [ PRESERVEZEROLENGTH = “fld1-name , fldn-name“
- Lookup parameters
REPLACE = “fld1-name , fldn-name“ , [ OPERATOR = { OR | AND } ] , [ WHENNOTFOUND = { PRESERVE | ERROR } ] ,
- Cache parameters
[ CACHE = { MEMORY | MEMFILE | N } ] , [ CACHESIZE = cache-size-integer ] , [ TMPDIR = “ temp-directory-location ] “ ,
- Connection parameters
ID = identifier-string , LIB = { DB2LUW | DB2ZOS | ORACLE | lds-library-name } , USER = user-name , PASS = password , CONN = connection-string ,
- Processing parameters
[ DISCARDLIMIT = discard-limit-value ] ,
- Data definition parameters
FLDDEFn = ( NAME = field-name, DATATYPE = datatype-value, [ PRECISION = field-precision-value ], [ SCALE = field-scale-value ], [ LENGTH = field-length-value ], [ CODEPAGE = codepage-value ], [ CPTYPE = { DB2ZOS |DB2LUW | ORACLE |SYBASE |ODBC | INFORMIX |NETEZZA |SQLSERVER |TERADATA |ANY |NONE } ] ) , [ CODEPAGE = codepage-value ] , [ CPTYPE = { DB2ZOS | DB2LUW | ORACLE | SYBASE | ODBC | INFORMIX | NETEZZA | SQLSERVER | TERADATA | ANY | NONE } ]
Masking parameters
Parameters that determine how to mask data.
- PROVIDER (or PRO)
- Required. Enter the provider name, LKP (or LOOKUP).Note: The PRO parameter must be first in the masking string. All other parameters can appear in any order.
Source parameters
Parameters for managing source values.
- SEARCH (or SRCH)
- Required. Specifies the names of the source table fields that
contain the search values.
Use commas to separate a list of source field names. If a field name has spaces or commas, enclose the name in double quotation marks. If a field name has a double quotation mark character, precede the double quotation mark character with a backslash (\) escape character.
- PRESERVENULL (or PRENULL)
- Specifies source fields in which NULL values are preserved and
not replaced with lookup table values.
Use commas to separate a list of source field names. If a field name has spaces or commas, enclose the name in double quotation marks. If a field name has a double quotation mark character, precede the double quotation mark character with a backslash (\) escape character.
- PRESERVESPACES (or PRESPACES)
- Specifies source fields in which blank values are preserved and
not replaced with lookup table values.
Use commas to separate a list of source field names. If a field name has spaces or commas, enclose the name in double quotation marks. If a field name has a double quotation mark character, precede the double quotation mark character with a backslash (\) escape character.
- PRESERVEZEROLENGTH (or PREZL)
- Specifies source fields in which zero length values are preserved
and not replaced with lookup table values.
Use commas to separate a list of source field names. If a field name has spaces or commas, enclose the name in double quotation marks. If a field name has a double quotation mark character, precede the double quotation mark character with a backslash (\) escape character.
- SRC_DATE_FORMAT (or SRCDF)
- Specifies, within enclosing double quotation marks, the format
of the date string values in the source field.Use any of the following format specifiers for formatting the date-time strings. All specifiers start with a percent (%) sign. If SRCDF is not specified, the provider uses the following default date format: %DD/%MM/%YYYY %HH24:%MI:%SS:%FFFFFF.
- Year
- %YYYY
- %YY
- Month
- %MONTH
- %MMM
- %MM
- Day
- %DD
- Time
- %HH
- %MI
- %SS
- %HH24
- Fraction of a second
- %FFFFFF
- %FFFFF
- %FFFF
- %FFF
- %FF
- %F
When converting a timestamp or time value in a string format to a privacy provider format, use as many fractions of a second as possible. The provider will automatically handle the fractions if the source value fractions are less than what is specified by SRCDF. Formatting fractions in this manner is useful when want to use the same SRCDF for multiple values with fractions that vary in precision.
- AM/PM
- %PM
For example, to format a date string as four-digit year, three-character month, two-digit date with dash-type (-) separators, the format string would be “%YYYY-%MMM-%DD”.
Lua column map procedures read the date/time values as a string and follows the system defined date/time format. To use SRCDF in LUA column map procedures, set SCRDF to the system date/time format. In case the format is not clear, write a small LUA column map procedure to print the value of date/time column and use the output to prepare a SRCDF. Please note that this also makes the LUA column map procedure dependent on the system format on which they are run, so the same procedure might require a change in SRCDF if the procedure is run on a system with different date/time format.
Lookup parameters
Parameters for managing lookup table values.
- REPLACE (or REP)
- Required. Specifies the replacement fields in the lookup table.
Use commas to separate a list of replacement field names. If a field name has spaces or commas, enclose the name in double quotation marks. If a field name has a double quotation mark character, precede the double quotation mark character with a backslash (\) escape character.
- OPERATOR (or OP)
- For multiple column lookup only. Specifies options for matching
source table and lookup table fields. Enter one of the following options:
- AND
- Default. Specifies that the masking process compares the source field values with the lookup field values, and if all the values match, the process uses the lookup values.
- OR
- Specifies that the masking process compares the source field values with the lookup field values, and if any of the values match, the process uses the lookup values.
- WHENNOTFOUND (or WHENNF)
- Specifies how to process search field values with no matches in
the lookup table. Enter one of the following options:
- PRESERVE (or PRE)
- If the source field values value are not found in the lookup table, the source field values are preserved.
- ERROR (or ERR)
- Default. If the source field values value are not found in the lookup table, an error occurs.
Cache parameters
Parameters for managing a cache of lookup values.
- CACHE
- Determines whether to maintain a cache of the replacement values
during processing. Using a cache is faster when the masking process
retrieves a replacement value many times but requires extra memory
and optionally, disk space. Note: It is recommended to specify the length of all character search fields. The length is used to allocate internal data buffers when caching is enabled.
Enter one of the following options:
- MEMFILE
- Default. Specifies that the cache size is determined by the CACHESIZE value, but when the cache threshold is reached, further rows are written to a file cache. By default, this file is written to the platform temporary-type directory. On Windows, this directory is the system-configured temp-type directory. On UNIX and Linux, this directory is the /tmp directory. To specify the memory file location, use TMPDIR. The file cache file name is in the format: ODPP.Pxxxxxxxx.Tyyyyyyyy.lkp where xxxxxxxx and yyyyyyyy are hex values.
- MEMORY (or MEM)
- Specifies that the cache size is determined by the CACHESIZE value.
- N
- Specifies that caching is disabled.
- CACHESIZE
- Specifies the size of the in-memory cache in KB. This size represents only the in-memory size of the cache. If CACHE=MEMFILE , the cache is limited only by the operating system file limitations. Specify a value (in KB) in the range 1 - 2,147,483,647 (2 GB). The default value is 32 KB.
- TMPDIR
- Specifies the location of the cache file when CACHE=MEMFILE.
Connection parameters
Parameters for connecting to a lookup table.
- ID
- Required. Specifies the name of the lookup table.
For DB2 for Linux, UNIX, and Windows and DB2 for z/OS®, specify the schema-name and table-name. Use the format schema.table.
- LIB
- Specifies the name of the lookup data source (LDS) library. Enter
one of the following options:
- DB2LUW
- DB2 for Linux, UNIX, and Windows
- DB2ZOS
- DB2 for z/OS
- MSS
- (Release 11.3.0.5 and later) Microsoft SQL Server
- ORACLE
- Oracle
- lds-library-name
- The name of a user-written LDS library.
- USER
- Required. The user name that is used to access the lookup table database.
- PASS
- Required. The password for the user name that is used to access
the lookup table database.
Use double quotation marks to enclose passwords with special characters. For example, "ABC$".
- CONN
- Required. The connection string for the lookup table database.
Processing parameters
Parameters for managing provider processes.
- DISCARDLIMIT (or DLIM)
- Specifies the number of failed rows to discard before the provider stops processing.
Data definition parameters
Parameters for defining source and target data. For further information see, supported data types.
- FLDDEFn
- Required. Specifies the attributes of input values to use for processing. See Field definition parameter.
- CODEPAGE (or CP)
- An integer value that specifies the codepage or character-set identifier of the source fields. The default is UTF-8. The CP parameter within the FLDDEFn parameter overrides this value.
- CPTYPE (or CPT)
- The codepage type of the source fields. The CPT parameter within the FLDDEFn
parameter overrides this value.
When the origin of the data is DBMS-specific but not tied to any one DBMS, specify the value as ANY. When the origin of the data is from a non-DBMS source, specify the value as NONE. As there are no DBMS-specific code pages for Netezza®, a specification of NONE is implied when Netezza is specified.
Enter one of the following values:
Value Description DBZ (or DB2zOS) DB2 for z/OS DB2 (or DB2LUW) DB2 for Linux, UNIX, and Windows IFX (or INFORMIX) Informix® MSS (or SQLSERVER) Microsoft SQL Server NZ or NETEZZA Netezza ODBC ODBC ORA (or ORACLE) Oracle SYB (or SYBASE) Sybase TD or TERADATA Teradata ANY Any DBMS NONE No DBMS
Supported data types
Data type | Description |
---|---|
CHAR | Fixed size character data that is left justified and space padded. |
WCHAR | Fixed size wide character data that is left justified and space padded. |
VARCHAR | Character data starting with a short integer value that indicates the length, in bytes, of the character data to follow. |
WVARCHAR | Wide character data starting with a short integer value that indicates the length, in bytes, of the wide character data to follow. |
VARCHAR_SZ | Character data string which is terminated by a NULL character. |
WVARCHAR_SZ | Wide character data string that is terminated by a NULL character. |
ODBC_DATE | A data type that is used when the source value is a ODPP_ODBC_DATE structure of the Optim™ data masking API. |
ODBC_TIME | A data type that is used when the source value is a ODPP_ODBC_TIME structure of the Optim data masking API. |
ODBC_TIMESTAMP | A data type that is used when the source value is a ODPP_ODBC_TIMESTAMP structure of the Optim data masking API. |
DATETIME_CHAR | A fixed size character data value that is left justified with space padded and contains date-time values. To format this data type value, the caller must supply a format string using parameter ODPP_OPR_SOURCE_DATE_FORMAT of the Optim data masking API. |
DATETIME_VARCHAR | Character data starting with a short integer value that indicates the length, in bytes, of the character data containing date-time values to follow. To format this data type value the caller must supply a format string using parameter ODPP_OPR_SOURCE_DATE_FORMAT of the Optim data masking API. |
DATETIME_WCHAR | Fixed size wide character data that is left justified with space padded and contains date-time values. To format this data type value the caller must supply a format string using parameter ODPP_OPR_SOURCE_DATE_FORMAT of the Optim data masking API. |
DATETIME_WVARCHAR | Wide character data starting with a short integer value that indicates the length, in bytes, of the wide character data containing date-time values to follow. To format this data type value the caller must supply a format string using parameter ODPP_OPR_SOURCE_DATE_FORMAT of the Optim data masking API. |
DATETIME_SZ | A character data string containing date-time values and is terminated by a NULL character. To format this data type value, the caller must supply a format string using parameter ODPP_OPR_SOURCE_DATE_FORMAT of the Optim data masking API. |
DATETIME_WSZ | A character data string containing date-time values and is terminated by a NULL character. To format this data type, value the caller must supply a format string using parameter ODPP_OPR_SOURCE_DATE_FORMAT of the Optim data masking API. |
DOUBLE | Double precision floating point number in the range 1.7E +/- 308 (15 digits). |
FLOAT | Floating point number in the range 3.4E +/- 38 (7 digits). |
LONG_LONG | An 8 byte signed numeric value in the range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. |
U_LONG_LONG | An 8 byte unsigned numeric value in the range 0 to 18,446,744,073,709,551,615. |
INTEGER | A 4 byte signed integer value in the range -2,147,483,648 to 2,147,483,647. |
U_INTEGER | A 4 byte unsigned integer value in the range 0 to 4,294,967,295. |
SMALLINT | A 2 byte signed integer value in the range -32,768 to 32,767. |
U_SMALLINT | A 2 byte unsigned integer value in the range 0 to 65,535. |
TINYINT | A single byte signed integer value in the range -128 to 127. |
U_TINYINT | A single byte unsigned integer value in the range 0 to 255. |