DATA_MASK scalar function
You can use the DATA_MASK built-in scalar function to do partial redaction, format preserving masking, strong cryptographic obfuscation, and integrity preserving tokenization. The function produces masked data that can be used by analytics tools to create insights without revealing sensitive data. You use the DATA_MASK function in the masking definition of a CREATE MASK statement.
Note: This scalar function is
currently available on Db2®
11.5.9
and later for the following 64-bit Linux platforms:
- AMD (linuxamd64).
- Z Linux (linux390x64).
- Power PC (linuxppcle64).
The schema is SYSIBM.
- DATA_MASK
- The data type of the returned value is the same as the data type of the expression.
- expression
- An expression that returns a value of any built-in data type that is not CLOB, BLOB, DBCLOB, VARGRAPHIC, GRAPHIC, and XML.
- mask-type
- An expression that specifies masking operation to be performed. See Table 1 for available
masking operations. The expression must return a value that is a built-in INTEGER. Note: If the value is not of type INTEGER, it is implicitly cast to INTEGER before evaluating the function.
- mask-parameters
- The expression must return a value that is either a built-in CHAR or VARCHAR data type.
- mask-format
- The expression must return a value that is either a built-in CHAR or VARCHAR data type. It should be one of the format from table 4 below that representing the expression.
Examples
The following example shows the command syntax for a REDACT operation where
mask-type has a value of zero, and REDACT PRESERVE LENGTH
has a mask-type value of 1. The data type is VARCHAR. For this
example, the EMPLOYEE table has at least one record where
FIRST_NAME is John:
SELECT DATA_MASK(FIRST_NAME, 0, 'X','','') FROM EMPLOYEE
1
----------
X
1 record(s) selected
SELECT DATA_MASK(FIRST_NAME, 0, 'XX','','') FROM EMPLOYEE
1
----------
XX
1 record(s) selected.
SELECT DATA_MASK(FIRST_NAME, 1, 'X','','') FROM EMPLOYEE
1
----------
XXXX
1 record(s) selected.
The following example shows the command syntax for running a
SUBSTITUTE
operation where mask-type has a value of 2 and DECFLOAT is the
input data type:VALUES DATA_MASK(DECFLOAT(12.345),2,'','','0123456789abcdef')
1
------------------------------------------
5.9549086170364273E+307
1 record(s) selected.The following example shows the command syntax for running a
SUBSTITUTE operation where mask-type has a value of 2 and
TIMESTAMP is the input data
type:VALUES DATA_MASK(TIMESTAMP('2022-09-23 23:21:56'),2,'','','0123456789abcdef')
1
--------------------------
2074-08-26-21.06.57.153000
1 record(s) selected.The following example shows the command syntax for running a
PARTIAL REDACT operation where mask-type has a value of 3. The
example operation uses includes regular and slice expressions that redact the middle and first four
characters from the end of the
input:VALUES DATA_MASK(VARCHAR('9038-3434-2345'),3,'RGX="^\d{4}[-]?(\d{4})"##char=A;SLC=0,4,1,Y','','')
1
--------------
9038-AAAA-YYYY
1 record(s) selected.The following example shows the command syntax for running a
FORMAT PRESERVE operation where mask-type has a value of 4 and the
input format is
USFirstName:VALUES DATA_MASK(CAST('John' as VARCHAR(10)), 4, '','USFirstName','0123456789abcdef')
1
----------
winfred
1 record(s) selected.The following example shows the command syntax for running a
FORMAT PRESERVING OBFUSCATION operation mask-type has a value of 4
and the input format is
ArkansasStateDriversLicense:VALUES DATA_MASK(VARCHAR('900000000'),4,'','ArkansasStateDriversLicense','0123456789abcdef')
1
---------
982576755
1 record(s) selected.The following example shows the command syntax for running a
FORMAT PRESERVING OBFUSCATION operation where mask-type has a
value of 4 and the input format is
UsaSocialSecurityNumber:VALUES DATA_MASK(VARCHAR('123-45-6789'),4,'','UsaSocialSecurityNumber','0123456789abcdef')
1
-----------
680-47-4470
1 record(s) selected.The following example shows the command syntax for running a
FORMAT PRESERVING OBFUSCATION operation where mask-type has a
value of 4 and the input format is
TIMESTAMP:VALUES DATA_MASK(TIMESTAMP('2022-11-23 11:10:23.324'),4,'minDate=2001-01-01;maxDate=2099-12-31','','0123456789abcdef')
1
--------------------------
2068-10-15-03.59.29.174000The following example shows the command syntax for running a
DATE AGING operation where DATE is the input format and TIMESTAMP
is the input data type:VALUES DATA_MASK(DATE('2020-01-29'), 5, 'days=2', '', '')
1
----------
2020-01-31
1 record(s) selected.
VALUES DATA_MASK(TIMESTAMP('2022-11-23 11:10:23.324'), 5, 'days=-2', '', '')
1
--------------------------
2022-11-21-11.10.23.324000
1 record(s) selected.The following example shows the command syntax for running an
IDENTIFIER
operation:VALUES DATA_MASK(VARCHAR('AB1234'),7,'','','0123456789abcdef')
1
------
TW9980
1 record(s) selected.
VALUES DATA_MASK(VARCHAR('AB#12@34'),7,'','','0123456789abcdef')
1
--------
TW#99@80
1 record(s) selectedThe following example shows the difference between a
FORMAT PRESERVING FABRICATION operation (mask-type 10) where input validation does
not occur and a FORMAT PRESERVING OBFUSCATION (mask-type 4) where the inputs are
validated:VALUES DATA_MASK(CAST('hfhfhf' as VARCHAR(10)),10,'','USFirstName','0123456789abcdef')
1
----------
kimiko
1 record(s) selected.
VALUES DATA_MASK(CAST('hfhfhf' as VARCHAR(10)),4,'','USFirstName','0123456789abcdef')
1
----------
XXXXXXXXXX
1 record(s) select