DATA_MASK scalar function
You can use the DATA_MASK built-in scalar function to do partial redaction, format preserving masking, strong cryptographic obfuscation, and integrity preserving tokenization. The function produces masked data that can be used by analytics tools to create insights without revealing sensitive data. You use the DATA_MASK function in the masking definition of a CREATE MASK statement.
Note: This scalar function is
currently available on Db2®
11.5.9
and later for the following 64-bit Linux platforms:
- AMD (linuxamd64).
- Z Linux (linux390x64).
- Power PC (linuxppcle64).
The schema is SYSIBM.
- DATA_MASK
- The data type of the returned value is the same as the data type of the expression.
- expression
- An expression that returns a value of any built-in data type that is not CLOB, BLOB, DBCLOB, VARGRAPHIC, GRAPHIC, and XML.
- mask-type
- An expression that specifies masking operation to be performed. See Table 1 for available
masking operations. The expression must return a value that is a built-in INTEGER. Note: If the value is not of type INTEGER, it is implicitly cast to INTEGER before evaluating the function.
- mask-parameters
- The expression must return a value that is either a built-in CHAR or VARCHAR data type.
- mask-format
- The expression must return a value that is either a built-in CHAR or VARCHAR data type. It should be one of the format from table 4 below that representing the expression.
Examples
The following example shows the command syntax for a REDACT
operation where
mask-type
has a value of zero, and REDACT PRESERVE LENGTH
has a mask-type
value of 1. The data type is VARCHAR
. For this
example, the EMPLOYEE
table has at least one record where
FIRST_NAME
is John:
SELECT DATA_MASK(FIRST_NAME, 0, 'X','','') FROM EMPLOYEE
1
----------
X
1 record(s) selected
SELECT DATA_MASK(FIRST_NAME, 0, 'XX','','') FROM EMPLOYEE
1
----------
XX
1 record(s) selected.
SELECT DATA_MASK(FIRST_NAME, 1, 'X','','') FROM EMPLOYEE
1
----------
XXXX
1 record(s) selected.
The following example shows the command syntax for running a
SUBSTITUTE
operation where mask-type
has a value of 2 and DECFLOAT
is the
input data type:VALUES DATA_MASK(DECFLOAT(12.345),2,'','','0123456789abcdef')
1
------------------------------------------
5.9549086170364273E+307
1 record(s) selected.
The following example shows the command syntax for running a
SUBSTITUTE
operation where mask-type
has a value of 2 and
TIMESTAMP
is the input data
type:VALUES DATA_MASK(TIMESTAMP('2022-09-23 23:21:56'),2,'','','0123456789abcdef')
1
--------------------------
2074-08-26-21.06.57.153000
1 record(s) selected.
The following example shows the command syntax for running a
PARTIAL REDACT
operation where mask-type
has a value of 3. The
example operation uses includes regular and slice expressions that redact the middle and first four
characters from the end of the
input:VALUES DATA_MASK(VARCHAR('9038-3434-2345'),3,'RGX="^\d{4}[-]?(\d{4})"##char=A;SLC=0,4,1,Y','','')
1
--------------
9038-AAAA-YYYY
1 record(s) selected.
The following example shows the command syntax for running a
FORMAT PRESERVE
operation where mask-type
has a value of 4 and the
input format is
USFirstName
:VALUES DATA_MASK(CAST('John' as VARCHAR(10)), 4, '','USFirstName','0123456789abcdef')
1
----------
winfred
1 record(s) selected.
The following example shows the command syntax for running a
FORMAT PRESERVING OBFUSCATION
operation mask-type
has a value of 4
and the input format is
ArkansasStateDriversLicense
:VALUES DATA_MASK(VARCHAR('900000000'),4,'','ArkansasStateDriversLicense','0123456789abcdef')
1
---------
982576755
1 record(s) selected.
The following example shows the command syntax for running a
FORMAT PRESERVING OBFUSCATION
operation where mask-type
has a
value of 4 and the input format is
UsaSocialSecurityNumber
:VALUES DATA_MASK(VARCHAR('123-45-6789'),4,'','UsaSocialSecurityNumber','0123456789abcdef')
1
-----------
680-47-4470
1 record(s) selected.
The following example shows the command syntax for running a
FORMAT PRESERVING OBFUSCATION
operation where mask-type
has a
value of 4 and the input format is
TIMESTAMP
:VALUES DATA_MASK(TIMESTAMP('2022-11-23 11:10:23.324'),4,'minDate=2001-01-01;maxDate=2099-12-31','','0123456789abcdef')
1
--------------------------
2068-10-15-03.59.29.174000
The following example shows the command syntax for running a
DATE AGING
operation where DATE is the input format and TIMESTAMP
is the input data type:VALUES DATA_MASK(DATE('2020-01-29'), 5, 'days=2', '', '')
1
----------
2020-01-31
1 record(s) selected.
VALUES DATA_MASK(TIMESTAMP('2022-11-23 11:10:23.324'), 5, 'days=-2', '', '')
1
--------------------------
2022-11-21-11.10.23.324000
1 record(s) selected.
The following example shows the command syntax for running an
IDENTIFIER
operation:VALUES DATA_MASK(VARCHAR('AB1234'),7,'','','0123456789abcdef')
1
------
TW9980
1 record(s) selected.
VALUES DATA_MASK(VARCHAR('AB#12@34'),7,'','','0123456789abcdef')
1
--------
TW#99@80
1 record(s) selected
The following example shows the difference between a
FORMAT PRESERVING FABRICATION
operation (mask-type 10) where input validation does
not occur and a FORMAT PRESERVING OBFUSCATION
(mask-type 4) where the inputs are
validated:VALUES DATA_MASK(CAST('hfhfhf' as VARCHAR(10)),10,'','USFirstName','0123456789abcdef')
1
----------
kimiko
1 record(s) selected.
VALUES DATA_MASK(CAST('hfhfhf' as VARCHAR(10)),4,'','USFirstName','0123456789abcdef')
1
----------
XXXXXXXXXX
1 record(s) select