Masking data by using Mask at Read semantics in Data Virtualization
Configure Mask at Read semantics so that Data Virtualization masks your data before evaluating the query.
About this task
Enable the Mask at Read global configuration setting to apply masking rules to your data before Data Virtualization evaluates any part of the query. The default setting is Mask at Result, which applies masking to the result set instead.
With Mask at Read, you add an extra layer of protection to your data by controlling how column masking rules defined by IBM® Knowledge Catalog are applied.
For more information on masking tables in Data Virtualization with IBM Knowledge Catalog, see Masking virtual data in Data Virtualization.
- WHERE
- JOIN (INNER / LEFT / RIGHT / LEFT OUTER / RIGHT OUTER / FULL OUTER)
- HAVING
- IN
- GROUP BY
- ORDER BY
Required roles
- Administrator role
Use cases
- Mask at Read example
With Mask at Read, data is masked with the value
Consider the following query that you might use to determine the salary of Holly Rhodes:0
, preserving your data privacy throughout the query lifecycle.SELECT Salary FROM employee_info WHERE Name='Holly Rhodes' AND Salary=132000
- Mask at Read example using masked query predicates
- Consider the case where you need to fetch anonymized
PATIENT_ID
, and then issue follow-up queries predicated on that anonymized ID, without knowing the real (unmasked)PATIENT_ID
. Unlike Mask at Result semantics, with Mask at Read semantics you do not need to know and use unmasked values to issue query predicates.
Restrictions
- If you enable Mask at Read, Data Virtualization evaluates row filtering rules that reference masked columns by using the unmasked values of the reference table.
- Query predicates are not pushed down when they reference masked columns, which might cause a
negative impact on query performance compared to Mask at Result (default behavior). Instead Data
Virtualization does the following actions:
- Fetch all of the data from the source objects participating in the query.
- Mask all of the rows with applicable data protection rules.
- Apply the predicates.
- If you reference a masked column in a query predicate (like in a WHERE clause), and apply a type
casting function (such as CAST, VARCHAR, INTEGER and etc.), the Data
Virtualization query optimizer
does not push down that type cast to the data source. Instead, the query optimizer applies type
casting after the masking function is applied. As a result, data might be retrieved from the source
in a data type that is not compatible with the column data type of the virtualized table, causing an
SQLCODE=-420, SQLSTATE=22018
error. To remediate this issue, ensure you specify query predicates that match the data type of the virtualized column.For example: If the column AGE was virtualized as CHAR/VARCHAR type, then specify AGE='75' instead of AGE=75.
Procedure
- Applying the Mask at Read setting
- To apply masking to data before evaluating the query, run this procedure:
CALL DVSYS.SETWKCPOLICYOPTS('MASK_SEMANTICS=READ')
- Reverting to the default Mask at Result setting
- To revert to the default behavior where queries are evaluated on unmasked data, while masking
the result set, run this
procedure:
CALL DVSYS.SETWKCPOLICYOPTS('MASK_SEMANTICS=RESULT')
- Check your currently configured masking semantics
- To check if you are currently configured to use Mask at Read or Mask at Result, run this
query:
SELECT VALUE FROM DVSYS.INSTANCE_INFO WHERE KEY='WKC_POLICY_OPTS'
Note: An empty string is equivalent to Mask at Result (default behavior).