Best practices: Optim data privacy user-defined functions for Oracle

This topic describes validated best practices and tips for implementing the Optim™ data privacy user-defined functions (UDFs) for Oracle.

Use Direct mode for UDF installation/configuration in Oracle 11g and later

In Oracle releases 11g and later, use the Direct mode when you install and configure the data privacy UDFs. Oracle defines Direct as the default mode in these releases and recommends it. In Direct mode, the extproc process is spawned by the database, not the Listener. Therefore, the Direct mode provides enhanced security and delivers better performance than the Listener mode.

Disable indexes and triggers during masking operations

To improve overall performance during masking in place and Insert operations, disable nonessential indexes and table-level triggers during the masking process.

Use the NOLOGGING option to reduce redo log

Use the NOLOGGING option and the APPEND hint to reduce redo log information during CREATE TABLE AS SELECT (CTAS) or Insert operations with the data privacy UDFs.

Use appropriate commit frequency during in-place update with the data privacy UDFs

Commit frequency can affect performance by controlling the unit of work during in-place masking. A high number of commits could slow down the overall masking operation. Work with your database administrator (DBA) to design and implement the optimal commit frequency, based on available database resources. Strive to maximize the commit frequency while you manage the restart of the masking operation, in case of failure.

Use appropriate Affinity masking algorithm (default or FPE)

The Affinity data privacy provider includes two options for masking data: default and FPE (format-preserving encryption). While the FPE algorithm offers an industry-standard level of encryption, consider using the default algorithm when your data does not require such encryption. The most significant differences between the two algorithms are the variety of masking patterns, the time required to complete the masking effort, and the strength of the masked results.

The default algorithm usually outperforms the FPE algorithm in processing time, as the FPE algorithm takes additional time to complete in most cases. The complexity of the source data can add more time to the masking operation of the FPE algorithm, depending on the amount of parallel processing and I/O being performed.

The following table identifies the level of complexity of certain types of source data.
Data Complexity factor
12345678 05
123456ABCDEF 06
123456-ABCDEF 06
AAA-BBB-CCC-DDD 07
123-AAA-456-BBB 07
johndoe@domain.com 08
123456 johndoe@domain.com 10

The FPE algorithm parameters, ITERATIONS and TWEAKS, can influence masking performance. Although a higher ITERATIONS value will result in more secure masked output, the overall processing will slow down. The TWEAKS parameter ensures that identical strings are masked with unique strings, but turning this parameter on also affects performance.

For more information on the Affinity data privacy provider, see Affinity privacy provider.

Use the hash data privacy provider and SQL JOIN to mask data with lookup tables

You can use the hash data privacy provider and the SQL JOIN operation to mask data using lookup tables. For example, assume you are using the Optim lookup table, OPTIM_US_FIRSTNAME, which is structured as follows:
Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------

 SEQ                                       NOT NULL NUMBER
 FIRSTNAME                                 NOT NULL VARCHAR2(60)
The following statements illustrate how to mask customer first names using the OPTIM_US_FIRSTNAME lookup table and the hash data privacy provider with the HASHMAX parameter.
Note: The HASHMAX parameter is required and should equal the count of the positive SEQ values in the lookup table.
SQL> SELECT cust.first_name fname, flookup.FIRSTNAME fname_mask                                  
     FROM perfadm.customer cust, perfadm.OPTIM_US_FIRSTNAME flookup
     WHERE fLOOKUP.SEQ = OPTIMMASKSTR( cust.FIRST_NAME, ‘PRO=HASH,FLDDEF1=(NAME=X,DT=varchar),
     FLDDEF2=(NAME=OUT,DT=varchar), DSTFLD=OUT, HASHMAX=5160') ;

FNAME                FNAME_MASK
--------------------------------
JUDIE                ADALINE
ELLENA               ABRAM
For more information on the hash data privacy provider, see Hash privacy provider

Use Oracle external tables to move masked data

The data privacy UDFs for Oracle integrate with the Oracle external table feature and create masked data sets that can be moved across databases. For example, create an external table, as illustrated in the following statement:
CREATE TABLE mask_credit_card_XT(ccn, ccnmask)
       organization external
       (type oracle_datapump default directory extdir location ('ccnmas_ext.dmp'))
       as select ccn, OptimMaskStr (ccn, 'pro=ccn, mtd=random, Flddef1=(name=col1,dt=char)') 
       ccnmask from cust_creditcard_tab;
Then use the external table dump file to insert data into the target database, as illustrated in the following statement:
CREATE TABLE cust_creditcard_tab_Target(col1 ..col_n)   
       organization external
       (type oracle_datapump default directory optimdump location ('ccnmas_ext.dmp'));

Use the Oracle CAST function to manage masking in Insert and CTAS operations

In some situations, the data type of the data being masked might not match the data types that the data privacy UDF expects. You can use the Oracle CAST function to convert the data type of one or more source columns when you want to mask multiple columns using a single data privacy UDF. The benefits and flexibility of the CAST function outweigh the slight decrease in performance you may experience.

The following statements illustrate how to use the CAST function with a data privacy UDF. Assume that you are masking a credit card column, visa, stored as a NUMBER data type, using the OptimMaskStr UDF.
INSERT INTO ccn10m_target select visa, OptimMaskStr(CAST (visa as varchar2(30)), 
'pro=ccn,mtd=repeatable, wheninv=pre, Flddef1=(name=col1,dt=char)') visacast from ccn10m_source 
CREATE TABLE ccn10m_cast as select visa, OptimMaskStr(CAST (visa as varchar2(30)), 
'pro=ccn,mtd=repeatable, wheninv=pre, Flddef1=(name=col1,dt=char)') visacast from ccn10m;

Use operating system tools

Operating system tools can help you monitor, troubleshoot, and resolve problems quickly in the data privacy UDF environment. For example, in Microsoft™ Windows™, you can use Process Explorer. It tells you whether the UDF is invoked by the Listener or Direct mode. It also identifies the environment variables the extproc.exe process is using to communicate with the UDF.