Overview of the Optim data privacy components

The Optim™ data privacy components are part of the IBM® InfoSphere® Optim Data Privacy Solution. You can use the components to mask sensitive data, such as national IDs, credit card numbers, and email addresses, that is propagated outside your production environment. Masking replaces sensitive values with fictional, yet contextually accurate data. The masked data remains appropriate for testing and other legitimate business uses, but is useless to identity thieves and hackers.
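To illustrate what "fictional, yet contextually accurate" can mean in practice, the following Python sketch masks a credit card number so that the result keeps the original length, separators, and leading six digits, and still passes the Luhn checksum. It is only an illustration of the general idea, not the algorithm that the Optim providers use. The function names (mask_card_number, luhn_check_digit, luhn_valid) and the keyed-hash approach are assumptions made for this example.

```python
import hashlib
import hmac


def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """Check a digit-only string against the Luhn checksum."""
    return (
        number.isdigit()
        and len(number) > 1
        and luhn_check_digit(number[:-1]) == number[-1]
    )


def mask_card_number(number: str, key: bytes) -> str:
    """Hypothetical masking routine (not the Optim implementation).

    Keeps the first six digits and all separators, replaces the remaining
    digits with values derived from a keyed hash, and recomputes the final
    check digit so that the masked number is still Luhn-valid.
    """
    digits = [c for c in number if c.isdigit()]
    if len(digits) < 8:
        raise ValueError("expected at least 8 digits")
    prefix = digits[:6]
    n_middle = len(digits) - 6 - 1  # everything between prefix and check digit
    digest = hmac.new(key, "".join(digits).encode(), hashlib.sha256).hexdigest()
    middle = [str(int(h, 16) % 10) for h in digest[:n_middle]]
    payload = "".join(prefix + middle)
    masked_digits = iter(payload + luhn_check_digit(payload))
    # Put the masked digits back into the original layout (spaces, hyphens).
    return "".join(next(masked_digits) if c.isdigit() else c for c in number)
```

Because the replacement digits are derived from a keyed hash of the input, the same input and key always produce the same masked value, which helps keep masked data consistent across tables. Whether a given Optim provider behaves this way depends on the provider and its parameters.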

The masking capabilities of the Optim data privacy components are designed to supplement and extend the masking capabilities available in the Optim solution through column maps.

The terms data privacy and data masking are often used interchangeably. Both terms imply that proprietary information of some sort must be protected from improper disclosure or abuse. The Optim data privacy components are used to mask sensitive data, so you can achieve data privacy through data masking.

There are three Optim data privacy components: the data privacy application, the data privacy providers, and the data privacy user-defined functions (UDFs).

For information on the database versions and platforms that the data privacy components support, see the detailed system requirements for the Optim data privacy components.

The Optim data privacy application

The data privacy application is used to mask data in CSV and XML files, including data originating in a Hadoop Distributed File System (HDFS). The data privacy application offers two methods of masking data:
  • A graphical user interface (GUI) that includes selection menus and other tools to specify the appropriate parameters for each masking job
  • A command-line interface (CLI) that uses user-defined configuration files to identify the data to be masked

You can mask data in CSV and XML files with both interfaces, and you can mask CSV data originating in Hadoop with the command-line interface. The data masking capabilities of the data privacy application are enabled through a set of callable functions called Optim data privacy providers.

The Optim data privacy provider library (ODPP)

The data privacy library is a stand-alone API that provides a flexible and extensible means of accessing predefined and user-developed data masking providers. The library includes a set of data privacy algorithms called data privacy providers, which are sometimes referred to as data masking providers. The available providers include those that are designed to mask credit card numbers, national IDs, email addresses, birth dates and other dates, and undifferentiated or dynamically formatted values.

The data privacy API can be used by applications that are written in various languages. Its modular, plug-and-play approach makes the providers suitable for use in numerous IBM products and client applications. The providers are data source independent, and support most data types and character sets, including ASCII, Unicode, and multibyte character sets. You can use this built-in flexibility to standardize data privacy policies across your entire enterprise.
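The plug-and-play idea of predefined and user-developed providers behind a single entry point can be pictured with a small registry. The following Python sketch is a conceptual model only; it does not use or reproduce the ODPP API. All names in it (register_provider, mask, the "national_id" and "redact" provider names) are hypothetical.

```python
import hashlib
import itertools
from typing import Callable, Dict

MaskFn = Callable[[str], str]

# Hypothetical registry that maps a provider name to a masking function.
_PROVIDERS: Dict[str, MaskFn] = {}


def register_provider(name: str) -> Callable[[MaskFn], MaskFn]:
    """Decorator that adds a masking function to the registry."""
    def decorator(fn: MaskFn) -> MaskFn:
        _PROVIDERS[name] = fn
        return fn
    return decorator


@register_provider("national_id")
def mask_national_id(value: str) -> str:
    """Replace every digit but keep separators, so the format is preserved."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    # Cycle through the digest so inputs of any length can be handled.
    replacements = itertools.cycle(str(int(h, 16) % 10) for h in digest)
    return "".join(next(replacements) if ch.isdigit() else ch for ch in value)


@register_provider("redact")
def redact(value: str) -> str:
    """Example of a user-developed provider: overwrite every character."""
    return "X" * len(value)


def mask(provider: str, value: str) -> str:
    """Single entry point: look up a provider by name and apply it."""
    if provider not in _PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return _PROVIDERS[provider](value)
```

A caller needs to know only the provider name and the value, for example mask("national_id", "123-45-6789"). The same calling convention then works regardless of where the value came from, which is the sense in which such providers are data source independent.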

The Optim data privacy providers are included in the following Optim solutions:
  • Optim Test Data Management Solution with Data Masking option
  • Optim Data Masking solution
  • Optim Data Privacy Enterprise/Workgroup Editions
  • Optim Test Data Management Enterprise/Workgroup Editions

If you use any of these solutions, you do not have to install the providers separately because they are automatically installed with these solutions. However, if you want to use the providers outside of these solutions, you must install the providers separately.

The Optim providers are also installed by default during the wizard-driven installation of the Optim data privacy application, and they are manually installed during the installation of the Optim data privacy UDFs. If needed, you can also install the providers separately for use with other IBM products or client programs.

The Optim user-defined functions for data privacy

You can mask sensitive data in various database management systems (DBMSs) and platforms with the Optim data privacy UDFs. You can include the UDFs in SQL scripts or statements to dynamically mask data within the framework of a DBMS server. The data is masked in place, without leaving the database.

You can use the UDFs to call the Optim data privacy providers to mask sensitive data, such as national IDs, credit card numbers, and email addresses. Because the UDFs invoke the providers from within SQL queries, the data is masked in place in any supported database.
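The general pattern of masking in place through a UDF can be demonstrated with SQLite, which lets a Python function be registered as an SQL function. The following sketch is an analogy only: SQLite is used for convenience and is not stated here to be a supported DBMS, and the function name MASK_EMAIL, the customers table, and the masking rule are hypothetical rather than part of the Optim UDFs.

```python
import hashlib
import sqlite3
from typing import Optional


def mask_email(value: Optional[str]) -> Optional[str]:
    """Hypothetical masking rule: fictional local part, reserved domain."""
    if value is None:
        return None
    local, sep, _domain = value.partition("@")
    if not sep:
        return value  # not an email address; leave it unchanged
    token = hashlib.sha256(local.encode("utf-8")).hexdigest()[:8]
    return f"user{token}@example.com"


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with sample rows and register the UDF."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL under the name MASK_EMAIL.
    conn.create_function("MASK_EMAIL", 1, mask_email)
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO customers (email) VALUES (?)",
        [("jane.doe@corp.example",), ("sam@corp.example",), (None,)],
    )
    return conn


def mask_in_place(conn: sqlite3.Connection) -> None:
    """Mask the column with one UPDATE; the data never leaves the database."""
    conn.execute("UPDATE customers SET email = MASK_EMAIL(email)")
    conn.commit()


if __name__ == "__main__":
    db = build_demo_db()
    mask_in_place(db)
    for row in db.execute("SELECT id, email FROM customers ORDER BY id"):
        print(row)
```

The same function could also be used in a SELECT statement to mask values as they are read, leaving the stored data untouched. Which of the two approaches is appropriate depends on whether the masked copy or the original must persist.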

The UDF installation media includes software for all supported platforms and bit format variations, including Microsoft Windows, UNIX, Linux®, and z/OS® in 31-bit, 32-bit, and 64-bit formats, where appropriate.

Note: When the Optim data privacy provider library (ODPP) is downloaded, the Optim data privacy UDFs are included in the library. However, the UDFs are installed separately from the stand-alone version of the data privacy providers. For more information, see Overview for installing and configuring the Optim data privacy user-defined functions.