Using Optim with Informix Dynamic Server, Part 2: Scenarios for using Optim with IDS

Part 1 of this series showed you how to configure IBM® Informix® Dynamic Server with IBM Optim®. In this tutorial, walk through some scenarios to see how using Optim Data Privacy Solution with Informix can help you solve real-world problems.

Dhanashri Kudgavkar (dkudgavk@in.ibm.com), Technical lead, Informix Interoperability, IBM

Dhanashri Kudgavkar photoDhanashri Kudgavkar leads the Informix - Interoperability team at the India Software Lab. She has extensive experience with Informix, its behaviors, and its features.



Prasanna Alur Mathada (amprasanna@in.ibm.com), Informix-Interoperability Team, India Software Lab, IBM

Prasanna Mathada photoPrasanna Mathada is a certified Informix professional. He currently works for the Interoperability team, ISL, India.



29 April 2010

Also available in Chinese Portuguese Spanish

Before you start

Introduction

Part 1 of this series introduced both Informix Dynamic Server and Optim. It showed how to configure the IDS server and IDS client for Optim on both the Microsoft® Windows® and the Linux® (or UNIX®) environments. In addition, it stepped you through the configuration of Optim, setting up the Optim directory and DB alias.

In this tutorial, follow scenarios to understand how you can use the Optim functionality to mask test data and maintain privacy.

Optim Data Privacy Solution with IDS

As we become a smarter planet, organizations are going paperless and relying heavily on computers to store their client data. Hence data governance becomes a very critical process for business success. There are laws related to safeguarding the data privacy of critical client data since (along with customer loyalty and trust) revenue of millions of dollars can be at stake.

One of the major environments where data breaches can happen is the testing environment. Optim provides powerful data transformation capabilities to mask or convert certain personal information so that confidential data can be made available for application testing.

This tutorial covers some of the transformation library functions delivered by Optim to convert critical information such as Social Security number (TRANS SSN), credit card number (TRANS CCN), and e-mail IDs (TRANS EML). There are several other functions available in Optim in the data privacy area that provide various masking functions. For example, some of the functions include:

  • LOOKUP, a function that uses a lookup table to determine the destination column value
  • HASH_LOOKUP, a function that determines the destination column value from a lookup table according to a value derived from a source column
  • RAND_LOOKUP, a random lookup function that selects a value for the destination column from a lookup table based on a random number as the subscript in the lookup table

Consider an example of a government system that provides special benefits to certain citizens based on their annual income. The job of gathering the required data for these citizens is outsourced, and the application processing the data accepts data from various organizations and puts it in an IDS database. The application testing team has access to all the confidential data for these citizens, including their Social Security numbers, e-mail IDs, credit card numbers, annual income, and so on.

In the following tutorial sections, see how various TRANS functions can be used in a convert request to generate masked data that has the same appearance and behavior as the original data, fits into the context of the original data, meets the application logic, and serves the purpose of maintaining citizen privacy.

Prerequisites

This tutorial assumes you have the following already installed and configured after following the instructions in Part 1:

  • Informix Dynamic Server 11.50xC5
  • Informix CSDK 3.50xC5
  • Optim products:
    • IBM Optim Data Growth Solution 7.1.1
    • IBM Optim Test Data Management Solution 7.1.1
    • IBM Optim Data Privacy Solution 7.1.1

In all the scenarios mentioned in this tutorial, the Informix CSDK must reside on the same machine with the Optim server, irrespective of where the IDS Server resides.


Get started with Optim

After configuring the IDS client and server, and with the creation of Optim directory and DB alias, you're ready to explore all the Optim functionalities with Informix Dynamic Server as the back-end database.

  1. Start the Optim application.
  2. You need to choose an Optim directory to work with that you created earlier using the Optim configuration wizard. Click on File > Optim Directory, as shown in Figure 1:
    Figure 1. Optim - Choose an Optim directory
    Screenshot showing drop-down list with Optim Directory as an option
  3. This opens up a window, as shown in Figure 2, with a list of Optim directory entries.

    Choose the Optim directory to work with, and then click on Connect.

    Figure 2. Optim - Select an Optim directory from list of existing directories
    Screenshot showing OPTIMDIR as the one Optim Directory listed

    Once the connection is established, you can perform the operations such as Insert, Extract, Archive, or Compare, choosing them from the Actions menu, as shown in Figure 3:

    Figure 3. Optim - Various actions that you can perform using Optim
    Screenshot showing available actions: Archive, compare, convert, delete, edit, extract, insert, load, report, and restore

    For more information on these actions, refer to the documentation bundled with Optim.


Mask Social Security number using transformation library function TRANS SSN

To mask Social Security number, a unique identification number for every citizen in the USA, use the TRANS SSN function.

  1. Select File > New > Actions > Convert, as illustrated in Figure 4:
    Figure 4. Choosing the Convert option
    Screenshot showing how to select File > New > Actions > Convert
  2. The convert action presents a Convert Request Editor. Provide values for the name and location of the source file and destination file, name of the control file, table map, process options, and an optional description, as illustrated in Figure 5 and detailed in Table 1.
    Figure 5. Convert Request Editor
    Screenshot showing input areas for the parameters provided in Table 1

    Table 1 gives a detailed description of the input fields:

Table 1. Description of variables on Convert Request Editor
VariableDescription
DescriptionOptional details related to the convert operation being performed.
Source FileA data file that holds the data to convert. The file can have either .af or .xf as an extension, where the former is an archived file and the latter an extract file. If the file does not have any extensions, then, by default, it will be treated as an extract file.
Control FileA file that records all the specifications of a process and its end results, such as success or failure. The control file has a .cf extension and will be generated automatically when the process is initiated. The control file allows you to review and correct possible errors before retrying or restarting the process, and to also review errors that have occurred during the process execution.
Destination FileA file containing the data obtained as a result of performing the convert operation or process.
Table Map OptionsA table map is used to match tables in the source file to tables in the destination file or to exclude tables from the process.

You can choose to have either a local or named table map. If opt to have a local table map, then you do not need to specify a name. Otherwise, you must provide a name for the table map and then edit the table map (Tools > Edit Table Map).
Process OptionsOptionally, you can choose to specify, discard row limit, and convert an archive or extract file to a comma-separated values (.csv) file.
  1. To proceed, edit the table map. To do this, either click on the Table Map Editor icon, or choose Tools > Edit Table Map Editor.

    The Table Map Editor screen is divided into several sections and input fields:

    • The Source section is read only, and has information about the location of the extract file and the qualifier, which is the default DB alias and creator ID for source tables in the extract or source archive file.
    • The Destination section also has a qualifier, which takes the value <DBAlias>.<Creator ID>.
    • Use the Description input field to describe the purpose of the Table Map.
    • Use the Tables tab to review mapped database tables or views. The following fields are available under the Tables tab:
      • The Source Table values cannot be changed.
      • The Destination Table values can be changed. The initial values for these will be the same as those listed for the source.
      • The Type field specifies the type of the object specified as destination table (for example, a table or view).
      • Column Map or LOCAL is used to correlate the columns in the pair of tables specified in the Table Map Editor. The value "LOCAL" under the column "Column Map or LOCAL" indicates that the column map is available to this particular table map.
    Figure 6. Table Map Editor
    Screenshot showing source, destination, and tables sections as described

    Use a column map when data transformations are needed. It provides specifications to match or exclude columns from processing. It can be local to a table map, or it can have a specific name, which is the column map ID.

  2. Perform a right mouse-click on the column value LOCAL, and choose the option Open Column Map from the menu. This brings up the Column Map Editor, displaying all the columns available in the source data file and their relevant mapping names on a destination.

Optim Data Server makes use of the TRANS SSN function to perform the masking operation on Social Security number (SSN). A SSN is made of three subfields. The first three digits (area) are allocated according to the state in which the SSN is issued. The next two digits define a group number. The last four digits are the sequential serial number.

Table 2 lists the flags that are used along with the TRANS SSN function to vary the output to suit the requirement for relevant scenarios:

Table 2. TRANS SSN flags and their descriptions
FlagsDescription
nGenerates a random area number and an appropriate group number and serial number.
rGenerates a random area number corresponding to the state of the source SSN and an appropriate group number and serial number.
vValidates the source group number to ensure the SSN has used it.
-Generates an SSN with dashes separating the fields (for example, 123-45-6789). Requires a character-type destination column at least 11 characters long.

In the next section, look at three scenarios that depict the usage of the flags n, r, and v. Figures help illustrate the column values before masking and the converted column values after masking.

SSN scenario 1: Use the n flag

Figure 7 shows the Column Map Editor. The fourth column, ntnl_id_char_9_n, is targeted for the masking operation, so the source column value is replaced by the string TRANS SSN ("=n"). The values in the column ntnl_id_char_9_n should be generated randomly, irrespective of the source.

Figure 7. SSN scenario 1 - Column Map Editor
Screenshot showing source column and data type, mapped to destination column and data type, and indicates status

Select File > Update and Return from both the Column Map Editor and Table Map Editor to return to the Convert Request Editor screen. Save the settings, and execute the contents by selecting the File > Run. Once the execution is completed, you should see the screen illustrated in Figure 8—the Convert Process Report—which provides detailed information about the execution of the convert request.

Figure 8. SSN scenario 1 - Convert Process Report
Screenshot showing Request Name, Server Name, file details, userid details, time details, control file, status, process summary, and row details

Once a the masking operation using the convert request and n flag is completed successfully you can compare the source file and extract file. To do so, choose the Browse option from the Optim Utilities menu. By default, the extract file is place under the Data directory of the Optim installation structure.

Figure 9 shows the data in the source file, and Figure 10 shows the data in the output file, that is, the extract file. Take a close look at the values in the column ntnl_id_char_9_n. The values on the extract file are randomly changed from that of the source file.

Figure 9. SSN scenario 1 - Source extract file
Screenshot showing source values for SSN
Figure 10. SSN scenario 1 - Destination extract file
Screenshot showing SSN values transformed to random values

SSN scenario 2: Use the r flag

Repeat the steps from Figures 4, 5, and 6. Now, replace the source column with the value ntnl_id_vch2_9_n with "TRANS SSN ("=r")," as shown in Figure 11. The values in the column NATIONAL_ID_VCH2_9_N should be generated based on the source's area number.

Figure 11. SSN scenario 2 - Column Map Editor
Screenshot showing TRANS SSM with r flag

Select File > Update and Return on the Column Map Editor and Table Map Editor to return to the Convert Request Editor window. Save the contents and execute the Run option from the File menu. On a successful execution, the masked output will be available in the destination file.

The following two figures depict the browsed output of the source and destination files, respectively:

Figure 12. SSN scenario 2 - Source extract file
Screenshot showing source values
Figure 13. SSN Scenario 2 - Destination extract file
Screenshot showing ntnl_id_char9_n with randomly modified area portion of the number

SSN scenario 3: Use the v flag

Repeat the steps from Figures 4, 5, and 6. Now, replace the source column with the value ntnl_id_vch2_254_n with "TRANS SSN ('=v NTNL_ID_VCH2_9_N')", as shown in Figure 14. Doing so, the area number of the values in the column ntnl_id_vch2_254_n will be validated based on the numbers used by ntnl_id_vch2_9_n.

Figure 14. SSN scenario 3 - Column Map Editor
Screenshot showing column value TRANS SSN ('=v NTNL_ID_VCH2_9_N')

Proceed by selecting the option Update and Return from the File menu on the Column Map Editor and Table Map Editor to return to the Convert Request Editor window. Save the contents and execute the Run option from the File menu. Upon a successful execution, the masked output will be available in the destination file.

The following two figures depict the browsed output of Source and destination files, respectively:

Figure 15. SSN scenario 3 - Source extract file
Screenshot showing source values for SSN
Figure 16. SSN scenario 3 - Destination extract file
Screenshot showing ntnl_id_vch2_254_n validated based on the numbers used by ntnl_id_vch2_9_n

Mask credit card number using transformation library function TRANS CCN

The TRANS CCN function is used to generate valid and unique credit card numbers (CCNs). A CCN usually consists of a six-digit issuer identifier followed by a variable-length account number and a single-check digit as the final number. By default, TRANS CCN algorithmically generates a consistently altered CCN, but it can also generate a random value for CCN.

The execution steps followed in this section (used to mask credit card numbers) are similar to the steps followed during the execution of the SSN scenarios. All three scenarios described in this section differ from each other with respect to the source file, control file, destination file, and flags used in TRANS CCN function.

Table 3 lists the flags used to mask the credit card numbers and their descriptions:

Table 3. TRANS CCN flags and their descriptions
FlagsDescription
nGenerates a random CCN, not based on a source value, that includes a valid issuer identifier associated with American Express, MasterCard, VISA, or Discover.
rGenerates a random CCN that includes the first four digits of the source issuer identifier.
6Generates a random CCN that includes the first six digits of the source issuer identifier.

CCN scenario 1: Use the n flag

Follow the steps from Figures 4, 5, and 6, altering the entries for source file, control file, and destination file accordingly to suit the environment. Now, proceed to the Column Map Editor, as shown in Figure 17.

Edit the source column to replace the column value ccn_id_char_256_n with "TRANS CCN ("=n CCN_ID_CHAR_13_N")." Doing so, values in the column ccn_id_char_256_n will be generated randomly, irrespective of the source provider.

Figure 17. CCN scenario 1 - Column Map Editor
Screenshot showing Column Map Editor with column value TRANS CCN ('=n CCN_ID_CHAR_13_N')

Proceed by selecting the option Update and Return from the File menu on the Column Map Editor and Table Map Editor to return to the Convert Request Editor window. Save the contents and execute the Run option from the File menu. Upon successful execution, the masked output will be available in the destination file.

The following two figures depict the browsed output of the source and destination files, respectively. Take a close look at the column values of the column ccn_id_char_256_n.

Figure 18. CCN scenario 1 - Source extract file
Screenshot showing source values for CCN
Figure 19. CCN scenario 1 - Destination extract file
Screenshot showing CCN values transformed to random values

CCN scenario 2: Use the r flag

Follow the steps from Figures 4, 5, and 6, altering the entries for source file, control file, and destination file accordingly to suit the environment. Now, proceed to the Column Map Editor, as shown in Figure 20.

Edit the source column to replace the column value ccn_id_nchr_256_n with "TRANS CCN ("=r CCN_ID_NCHR_13_N")." Doing so, values in the column ccn_id_nchr_256_n will be generated such that the first four digits represent the same provider as the source.

Figure 20. CCN scenario 2 - Column Map Editor
Screenshot showing Column Map Editor with column value TRANS CCN ('=r CCN_ID_NCHR_13_N')

Proceed by selecting the option Update and Return from the File menu on the Column Map Editor and Table Map Editor to return to the Convert Request Editor window. Save the contents and execute the Run option from the File menu. Upon successful execution, the masked output will be available in the destination file.

The following two figures depict the browsed output of source and destination files, respectively. Take a close look at the column values of the column ccn_id_nchr_256_n.

Figure 21. CCN scenario 2 - Source extract file
Screenshot showing source values for CCN
Figure 22. CCN scenario 2 - Destination extract file
Screenshot showing CCN values transformed to random values that include the first four digits of the source issuer identifier

CCN scenario 3: Use the 6 flag

Follow the steps from Figures 4, 5, and 6, altering the entries for source file, control file, and destination file accordingly to suit the environment. Now, proceed to Column Map Editor, as shown in Figure 23.

Edit the source column to replace the column value ccn_id_char_256_n with "TRANS CCN ("=6 CCN_ID_CHAR_14_N")." Doing so, values in the column ccn_id_char_256_n will be generated such that the first six digits represent the same provider as the source.

Figure 23. CCN scenario 3 - Column Map Editor
Screenshot showing Column Map Editor with column value TRANS CCN ('=6 CCN_ID_CHAR_14_N')

Proceed by selecting the option Update and Return from the File menu on the Column Map Editor and Table Map Editor to return to the Convert Request Editor window. Save the contents and execute Run option from the File menu. Upon successful execution, the masked output will be available in the destination file.

The following two figures depict the browsed output of the source and destination files, respectively. Take a careful look at the column values of the column ccn_id_char_256_n.

Figure 24. CCN scenario 3 - Source extract file
Screenshot showing source values for CCN
Figure 25. CCN scenario 3 - Destination extract file
Screenshot showing CCN values transformed to random values that include the first six digits of the source issuer identifier

Mask e-mail IDs using transformation library function TRANS EML

The TRANS EML function is used to generate an e-mail address. An e-mail address has two parts—the user name and the domain name, separated by the at sign (@). The TRANS EML function generates e-mail addresses based on either the destination data or a literal concatenated with a sequential number. The domain name can be formed using either the source data or randomly chosen from a long list of e-mail service providers. E-mail addresses can also be converted to upper or lower case.

The execution steps followed in this section (where the scenarios will work on e-mail accounts) are similar to the steps followed during the execution of SSN scenarios. Hence, usage of only one is given here, and the rest can be executed in the same way with only the change of the flag values.

Table 4 lists the flags used to mask the e-mail entries and their descriptions:

Table 4. TRANS EML flags and their descriptions
FlagsDescription
nIgnores the source value and generates an e-mail address with a random domain name from a list of large e-mail service providers.
.Separates the name1col and name2col values with a period (.).
-Separates the name1col and name2col values with an underscore (_).
iUses only the first character of the name1col value.
lConverts the e-mail address to lower case.
u Converts the e-mail address to upper case.

EML scenario 1: Use the nu flags

Follow the steps from Figures 4, 5, and 6, altering the entries for source file, control file, and destination file accordingly to suit the environment. Now, proceed to the Column Map Editor, as shown in Figure 26.

Edit the source column to replace the column value eml_id_char_512_n_c with "TRANS EML ("=nu EML_ID_CHAR_3_N_C, "satest"")." Doing so, values in the column eml_id_char_512_n_c should be generated with a random domain name ("satest", in this case) and should be converted to upper case.

Figure 26. CCN scenario 1 - Column Map Editor
Screenshot showing Column Map Editor with column value TRANS EML ('=nu EML_ID_CHAR_3_N_C, 'satest'')

Proceed by selecting the option Update and Return from the File menu on the Column Map Editor and Table Map Editor to return to the Convert Request Editor window. Save the contents and execute the Run option from the File menu. Upon successful execution, the masked output will be available in the destination file.

The following two figures depict the browsed output of source and destination files, respectively. Take a careful look at the column values of the column eml_id_char_512_n_c.

Figure 27. EML scenario 1 - Source extract file
Screenshot showing source values for EML
Figure 28. EML scenario 1 - Destination extract file
Screenshot showing EML values generated with a random domain name 'satest,' converted to upper case

Summary

This tutorial series has helped you get started with the configuration required to establish connectivity to Informix Dynamic Server, and creating the Optim directory and DB alias as part of the Optim configuration. With this integration, you can take advantage of the wide feature set and functionalities of both Optim and IDS for their enterprise data management requirements. The scenarios presented in this tutorial gave you an idea of how you can use the functionality to mask test data and maintain privacy.

In today's electronic world, enterprise data management is a big challenge being faced by every organization with growing volumes of data, high-end performance requirements, and privacy breaches that have become very common. Integration of IBM Informix Dynamic Server and IBM Optim serves as a complete solution for enterprise data management.

Resources

Learn

Get products and technologies

  • IBM Data Studio V2.2: Download IBM Data Studio, which provides foundational database development and administration support for the DB2 and Informix family of products at no charge.
  • Optim Development Studio and Optim pureQuery Runtime: Download the free, 30-day trial version of Optim Development Studio, which provides an integrated database development environment for Oracle, DB2, and Informix.
  • Informix Dynamic Server Enterprise and Developer Edition: Download the free, 90-day trial version of Informix Dynamic Server Enterprise Edition, an exceptional online transaction processing (OLTP) database that offers outstanding performance, reliability, scalability and manageability for enterprise and workgroup computing. .
  • Informix Dynamic Server Express Edition: Download a free, trial version of Informix Dynamic Server Express Edition to get started with IDS.
  • Informix CSDK: Download one of the trial or demo versions of Informix CSDK.
  • Optim Database Administrator: Download the free, 30-day trial version of Optim Database Administrator V2.2.2 (previously Data Studio Administrator), which improves database administrators' (DBAs') productivity and reduces application outages by automating and simplifying complex DB2 structural changes.
  • Data Studio Administration Console: Download the Data Studio Administration Console.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=486162
ArticleTitle=Using Optim with Informix Dynamic Server, Part 2: Scenarios for using Optim with IDS
publish-date=04292010