Quick start: Protect data

You can protect data with Watson Knowledge Catalog by creating data protection rules that specify the type of data to protect and the protection method. Data protection rules apply to data assets in all governed catalogs in the platform and to copies of those data assets in projects. Read about data protection rules, then watch a video and take a tutorial that is suitable for users with some knowledge of data masking but does not require coding.

Required permission

You must have the Manage data protection rules permission.

Your basic workflow is to create a data protection rule. It goes into effect immediately after you create it.

Read about protecting data

You create data protection rules to identify the data to control and to specify the method of control. Within data protection rules, you can include classifications, data classes, business terms, or tags to identify the data to control. You can choose to deny access to data or to mask sensitive data values.

Data masking helps you protect sensitive data, such as personally identifiable information or restricted business data, and avoid the risk of compromising confidential information. Masking is defined in data protection rules that are enforced for an asset. Depending on the masking method, data is redacted, substituted, or obfuscated with retained formatting in the asset preview.
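To make the difference between the three masking methods concrete, here is a minimal Python sketch. It is not the product's implementation: the function names, the choice of SHA-256, the 12-character token length, and the seeded character replacement are all assumptions made for illustration. It only mirrors the behavior this page describes, where redacting replaces a value with ten X characters, substituting replaces it with a hash value, and obfuscating hides the value while preserving its format.

```python
import hashlib
import random
import string


def redact(value: str) -> str:
    """Replace the whole value with ten X characters."""
    return "X" * 10


def substitute(value: str, length: int = 12) -> str:
    """Replace the value with a hash-derived token.

    Assumption: SHA-256 truncated to `length` hex characters. The same
    input always produces the same token in this sketch.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def obfuscate(value: str, seed: int = 0) -> str:
    """Swap each digit for a digit and each letter for a letter of the
    same case, leaving punctuation and length untouched so the format
    of the original value is preserved."""
    # Seeding with the value keeps the result repeatable per input.
    rng = random.Random(f"{seed}:{value}")
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # separators such as "-" are kept as-is
    return "".join(out)


if __name__ == "__main__":
    sample = "123-45-6789"  # made-up value, not taken from the data set
    print(redact(sample))
    print(substitute(sample))
    print(obfuscate(sample))
```

In this sketch only the obfuscated result still looks like the original kind of value, which is why the tutorial uses that method for government identity numbers.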

Read more about data masking

Watch a video about masking data

Watch this video to see how to create data protection rules to mask data using different masking types.

Video disclaimer: Some minor steps and graphical elements in this video differ from your Cloud Pak for Data deployment. This video shows the Cloud Pak for Data as a Service user interface.

This video provides a visual method as an alternative to following the written steps in this documentation.

Try a tutorial to mask data

In this tutorial, you will complete these tasks:

  1. Add a data asset to your catalog.
  2. Create a data protection rule to obfuscate data.
  3. Create a data protection rule to redact data.
  4. Create a data protection rule to substitute data.
  5. View the masked data.

This tutorial will take approximately 20 minutes to complete.

Prerequisites

Task 1: Add the data set to your catalog

The data set you will use in this tutorial is available in the Gallery.

  1. Download the Auto Insurance Customers data set (4KB).
  2. From the navigation menu, click Catalogs > All catalogs.
  3. Open your catalog.
  4. From the Assets page of the catalog, click Add to Catalog > Local files.
  5. On the Add data assets from local files page, click browse, select the AutoInsuranceCustomers.csv file from your computer, and click Open.
  6. Click Add. Stay on the page until the load completes.
  7. When the file loads, open the AutoInsuranceCustomers.csv data asset.
  8. Click the Asset tab to preview the data.
  9. Select any of the columns, and then use the arrow keys on your keyboard to scroll to the right to see all of the columns in the data set.
  10. Look at the National_ID, CreditCard_Number, and Marital_Status columns. These columns contain sensitive data that needs to be masked.
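If you prefer to inspect the downloaded file before you upload it, the following Python sketch prints only the three sensitive columns. It is an optional aside and not part of the tutorial. The helper name `preview_columns`, the file location in the current directory, and the UTF-8 encoding are assumptions; the column names are the ones listed in the last step.

```python
import csv
from pathlib import Path

# Column names as they appear in the tutorial steps.
SENSITIVE_COLUMNS = ["National_ID", "CreditCard_Number", "Marital_Status"]


def preview_columns(csv_path, columns, limit=5):
    """Return up to `limit` rows that contain only the requested columns."""
    rows = []
    # utf-8-sig also reads plain UTF-8 and drops a byte order mark if present.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        missing = [c for c in columns if c not in (reader.fieldnames or [])]
        if missing:
            raise KeyError(f"Columns not found in file: {missing}")
        for index, row in enumerate(reader):
            if index >= limit:
                break
            rows.append({c: row[c] for c in columns})
    return rows


if __name__ == "__main__":
    # Assumed location of the downloaded data set.
    path = Path("AutoInsuranceCustomers.csv")
    if path.exists():
        for row in preview_columns(path, SENSITIVE_COLUMNS):
            print(row)
```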

Task 2: Create a data protection rule to obfuscate data

The first data protection rule will obfuscate government identities such as US Social Security Numbers.

  1. Click the Navigation Menu icon, and click Governance > Rules.
  2. From the rules page, click Add rule > New rule. You will first create a data protection rule that obfuscates government identity data, which hides the values of the data but preserves the format.
  3. Select Data protection rule, and click Next.
  4. In the New data protection rule page that opens, complete the basic fields.
  5. In the Name section type Obfuscate government identity.
  6. In the Business definition section type Rule to mask sensitive information.
  7. For Condition 1, in the If field, select Data class.
  8. In the Search for a data class field, type US Social Security Number, and select US Social Security Number from the list.
  9. Click Add new condition.
  10. Change And to Or.
  11. For Condition 2, specify Data class and Canadian Social Insurance Number.
  12. Click Add new condition.
  13. For Condition 3, specify Data class and UK National Insurance Number.
  14. For the Action, select mask data.
  15. For the in columns containing field, select Data class. The same three data classes previously selected are filled in for you.
  16. For the masking method, select Obfuscate.
  17. Click Create.
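The rule you just created can be pictured as a set of data class conditions joined with Or, plus a masking action. The Python sketch below models that idea only. The `MaskingRule` class and the `columns_masked_by` helper are hypothetical and do not correspond to any product API, and the column-to-data-class assignments in the usage example are assumed for illustration rather than taken from actual profiling results.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class MaskingRule:
    """Hypothetical model of a data protection rule that masks data."""
    name: str
    data_classes: FrozenSet[str]  # conditions joined with Or
    method: str                   # masking method selected for the action

    def applies_to(self, column_data_class: str) -> bool:
        # With Or, matching any one of the data classes is enough.
        return column_data_class in self.data_classes


obfuscate_rule = MaskingRule(
    name="Obfuscate government identity",
    data_classes=frozenset({
        "US Social Security Number",
        "Canadian Social Insurance Number",
        "UK National Insurance Number",
    }),
    method="Obfuscate",
)


def columns_masked_by(rule: MaskingRule, column_classes: Dict[str, str]) -> List[str]:
    """Return the columns whose assigned data class satisfies the rule."""
    return [col for col, dc in column_classes.items() if rule.applies_to(dc)]


if __name__ == "__main__":
    # Assumed data class assignments for two columns of the sample file.
    assumed_classes = {
        "National_ID": "US Social Security Number",
        "Marital_Status": "Legal Marital/Civil Status",
    }
    print(columns_masked_by(obfuscate_rule, assumed_classes))
```

Changing And to Or in the rule builder matters for the same reason it does here: a column is assigned one data class at a time, so a rule that required all three conditions at once would never match.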

Task 3: Create a data protection rule to redact data

The second rule will redact personal demographic information such as ethnicity.

Note: You will need the Watson Knowledge Catalog Standard or Professional plan to create more than one data protection rule.

  1. From the rules page, click Add rule > New rule. You will next create a data protection rule that redacts personal demographic data, which hides the value of the data by replacing it with ten X characters.
  2. Select Data protection rule, and click Next.
  3. In the New data protection rule page that opens, complete the basic fields.
  4. In the Name section type Redact personal demographic information.
  5. In the Business definition section type Rule to mask sensitive information.
  6. For Condition 1, in the If field, select Data class.
  7. In the Search for a data class field, type Religion, and select Religion from the list.
  8. Click Add new condition.
  9. Change And to Or.
  10. For Condition 2, specify Data class and Ethnicity.
  11. Click Add new condition.
  12. For Condition 3, specify Data class and Legal Marital/Civil Status.
  13. Click Add new condition.
  14. For Condition 4, specify Data class and Political Party.
  15. For the Action, select mask data.
  16. For the in columns containing field, select Data class. The same four data classes previously selected are filled in for you.
  17. For the masking method, select Redact.
  18. Click Create.

Task 4: Create a data protection rule to substitute data

The third rule will substitute financial account data such as credit card numbers.

Note: You will need the Watson Knowledge Catalog Standard or Professional plan to create more than one data protection rule.

  1. From the rules page, click Add rule > New rule. You will next create a data protection rule that substitutes data, which hides the values of the data by replacing them with hash values.
  2. Select Data protection rule, and click Next.
  3. In the New data protection rule page that opens, complete the basic fields.
  4. In the Name section type Substitute financial account data.
  5. In the Business definition section type Rule to mask sensitive information.
  6. For Condition 1, in the If field, select Data class.
  7. In the Search for a data class field, type Credit Card Number, and select Credit Card Number from the list.
  8. Click Add new condition.
  9. Change And to Or.
  10. For Condition 2, specify Data class and Account Number.
  11. For the Action, select mask data.
  12. For the in columns containing field, select Data class. The same two data classes previously selected are filled in for you.
  13. For the masking method, select Substitute.
  14. Click Create.

(Optional) Task 5: View the masked data

Note: If you are the owner of AutoInsuranceCustomers.csv, you must log in as a different user to view the masked data.

Now that the rules are in place, you can view the masked data from the perspective of a different user.

  1. Click the Navigation Menu icon to open the navigation menu, and click Catalogs > All catalogs.
  2. Select the catalog that contains the AutoInsuranceCustomers.csv data asset.
  3. From the Data asset page of AutoInsuranceCustomers.csv, click the Asset tab.
  4. Horizontally scroll through the columns of data and move your mouse cursor over the Lock icons above each data column to view the data protection rule that is masking the data.

Next steps

Now the data is ready to be used. For example, you or other users can do any of these tasks:

Additional resources