Using Data Privacy for Diagnostics
Data Privacy for Diagnostics post processing can be performed on the following types of memory dumps:
- SVC
- Stand-alone
- SLIP
- SYSMDUMP (from V2.5)
- Transaction (from V2.5)
Post processing redacts pages that were tagged as sensitive by the applications that created them, as well as untagged pages that the Data Privacy for Diagnostics Analyzer scans and detects as containing sensitive data. The Analyzer requires a minimum of IBM® 64-bit SDK for z/OS® Java™ Technology Edition version 8.0. The redacted version of the original memory dump is written to a new memory dump data set; the original memory dump data set is not modified. Retain both memory dumps for as long as it takes to diagnose the reported problem.
Append Dump Directory records (BLSADDIR) are removed when generating a redacted stand-alone memory dump. Additional processing is required for stand-alone memory dumps that contain captured memory dumps. If the captured memory dumps are required by vendors, the memory dumps must first be extracted (IPCS COPYCAPD) from the original stand-alone memory dump, then processed separately. Do not depend on captured memory dumps being available within a redacted stand-alone memory dump.
The following functions are provided:
- REDACT
- You can redact any data that is tagged as sensitive=yes without further analysis. Note: You cannot perform the ANALYZE function on a memory dump that has already been redacted by this process. You can request this processing by using either:
- IPCS option 5.6, specifying the ANALYZE function and BYPASS DP ANALYSIS=Y
- Sample job SYS1.SAMPLIB(BLSJDPFD)
- ANALYZE
- You can redact any pages that are tagged as sensitive by the applications that own that data, as well as any untagged pages that are detected as containing sensitive data. Note: You cannot perform the ANALYZE function on a memory dump that has already been redacted by this process. You can request this processing by using either:
- IPCS option 5.6, specifying the ANALYZE function and BYPASS DP ANALYSIS=N
- Sample job SYS1.SAMPLIB(BLSJDPA)
- REPORT
- You can create human-readable reports for a memory dump that has been processed by the Data Privacy for Diagnostics Analyzer. After they are created, these reports are in the directory/reports/dump-name/run-number directory (where directory is the Data Privacy for Diagnostics Analyzer home directory) in the file system that is used for Data Privacy for Diagnostics processing. You can request this processing by using either:
- IPCS option 5.6, specifying the REPORT function
- Sample job SYS1.SAMPLIB(BLSJDPR)
- FEEDBACK
- You can provide feedback for a memory dump that has been processed by the Data Privacy for Diagnostics Analyzer. After reviewing the reports and understanding which pages have or have not been flagged as sensitive, you can provide feedback to help the Data Privacy for Diagnostics Analyzer improve its sensitive data detection. Providing feedback is covered in more detail later in this chapter. After updating the configuration files to indicate what tagging can be improved, you can request this processing by using either:
- IPCS option 5.6, specifying the FEEDBACK function
- Sample job SYS1.SAMPLIB(BLSJDPF)
- INGEST
- You can ingest data to help the Data Privacy for Diagnostics Analyzer determine what sensitive data exists in your environment. Data can be ingested from dictionaries, databases, or other sources. This data is added to the knowledge base and is used in future analysis runs. Ingesting data is covered in more detail later in this chapter. After updating the configuration files, you can request this processing by using either:
- IPCS option 5.6, specifying the INGEST function
- Sample job SYS1.SAMPLIB(BLSJDPI)
- EXTRACT
- You can extract any built-in or custom identifiers from the Analyzer to a file so that you can see the exact criteria that the ANALYZE function uses to determine the sensitivity of the data. Depending on the type of identifier, the output file contains either the pattern or the entire dictionary, which helps you verify that the Data Privacy for Diagnostics Analyzer is correctly marking data as sensitive or nonsensitive. Extracting identifiers is covered in more detail later in this chapter. After updating the configuration files to indicate which identifiers are written to a file, you can request this processing by using either:
- IPCS option 5.6, specifying the EXTRACT function
- Sample job SYS1.SAMPLIB(BLSJDPX)
Generally, start by performing the ANALYZE function on a memory dump. This function works only on memory dumps captured on a z15 or later processor. After creating the redacted version of the memory dump, examine it to understand what has been redacted. Reports are available to help you understand why pages were redacted; review them to verify that the data was properly identified as sensitive. Some reports are written in concise form and must be formatted by using the REPORT function. After running the REPORT function, you might want to give the Data Privacy for Diagnostics Analyzer feedback about data that it flagged as sensitive but was not sensitive, or about data that was sensitive but was not detected as sensitive. The FEEDBACK function lets you perform this task. The ANALYZE / REPORT / FEEDBACK cycle provides a way to train the Data Privacy for Diagnostics processing to produce memory dumps with the right level of redaction for your environment.
Another available function is INGEST, which lets you import data from databases and files and create custom information that the Data Privacy for Diagnostics Analyzer can use to help identify sensitive data.
Creating custom identifiers that are tailored to an installation's data privacy requirements is essential to attain the most accurate redaction of SPI (or other sensitive information); custom identifiers far surpass the redaction that results from using only the generic built-in identifiers that are provided with the Analyzer.
To display the exact criteria that the ANALYZE function uses to determine data sensitivity, use the EXTRACT function to write any built-in or custom identifiers to a file. When a particular identifier is then requested in the ANALYZE configuration, you know exactly which tokens or which pattern will be used to mark data as sensitive or nonsensitive.

Using the Data Privacy for Diagnostics Analyzer Dialog within IPCS
When IPCS is used, panels are presented to allow you to specify parameters required for processing. The dialog generates appropriate JCL based on the parameters provided. If any data sets are required but not preallocated, the dialog attempts to dynamically allocate them. If dynamic allocation fails for any reason, you should be able to preallocate data sets by using other mechanisms (such as ISPF option 3.2).
The parameters that are specified on the IPCS Data Privacy for Diagnostics Analyzer panels are:
- DATA SET NAME
- The input memory dump data set name. This option is equivalent to the input_dataset parameter in the JCL submitted to perform the requested function.
- NEW DATA SET NAME
- The output (redacted) memory dump data set name. This option is equivalent to the output-dump-dataset field in the JCL submitted to perform the ANALYZE function.
- TEMP DATA SET/PAT
- Temporary data set names can either be a specific name or a data set name pattern. For more information on patterns, see the help pages. This option is equivalent to the output_dataset or output_dataset_prefix parameters in the JCL submitted to perform the requested function.
- BYPASS DP ANALYSIS
- Allows you to submit a job that either performs analysis (N) or skips analysis (Y). If N is specified, the Data Privacy for Diagnostics Analyzer step scans the input data set for sensitive data beyond the data that was marked as sensitive by the applications that allocated the storage. If sensitive data is found, either token-level or page-level redaction is performed, based on the ALLOW PAGE LEVEL specification. If Y is specified, this step is bypassed, and only the data that was marked as sensitive by the applications that allocated the storage is removed from the output data set that is identified by the NEW DATA SET NAME field.
- REDACTION STRING
- If you are not allowing page-level redaction, this redaction string is used to overlay data that is determined to be sensitive in the output memory dump. You can leave this field blank to overlay each token with X characters, or you can specify a string. When a detected token is longer than the redaction string, the string is repeated to cover the token. When a detected token is shorter than the redaction string, only a portion of the redaction string is used. This option is equivalent to the redaction_string parameter in the JCL submitted to perform the requested function.
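For example, if the redaction string is REDACTED, a 12-character sensitive token might be overlaid with REDACTEDREDA, and a 4-character token might be overlaid with REDA; the token lengths shown here are illustrative only.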
- NUMBER OF THREADS
- For ANALYZE requests, large memory dumps may be processed faster by using multi-threading. You can specify 1 to 8 for the number of threads. Each thread that is requested processes a portion of the input memory dump, which reduces the elapsed time that it takes to process the entire memory dump. However, more threads can also increase the amount of resources that are required simultaneously to process the request. This option is equivalent to the thread_count parameter in the JCL submitted to perform the requested function.
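As a rough worked example (using the approximately 512 MB per thread that is cited under JAVA OPTIONS for the default setup with only built-in identifiers), an 8-thread ANALYZE request can need on the order of 8 x 512 MB, or about 4 GB, of JVM heap.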
- ALLOW PAGE LEVEL
- If Y is specified (known as fast-analysis mode or page-level redaction), the entire page of storage is redacted when any sensitive data is detected in it. Page-level redaction may allow the analysis processing to run faster because processing stops when the first sensitive string in a page is found. However, page-level redaction can cause diagnostic data to be lost. If you find this to be the case, set the value to N (known as detailed-analysis mode or token-level redaction) so that only the data that is determined to be sensitive is overlaid with the redaction string. The default value is N, token-level redaction. A conceptual sketch of the two modes follows.
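The following Java fragment is a minimal sketch of the difference between the two modes as described above. It is illustrative only, not the Analyzer's implementation; the class name, method names, range representation, and the use of the IBM-1047 encoding for the overlay bytes are assumptions made for this example.

import java.nio.charset.Charset;
import java.util.List;

// Conceptual sketch only; not the Data Privacy for Diagnostics Analyzer's implementation.
public class RedactionModesSketch {
    static final Charset EBCDIC = Charset.forName("IBM-1047"); // encoding assumed for illustration

    // Page-level (fast-analysis) mode: as soon as any sensitive range is found in a page,
    // the whole page is flagged so that the caller can remove it from the redacted dump.
    static boolean pageIsSensitive(List<int[]> sensitiveRanges) {
        return !sensitiveRanges.isEmpty();
    }

    // Token-level (detailed-analysis) mode: only the detected tokens are overlaid, repeating
    // the redaction string over longer tokens and truncating it over shorter ones.
    static void redactTokens(byte[] page, List<int[]> sensitiveRanges, String redactionString) {
        String text = (redactionString == null || redactionString.isEmpty()) ? "X" : redactionString;
        byte[] overlay = text.getBytes(EBCDIC);
        for (int[] range : sensitiveRanges) {      // each range is {offset, length} within the page
            for (int i = 0; i < range[1]; i++) {
                page[range[0] + i] = overlay[i % overlay.length];
            }
        }
    }
}

The trade-off described above is visible in the sketch: page-level mode gives up the whole page when any sensitive string is found, which is faster but can discard diagnostic data, while token-level mode preserves the surrounding data and overlays only the detected tokens.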
- SENSITIVE REPORT
- If Y is specified, reports are generated in directory/reports/dump-name/run-number/sensitive_token_log_n, where n is the thread number; there is one file per thread that is requested. For each string that is detected, data is written to these files to help you understand what has been redacted and why. Based on this information, you can decide to include or exclude types of data. When the REPORT function is requested, it consolidates these sensitive_token_log_n files into a human-readable file named sensitive_tokens.
- DPfD HOME DIR
- Specify the path where the Data Privacy for Diagnostics Analyzer home directory is configured; this is the directory referred to previously. Do not include the trailing '/' when specifying this path.
- JAVA HOME DIR
- Specify the path where Java is installed. This path is used in the batch job's STDENV setup file to create the proper environment for the Java processing to run in. Do not include the trailing '/' when specifying this path. Data Privacy for Diagnostics requires a minimum of IBM 64-bit SDK for z/OS Java Technology Edition version 8.0.
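For example, on many systems the IBM 64-bit SDK for z/OS, Java Technology Edition, Version 8 is installed at a path such as /usr/lpp/java/J8.0_64; this path is only an illustration, so verify the actual installation path on your system.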
- JAVA OPTIONS
- You can provide whatever Java options are needed. For example, you might need to specify a minimum and maximum heap size for the JVM to successfully run a multi-threaded DPfD ANALYZE request. Using the default setup with only built-in identifiers, each thread requires approximately 512 MB to successfully load data for the run. Requesting additional threads or including additional identifiers increases the heap size that the JVM requires, so use the -Xms and -Xmx options to adjust the minimum and maximum heap size. For more information about JVM command-line options, see the topic OpenJ9 command-line options in IBM SDK, Java Technology Edition 8.0.0.
Data Privacy for Diagnostics requires a minimum of IBM 64-bit SDK for z/OS Java Technology Edition version 8.0. If you are using IBM Semeru Runtime Certified Edition for z/OS 21 or later, the file encoding for Data Privacy for Diagnostics files must be specified explicitly with the parameter -Dfile.encoding=IBM-1047, due to changes made to the default file encoding. For more information, see IBM Semeru Runtime Certified Edition for z/OS 21.
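For example, continuing the sizing assumption of approximately 512 MB per thread, a 4-thread ANALYZE request might specify options such as -Xms2048m -Xmx2048m; add -Dfile.encoding=IBM-1047 only if you are running with IBM Semeru Runtime Certified Edition for z/OS 21 or later, as described above. The heap values shown are illustrative; adjust them for your thread count and identifiers.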
- JZOS LOAD MODULE
- The dialog uses the JZOS Batch Launcher in the JCL that is submitted. Determine the correct level of JZOS installed on your system and provide the name of the appropriate load module in this parameter. Data Privacy for Diagnostics requires a minimum of IBM 64-bit SDK for z/OS Java Technology Edition version 8.0, thus the 64-bit version 8 load module for JZOS Batch Launcher is JVMLDM86. For more information, see the JZOS Batch Launcher and Toolkit Installation and Users Guide.
- MIGLIB DATASET
- A sort E35 exit is used to remove pages that are flagged as sensitive. This function is provided in module BLSRTE35, which is included in SYS1.MIGLIB. Should you need to override where this exit can be loaded from, provide the name of the MIGLIB that contains the load module you want to run.
- TEMP ALLOC PARMS
- If your environment requires specific allocation parameters for memory dump data sets, you may supply any allocation parameters that ensure that the data set is properly allocated. For example, supplying DATACLAS and STORCLAS keywords may be necessary to locate the correct storage pool and attributes. Do not specify RECFM, DSORG, LRECL, BLKSIZE, SPACE, and TRACK because they are used to create some of the interim data sets. If you need to use one of those allocation parameters, request the ANALYZE function by using the JCL instead of through IPCS.
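For example, TEMP ALLOC PARMS might contain something like DATACLAS(DCDUMP) STORCLAS(SCDUMP), where DCDUMP and SCDUMP are hypothetical class names; use the data class and storage class names that are defined for memory dump data sets in your installation.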
- EDIT CONFIG FILE?
- If Y is specified, you can edit the configuration file that pertains to the requested function (analysis_config.json for ANALYZE, ingestion_config.json for INGEST, or extract_config.json for EXTRACT) before submitting the JCL to perform the requested function. The default is N. For more information, see the analysis_config.json, extract_config.json, and ingestion_config.json sections.
- RUN NUMBER
- The ANALYZE step generates a run number, which can be found in the job output. You can specify that run number for this parameter when the requested function is REPORT or FEEDBACK. If a run number is not specified, the most recent ANALYZE run for the input memory dump is used.
- DB2® JDBC PATH
- For the INGEST function, if you are using a Db2® connection source in the ingestion_config.json file, use this field to specify the path to the Db2 JDBC driver and license JAR files. Do not include the trailing '/' when specifying this path.