Collecting data

This topic describes how to collect the performance data that Buffer Pool Analyzer uses. It describes two methods for collecting buffer pool trace data. The first method uses ISPF and the Collect Report Data (CRD) function to configure and control a collect task. The second method uses a batch job that contains equivalent specifications for a collect task.

About this task

For completeness, note that the Generalized Trace Facility (GTF) and the System Management Facility (SMF) can also collect buffer pool related trace data. The data is recorded in the corresponding GTF and SMF data sets, which can be used as alternative or additional input for the creation of activity reports and bpd files. In Creating activity reports and bpd files, the description of the INPUTDD statement provides more details about specifying alternative or multiple input data sets. Note, however, that GTF or SMF must be set up in SYS1.PROCLIB so that the data they collect includes buffer pool related data (as specified in Determining what to collect).

General remarks:
  1. Ensure that your output data sets are large enough. The amount of data that is collected depends largely on the activity in the buffer pools. If you are going to collect detail data, remember that each activity produces at least one trace record. On a busy system you can rapidly generate several million records. Limit the data collection time, or the number of records to be collected, until you have a feel for the amount of trace data that is produced on your system.
  2. If you are going to collect data for optimizing the object placements, ensure that the Db2 catalog statistics are up to date. Among other factors, Buffer Pool Analyzer considers the size of page sets and might otherwise produce inaccurate results. Run the RUNSTATS utility, if required.
  3. If you are going to collect data for simulation:
    • Ensure that you collect detail data, in short format, continuously for approximately 20 minutes. This generally gives a good representation of a particular workload. If the workload varies significantly, collect a slightly smaller trace for each workload type.
    • For large amounts of data you can optionally create an additional output data set that contains the collected data in compressed format. The size and the download time of such data sets are roughly 25 percent of the equivalent uncompressed data sets. The simulation function can handle both types. See The TRSMAIN terse utility for more details.

      Note that an uncompressed data set is always created. Therefore, if you choose to create the additional compressed data set, you should have approximately 1.25 times the disk space that the uncompressed data set requires. However, if the data is used exclusively for simulations, you can erase the uncompressed data set after both data sets are created.

    • Avoid collecting more than 2 GB of data. The simulation function on the client can handle trace data files of up to 2 GB (regardless of whether the data is compressed or uncompressed). If a trace data file turns out to be too large on the client, create and download a smaller file (less than 2 GB on the client), compare the actual sizes on the host and on the client, and estimate the approximate maximum size of the host data set as follows:
      Size_on_host_actual       Size_on_host_max
      ---------------------  ≈  ----------------
      Size_on_client_actual           2 GB
      If necessary, collect a smaller trace to keep the trace data file below its maximum size.
  4. If you are going to collect data for object placement and simulation, ensure that all requirements in remarks 2 and 3 are met. Furthermore, it is essential that you keep the trace data file and the bpd file together. (The bpd file must be created as described in Creating activity reports and bpd files.)
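
The two size estimates in remark 3 (the host-to-client proportion and the extra disk space for a compressed copy) can be worked through with a short calculation. The following Python sketch is only an illustration and is not part of Buffer Pool Analyzer. The function names are hypothetical, the 2 GB limit is assumed to mean binary gigabytes, and the 25 percent compression ratio is the rough figure quoted above, not a guaranteed value.

```python
# Illustrative helpers only; the names and defaults are assumptions,
# not part of Buffer Pool Analyzer.

# Assumption: "2 GB" is interpreted as 2 * 1024**3 bytes.
CLIENT_LIMIT_BYTES = 2 * 1024**3


def max_host_size(host_actual, client_actual, client_limit=CLIENT_LIMIT_BYTES):
    """Estimate the largest host data set that stays within the client limit.

    Solves the proportion
        host_actual / client_actual ~= host_max / client_limit
    for host_max. All three sizes must use the same unit.
    """
    if host_actual <= 0 or client_actual <= 0 or client_limit <= 0:
        raise ValueError("sizes must be positive")
    return host_actual * client_limit / client_actual


def disk_space_with_compressed_copy(uncompressed_size, compression_ratio=0.25):
    """Estimate the disk space for the uncompressed data set plus a compressed copy.

    Assumption: the compressed copy is roughly 25 percent of the
    uncompressed size, which gives about 1.25 times the space in total.
    """
    if uncompressed_size < 0:
        raise ValueError("size must not be negative")
    return uncompressed_size * (1 + compression_ratio)
```

For example, if a test trace occupies 1500 MB on the host and 1200 MB on the client, `max_host_size(1500, 1200, client_limit=2048)` returns 2560.0, which suggests keeping the host data set below roughly 2560 MB. The result is an approximation because the ratio between host and client sizes can vary between traces.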

The following topics provide additional information: