The interactive Monitor III reporter runs in a TSO/E session under ISPF and provides system or
sysplex performance reports in the following ways:
Displays your current system status in real-time mode
Shows previously collected data that is still available in either in-memory buffers or
pre-allocated VSAM data sets
You can use Monitor III to quickly identify storage and processor delays for your active Spark
workload. To start an interactive Monitor III session, enter the TSO/E command
RMF and select Monitor III from the RMF - Performance Management panel. From the RMF III Primary Menu, you can select the specific performance metrics that you want to see. To
further filter the report by job class and service class, you can issue the following command:
report_name job_class,service_class
where:
report_name
The short name of the report
job_class
One of the following job class names:
ALL (or A)
ASCH (or AS)
BATCH (or B)
OMVS (or O)
STC (or S)
TSO (or T)
service_class
The service class name
Example: To get the Storage Delays report for the ODASSC1 OMVS service class, enter this command:
STOR O,ODASSC1
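The command syntax above can be sketched as a small helper that assembles the command string before you type it into the Monitor III session. This is only an illustration: the function name build_report_command and the JOB_CLASSES table are hypothetical and are not part of RMF. The job class names and short forms are the ones listed above.

```python
# Job class names and their short forms, as listed in the text above.
JOB_CLASSES = {
    "ALL": "A",
    "ASCH": "AS",
    "BATCH": "B",
    "OMVS": "O",
    "STC": "S",
    "TSO": "T",
}


def build_report_command(report_name, job_class, service_class):
    """Return a command of the form: report_name job_class,service_class.

    Hypothetical helper for illustration only; it validates the job class
    against the names listed in the documentation and formats the string.
    """
    jc = job_class.upper()
    valid = set(JOB_CLASSES) | set(JOB_CLASSES.values())
    if jc not in valid:
        raise ValueError(f"unknown job class: {job_class!r}")
    return f"{report_name} {jc},{service_class}"


# Reproduces the example command from the text.
print(build_report_command("STOR", "O", "ODASSC1"))  # STOR O,ODASSC1
```

The helper only formats and validates text. It does not issue the command; you still enter the resulting string on the Monitor III command line.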
The following Monitor III reports are of particular interest for monitoring Spark workloads:
Storage Delays report
Common Storage report
Storage Frames report
Storage Memory Objects report
Processor Delays report
Processor Usage report
zFS File System report
Based on the performance measurements that you observe from these reports, you can fine-tune the
resource assignments for your Spark workload. For instance, you
can modify the number of cores and amount of memory for your executors in the
spark-defaults.conf configuration file. (For more information, see Configuring memory and CPU options.) Or, if you use WLM to manage your Spark workload, you can adjust the
importance and performance goals of your Spark workload. (For more
information, see Configuring z/OS workload management for Apache Spark.)
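As a rough illustration of the executor tuning described above, the following sketch renders the two spark-defaults.conf lines that control executor cores and memory. The property names spark.executor.cores and spark.executor.memory are standard Spark properties. The function name executor_settings and the values 2 and 4 are hypothetical and are not recommendations; choose values based on what the reports show for your system.

```python
def executor_settings(cores, memory_gb):
    """Render spark-defaults.conf lines for executor cores and memory.

    Hypothetical helper: it only formats text and does not modify any file.
    """
    if cores < 1 or memory_gb < 1:
        raise ValueError("cores and memory_gb must be positive")
    return [
        f"spark.executor.cores {cores}",
        f"spark.executor.memory {memory_gb}g",
    ]


# Illustrative values only.
print("\n".join(executor_settings(cores=2, memory_gb=4)))
```

The resulting lines would be placed in spark-defaults.conf, replacing any existing entries for the same properties.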
Storage Delays report
The Storage Delays report (STOR) displays storage
delay information for all jobs. Here you can find out if your Spark jobs suffer any delays due to
memory constraints. A non-zero value in the DLY % column indicates that there is
a delay due to memory constraints. Figure 1 shows
an example of this report.
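The non-zero DLY % check can be sketched as a simple filter. The sketch assumes that you have copied the relevant report rows by hand into a list of dictionaries. The field names jobname and dly_pct, the function name delayed_jobs, and the sample job names and values are all hypothetical; they are not an RMF data format and not real measurements.

```python
def delayed_jobs(rows, name_prefix=""):
    """Return (jobname, delay %) pairs for jobs with a non-zero DLY % value.

    rows: list of dicts with hypothetical keys "jobname" and "dly_pct".
    name_prefix: optional job name prefix used to keep only Spark jobs.
    """
    return [
        (row["jobname"], row["dly_pct"])
        for row in rows
        if row["jobname"].startswith(name_prefix) and row["dly_pct"] > 0
    ]


# Made-up sample values, not taken from a real report.
sample = [
    {"jobname": "SPARKEX1", "dly_pct": 12},
    {"jobname": "SPARKEX2", "dly_pct": 0},
    {"jobname": "OTHERJOB", "dly_pct": 5},
]
print(delayed_jobs(sample, name_prefix="SPARK"))  # [('SPARKEX1', 12)]
```

Any job that the filter returns is a candidate for a closer look at its memory assignment.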
Common Storage report
The Common Storage report (STORC) provides
information about the use of common storage (CSA, ECSA, SQA, and ESQA) within a system. You can use
this report to identify whether Spark is using an excessive amount
of common storage (such as for memory-mapped files). Figure 2 shows an example of this report.
Storage Frames report
The Storage Frames report (STORF) displays detailed
frame counts, auxiliary slot count, and page-in rate for each address space. For instance, it tells
you the average number of frames used by each Spark process (ACTV column) and the paging rate (PGIN RATE column). Keeping the
paging rate as close to zero as possible helps improve performance. Increasing the
memory limit for the resource group with which Spark address spaces are associated,
for example, may help lower the paging rate. Figure 3 shows an
example of this report.
Storage Memory Objects report
The Storage Memory Objects report (STORM)
displays information about the use of memory objects for each active address space and within the
system. A memory object is a contiguous range of virtual addresses that is allocated by jobs in
units of megabytes on a megabyte boundary. This report can help you assess the total amount of
memory that Spark is using.
It also shows the fixed and pageable 1M frames used by Spark address spaces. Spark generally does not require the
use of fixed large frames, and tuning Spark JVMs to use them might have a negative impact on the
overall system health. Figure 4 shows an example of this report.
Processor Delays report
The Processor Delays report (PROC) displays all
jobs that were waiting for or using the processor during the reporting interval. Here you can see if
your Spark jobs suffer any
delays due to processor constraints. Figure 5 shows an
example of this report.
Processor Usage report
The Processor Usage report (PROCU) displays all
jobs that were using a general-purpose or special-purpose processor during the reporting interval.
You can use this report to understand the CPU usage of your Spark jobs. By combining it with the
Processor Delays report, you can assess whether you need to change the performance goals or
importance of your Spark
workload. Figure 6 shows an example of this
report.
zFS File System report
The zFS File System report (ZFSFS) measures zFS
activity on the basis of single file systems. With this report, you can monitor the I/O rates and
response times associated with the file systems that Spark uses. Figure 7 shows an example of this report.