Performance Tuning and Query Optimization

White Papers

Abstract

This article details various techniques for managing performance across the DVM server, parallelism, zIIP utilization, and query execution. The article can refer to other supporting published documents, such as Capacity Planning and Deployment.

Content

Performance Tuning and Query Optimization

The DVM server optimizes the allocation and use of resources for processor types (General, zIIP) and system memory. Every resource has an impact on the environment in how it is used when running various of workloads.

Parallel I/O, MapReduce capability, block-fetch, and memory caching all require memory, to obtain optimal performance for workloads. For example, the maximum no. of parallel threads possible for a DVM server depends on the no. of zIIP specialty engines available, in combination with available system memory. Similarly, block-fetch of data into system memory requires adequate allocation of cache to improve execution time in accessing data in memory versus more I/O cycles associated with data retrieval from disk.

A general slide rule applies between zIIP specialty engines and General Processors. DVM allows for the ability to throttle processing between two types of processors, whereby zIIP engine processing is restricted. An increase of processing naturally occur on the General Processors. To reduce the MSU consumption for the system, your environment needs to ensure you have adequate zIIP engines to shift workloads and reduce the overall costs for processing.

The best recommendation for an initial installation is to focus on a standard resource allocation of two zIIP engines and 32 Gigabytes of memory for a 1-2 General Processor configuration. Starting with a balanced resource plan simplifies monitoring resource allocation that use SMF72 record types. The Resource Group data section of the SMF 72 record provides information about MIN/MAX capacity across various resource groups.

% LPAR share
% Processor capacity
Memory limit
MSU/H

Combined GP and zIIP consumption
As workloads are introduced through the DVM server, adjustments to resources can be made to allocate processing and memory to ensure optimal performance. Test results conducted at the IBM Systems Benchmark Center demonstrating the direct impact increasing numbers of zIIP engines have on parallelism. Performance improves significantly with reduced elapsed time to execute workloads, as shown in Table 1 and Table 2.

Server	Total CPU Time	Total zIIP Time	Total IICP Time	Total zIIP NTime	% zIIP eligible
DVM1	7099.33	5609.55	1389.58	5609.55	98.59%

Table 1. zIIP engine exploitation

Test #	GPPs	# zIIP Engines	Degree of Parallelism	Elapsed Time (ms)	SMT
1	8	0	0	118.96	1
2	8	5	0	98.68	1
3	8	5	4	27.05	1
4	8	5	8	17.14	1
5	8	5	8	20.84	2
6	8	5	10	17.00	2
7	8	5	16	15.73	2
8	8	8	8	13.83	1
9	8	8	8	17.62	2
10	8	8	16	11.72	2

Table 2 - Performance with parallelism and zIIP engine exploitation

This test was run against an older z13 machine that uses 800 Gigabytes of financial data. The test achieved approximately 99% offload to zIIP specialty engines. However, tests #2 and #4 execute against identical system resources where the degree of parallelism is set to a value of 8, resulting in a reduction of elapsed time from 98.68 minutes to 17.14 minutes.

Increasing the number of zIIP specialty engines from 5 - 8 further reduced the overall elapsed time to 13.83 minutes. Increasing both the number of zIIP engines and the degree of parallelism within the DVM Server can result in performance improvements up to 1,000% for elapsed times. The recommendation is to start small and expand the use of more resources to meet business objectives around latency and execution time.

Parallel I/O and MapReduce

DVM for z/OS optimizes performance by using a multi-threaded z/OS-based runtime engine that leverages parallel I/O and MapReduce capabilities to simultaneously fetch data from disk or memory. Figure 1 illustrates the DVM server parallelism by using MapReduce.

Figure 1 - DVM server parallelism achieved using MapReduce

MapReduce reduces query elapsed time by splitting queries into multiple threads that read large files in parallel. Each interaction, whether initiated by a distributed data consumer or application platform, runs under the control of a separate z/OS thread. A thread is a single flow of control within a process.

The DVM Server can deliver predicate pushdown and filtering to all data sources and supports heterogeneous JOINs with associated pushdown processing of filters for subqueries and their respective sources.

Virtual Parallel Data (Cache)

Virtual Parallel Data (VPD) essentially provides a means to cache data into defined members for faster and more optimal subsequent queries. Similar to other cache mechanisms, data is initially populated and persisted as a materialized view that can repetitively access by client applications and refreshed as needed to maintain currency for the business. This cache benefits by having to perform disk I/O once to populate the cache, therefore, reducing associated expense.

Figure 2 - DVM Virtual Parallel Data

VPD allows applications or users to group multiple simultaneous requests against the same data source in parallel without subsequent data refreshes. This functionality also allows single or multiple requests to run with asymmetrical parallelism, separately tuning the number of I/O threads and the number of client or SQL engine threads.

Using Virtual Parallel Data groups

All requests submitted against the same DVM Server instance need to refer to a group name. VPD groups also have a predefined amount of time for the group to persist. One or more I/O threads begin to read the data source and then write to a circular buffer assigned to the VPD group. Group members share buffered data without having to read directly from the disk.

I/O threads are started as soon as the VPD group is created, and data begins to flow to the buffer. If the buffer fills before the group is closed, the I/O threads wait. Once the group is closed and active members begin consuming data, the buffer space is reclaimed, and I/O continues.

Example
A large PS (single flat file) is on disk that the customer doesn’t want to read more than once for operational and performance reasons. Instead, the customer wants to import the content of the file into multiple data stores. Consuming applications must identify with the VPD group, to use it. Each VPD request can specify its own degree of parallelism.

VPD can be used to access multiple sources using various client applications and drivers.

Supported Data Sources	Supported Client Access
Adabas IDMS IMS VSAM Logstreams MQ Series Sequential Tape zFS	Batch DSSPUFI JDBC ODBC IzODA (Spark Scala, Spark Java, Python DB API) Db2 Query Manager Facility IDAA Loader

Table 2 - Supported Data Sources and client access using VIPA

Considerations and Limits

If VPD is not used, then each client has to create a separate virtual table to access the same set of data from the source. With VPD, DVM creates one virtual table for accessing the data by all the clients.
The end-to-end read operation has to be carried out by each client application. With VPD, the data is read from the source once and the same data is consumed by each client application in parallel.
Input devices like tape can be read-only serially and hence parallelization is not possible. Each of the client applications has to read the data from source to destination serially.
With VPD, the DVM can read the data from tape serially once whereas the client applications can access the data in parallel from DVM buffers.
INPUT devices like Tape can be read-only serially and hence parallelization is not possible. Each of the client applications has to read the data from source to destination in serial. With VPD, the DVM can read the data from tape serially once whereas the client applications can access the data in parallel from DVM buffers.
Client applications that need to read data in a specific order, have to read the data from an original data source in its' specific order. Even when the data can be read from the source in parallel, VPD allows the data to be read into DVM buffers in parallel. Client applications can then read the data from buffers in any specific order.

Workload Management

Workload management is critical to ensure optimal performance across different workloads. Performance goals for various workloads can be defined using the IBM Workload Manager for z/OS (WLM) with the ability to further assign the required importance of each workload in business terms. Resources are assigned to specific work items to determine the ability to attain goals through continuous monitoring where resource adjustments are made to achieve wanted business objectives.

IBM Z resources are assigned based on goals defined in the IBM Workload Manager shown in Figure 4.

Figure 4 - Resource priority assignments by WLM

Business priority can be specified for the WLM services in the DVM Server. Specific service classes are used to inform the operating system of specific performance goals and priority levels shown in Figure 5.

Figure 5 - Resource priority assignments by WLM

In the service class, you assign each goal and its relative importance and associate the service class with a specific workload and resource group. The DVM Server uses the following service classes:

AVZ_SCHI ZIIPCLASS=AVZ High priority. This service class is for IBM Data Virtualization Manager for z/OS critical work. Assign this class goal as close to SYSSTC as possible.
AVZ_SCNM ZIIPCLASS=AVZ Normal work. This class is for IBM Data Virtualization Manager for z/OS administrative work. Assign this class the same priorities as those used for DB2 or the IMS control region.
AVZ_SCTX ZIIPCLASS=AVZ Client work. This service class is for client requests. Assign this class the same goals as those supporting the data source for the CICS, IMS/TM or DB2 WLM address space.

To enable the WLM policy for the DVM Server, the DVM user ID (default: AVZS) can have UPDATE access to MVSADMIN.WLM.POLICY. If the WLM policy is not defined for the DVM server, then WLM assigns the lowest service class SYSOTHER to DVM workloads, which negatively impacts the DVM Server performance.

Configuring WLM for the DVM Server

The Workload Manager policy can be enabled in AVZSIN00 configuration members.

If DoThis then

“MODIFY PARM NAME(WLMFORCEPOLICY) VALUE(YES)”

“MODIFY PARM NAME(WLMTRANNAME) VALUE(APPLNAME)”

“MODIFY PARM NAME(WLMUSERID) VALUE(AVZS)”

End

WLM definitions can be set manually, as well: https://www.ibm.com/support/knowledgecenter/en/SS4NKG_1.1.0/havada10/topics/dvs_sg_con_workload_manager.html

Working with multiple DVM servers

To handle more workloads and ensure organizational service level objectives, more DVM servers can be instantiated and the server workload can be balanced across multiple DVM servers. Load balancing allows inbound connections to be automatically directed to the DVM Server instance that has the most available resources for the number of connections. The overall availability of virtual storage (less than or greater than 16 Megabytes reference point) determines which DVM Server instance handles an individual request.

Managing Workload within a single LPAR

Load balancing is transparent to the client application. Client applications use a port number to connect to a DVM Server, which then performs Port sharing to route a request to a more optimal DVM Server as needed. TCP/IP's SHAREPORT or SHAREPORTWLM is the recommended approach to load balance workload across multiple DVM Servers within a single LPAR.

Managing Workload over multiple LPARs, Regions or Sysplex

When balancing workload across a Sysplex, Dynamic Virtual IP Addressing (DVIPA) can be leveraged as it provides workload balancing and failover for applications that use TCP/IP services. With SHAREPORT, SHAREPORTWLM, and DVIPA, all the balancing is done at the TCP/IP layer and the Server is unaware that any balancing is taking place. The load balancing of CICS regions is handled differently in DVM servers using LOADBALGROUP parameter.

Using SHAREPORT and SHAREPORTWLM

Port sharing is a method to distribute workloads for DVM Servers within a z/OS LPAR. TCP/IP allows multiple listeners to listen on the same combination of port and interface. Workloads destined for this application can be distributed among the group of DVM Servers that listen on the same port. Port sharing does not rely on an active sysplex distributor implementation; it works without a Sysplex distributor. Port sharing can be used in addition to sysplex distributor operations. z/OS currently supports two modes of port sharing - SHAREPORT and SHAREPORTWLM.

SHAREPORT

Incoming client connections for a configured port and interface are distributed by the TCP/IP stack across the listeners that use a weighted round-robin distribution method based on the Server accept Efficiency Fractions (SEFs). The SEF is a measure of the efficiency of the server application, calculated at intervals of approximately one minute, in accepting new connection requests and managing its backlog queue shown in Figure 6.

Figure 6 - SHAREPORT configuration

SHAREPORTWLM

You can use the SHAREPORTWLM option instead of SHAREPORT. Similar to SHAREPORT, SHAREPORTWLM causes incoming connections to be distributed among a set of TCP listeners. However, unlike SHAREPORT, the listener selection is based on WLM server-specific recommendations, modified by the SEF values for each listener. These recommendations are acquired at intervals of approximately one minute from WLM, and they reflect the listener’s capacity to handle additional work.

Figure 7 - SHARPORTWLM configuration

WLMHEALTHREPORT, SHAREPORTWLM, and the DVM Server

The DVM Server reports a Health Status to WLM to allow WLM to better manage which DVM Server to pass an inbound connection when using SHAREPORTWLM.

The WLMHEALTHREPORT parameter must be set to YES, which is the default. An informational message is available with the Server WLMHEALTHREPORT support to indicate when the health status of the DVM Server changes.

Example

VDB1DB0724I Server VDB1 WLM health changed from 100% to 90%.

Depending on the severity or frequency of errors, the health setting % can be reduced further and extra messages issued. Once the health of the Server increases an extra message is generated similar to the following.

Example

VDB1DB0724I Server VDB1 WLM health changed from 90% to 100%.

The Server Trace Browse can be used to further investigate abrupt changes in WLM Health Status. No change is required for the DVM Server configuration member AVZSIN00 to support SHAREPORT or SHAREPORTWLM.

Sysplex Load Balancing with DVIPA

The “distributor stack” is a network-connected stack that owns a specific VIPA address and acts as the distributor for connection requests. The target stack is the owner of DVM Server instances, to which the distributing stack forwards the requests. Together, they are called “participating stacks” for the sysplex distributor.

All participating z/OS images communicate through XCF, which permits each TCP/IP stack to have full knowledge of IP addresses and server availability in all stacks. Sysplex “distributor” for z/OS-integrated intra sysplex workload balancing of DVM Servers is depicted in Figure 8.

Figure 8 - Sysplex Load balancing using DVIPA

Sysplex “distributor” provides an advisory mechanism that checks the availability of DVM Servers running on separate z/OS images in the same sysplex and then selects the best-suited target server for a new connection request. The Sysplex distributor bases its selections on real-time information from IBM Workload Manager (WLM). Sysplex distributor also measures the responsiveness of target servers in accepting new TCP connection setup requests, favoring those servers accepting new requests.

When the selection of the target stack is complete, the connection information is stored in the sysplex distributor stack to route future IP packets that belong to the same connection as the selected target stack. Routing is based on the connection (or session) to which IP packets belong, which is known as connection-based routing.

Configuring DVIPA and Sysplex Distributor are done within the TCP/IP stack itself and there are no components in DVM server, which need to be configured. Once the Sysplex has been configured to enable dynamic linking, any inbound connections can issue a CALL to WLM to check which is the ideal stack to route the connection to.

Configure DVIPA by using IBM documentation and define a VIPADYNAMIC section in the TCP/IP profile:

VIPADYNAMIC

VIPADEFINE 255.255.255.0 10.17.100.60

VIPADISTRIBUTE

DISTMETHOD BASEWLM 10.17.100.60

PORT 2200

DESTIP

192.168.1.1

192.168.1.2

ENDVIPADYNAMIC

10.17.100.60 is the DVIPA address. The further definition (DISTMETHOD BASEWLM) states to do WLM Distribution of all inbound requests coming into port 2200. Plan on having one DDVIPA address and one non-DDVIPA address for those applications and connections not needing broadcast. To have DVM Server’s TCP/IP listener listen on two IP addresses (one DDVIPA and one non-DDVIPA), the following “extra” parameters need to be set:

IF DoThis then

“MODIFY PARM NAME(ALTERNATEIPADDRESS1) VALUE(10.17.100.60)”

“MODIFY PARM NAME(DVIPABINDALL) VALUE(YES)”

END

It is recommended to use the ALTERNATEIPADDRESS1 as the DDVIPA address. ALTERNATEIPADDRESS2 is another optional parameter that can be used to specify a third IP address.

Load Balancing with CICS regions

The DVM Server manages Load balancing for CICS regions using the LOADBALGROUP parameter in the CICS definition for the CICS server configuration member. Define the following statements in IN00:

“DEFINE CONNECTION NAME(AAAA)”,”GROUP(AAAA)”, “ACCESSMETHOD(IRC)”,”NETNAME(SDBAAAA”, “INSERVICE(YES)”,”PROTOCOL(EXCI)”, “APPLID(CICSJ)”,”LOADBALGROUP(LBG1)” “DEFINE CONNECTION NAME(BBBB)”, “GROUP(BBBB)”,”ACCESSMETHOD(IRC)”, “NETNAME(SDBBBBB)”,”INSERVICE(YES)”, “PROTOCOL(EXCI)”,”APPLID(CICSL)”, “LOADBALGROUP(LBG1)”

This causes DVM Server to decide which CICS region belongs to the same LOADBALGROUP (Either CICSJ or CICSL) to send the request to (usually round-robin). If one CICS belonging to the LOADBALGROUP becomes INACTIVE (For example CICSJ), then the DVM Server sends a new CICS request to the other CICS (CICSL) which is part of the same LOADBALGROUP. There could be many CICS regions for the same LOADBALGROUP.

Db2-Direct and IMS-Direct

The Db2-Direct and IMS-Direct are features provided by the DVM Server to directly access the backend datasets of Db2 for z/OS and IMS by bypassing the respective database managers for improved performance and reduced elapsed time. This feature can be used for READ-ONLY operations, which do not require data integrity of the latest database UPDATE operations similar for analytical queries. Figure 9 illustrates this bypass of the underlying database subsystems for I/O.

Figure 9 - DVM server bypass of database I/O subsystems

Db2-Direct

Db2-Direct is a DVM Server access method that reads Db2 VSAM linear datasets directly, instead of accessing the data through traditional Db2 APIs. Large data pulls can be performed in service request block (SRB) mode with MapReduce and Virtual Parallel Data features without any prerequisite processing, such as the collection of statistics that use the DVM command DRDARange. Db2-Direct allows READ-ONLY access to the data and it provides a significant benefit in performance and reduced elapsed time in processing analytical queries.

Transactional integrity is not guaranteed due to the absence of record-level locking during reading activity. Security is managed by using Db2 table authorization. If Db2-Direct table security is disabled with the DISABLEDB2DIRSEC parameter in the INOO, unauthorized users by using Db2 APIs can gain access to Db2 data.

Considerations and Limitations:

Db2-Direct does not support tables with edit procedures or SQL statements containing joins, LOB columns, or key columns.
The Db2 subsystem hosting a Db2 table must be active when Db2-Direct enabled tables are loaded or refreshed. The MAP building requires Db2 system access to identify data set information in the Db2 system catalog.
The DVM Server requires READ access to the Db2 VSAM linear datasets and that datasets are available during map load or refresh for the virtual table.
Virtual tables enabled for Db2-Direct must include all the columns defined in the base Db2 table as the column information is not available while loading directly from DB2 VSAM linear datasets.
If Db2-Direct table security is enabled, the Db2 subsystem must be available to check security at SQL query time.
If Db2-Direct security is disabled, the CCSIDs of table columns assumed based on the ENCODING_SCHEME (EBCDIC, Unicode, ASCII) of the table.
Users can check the DVM server trace logs for the following messages to confirm whether DB2-Direct is enabled or used.
Startup: DB2 direct processing enabled for <map-name>
Running: Processing table that uses DB2 direct
Failure: DB2 direct processing disabled for <map-name>

By default, Db2-Direct is enabled in the DVM server. To disable the Db2-Direct feature for a virtual table, in a VTB rule, set the variable OPTBDIDD to 1. Db2-Direct can be disabled that uses the following parameters in the DVM configuration file hlq.SAVZEXEC(AVZSIN00).

Disable: “MODIFY PARM NAME(DISABLEDB2DIRECT) VALUE(YES)”

DB2-Direct configuration is available at: https://www.ibm.com/support/knowledgecenter/SS4NKG_1.1.0/havica10/topics/dvs_tsk_cfg_Db2Direct.html

IMS-Direct

The IMS-Direct feature provides MapReduce and parallelism support for accessing native IMS files. This support bypasses the requirement of having to use native IMS API calls by reading the IMS database files directly, similar to how an unload utility can work. This method provides a significant performance improvement and reduced elapsed time in processing analytical queries.

The DVM server determines the best method of access to underlying IMS data. The DVM server chooses to either activate IMS-Direct (native file support) or leverage IMS APIs. This determination is based on the database and file types supported and the size of the database. Virtual tables of IMS segments are required. IMS-Direct is supported by the following types of IMS Databases:

Hierarchical direct access method (HDAM) – VSAM and OSAM
Hierarchical indexed direct access method (HIDAM) – VSAM and OSAM
Partitioned HDAM (PHDAM) – VSAM and OSAM
Partitioned HIDAM (PHIDAM) – VSAM and OSAM
Fast Path data entry database (DEDB)

Security is managed on the IMS native data set itself when IMS-Direct is used. The USERID of the client connection must have the necessary security permissions for reading the IMS database datasets. Transactional integrity is not guaranteed due to the absence of record-level locking during reading activity.

IMS-Direct can be enabled by changing the syntax of “DontDoThis” to if “DoThis”, and then set the parameter IMSDIRECTENABLED to YES. The following parameters in the DVM configuration file hlq.SAVZEXEC(AVZSIN00) to enable IMS Direct.

IF DoThis then do

“MODIFY PARM NAME(IMSDIRECTENABLED) VALUE(YES)”

“MODIFY PARM NAME(IMSDIRECTBUFFERSIZE) VALUE(1024)”

“MODIFY PARM NAME(ACIINTSEGMP256) VALUE(200)”

“MODIFY PARM NAME(TRACEIMSDBREFRESH) VALUE(YES)”

“MODIFY PARM NAME(TRACEIMSDIRSTATS) VALUE(YES)”

“DEFINE IMSDBINFO”,

. . .

END

Enabling IMS-Direct

MapReduce must be enabled in the IN00 configuration file when IMS-Direct is turned on. If you are performing an INSERT, UPDATE, or DELETE IMS-Direct switches to DBCTL automatically even when IMS-Direct is enabled. Users can enable the trace option in the IN00 configuration file with the following parameters to confirm IMS-Direct is used while querying the table.

Enable: “MODIFY PARM NAME(TRACEIMSDIRTASKSETUP) VALUE(YES)”

IMS-Direct configuration can be found at the following Url: https://www.ibm.com/support/knowledgecenter/en/SS4NKG_1.1.0/havica10/topics/dvs_sql_tsk_cfg_DVS_IMS_Direct.html

JDBC Performance

Modern applications can connect to the DVM server that uses Java Database Connectivity (JDBC). This standard Java API is used for database-independent connectivity between the Java programming language and a wide range of databases. There are specific properties available to help influence performance characteristics. Data buffering, parallelism, and MapReduce are specific DVM server functions that can be dictated as part of the JDBC connection string used by the client application to significantly improve overall performance.

JDBC and Buffering Data

Sending large amounts of data to the client or the server, one way to optimize performance is by choosing the appropriate size of the buffer that the driver uses to send data. The JDBC driver communicates with the DVM server using the Communication Buffer (CMBU) protocol, which specifies the number of buffers to be used. The maximum size of a buffer is set using the MaximumBufferSize (MXBU) JDBC property and the NETWORKBUFFERSIZE parameter in the server configuration file. The application can use the MaximumBufferSize JDBC property to optimize the query performance. The actual buffer size value that the JDBC driver uses is the result of a handshake between the driver and server.

When the driver logs on to the server, it requests for the MaximumBufferSize. If the MaximumBufferSize is greater than NETWORKBUFFERSIZE, then the server tells the driver to use the NETWORKBUFFERSIZE in the logon response. Setting the MaximumBufferSize value, consider the distance between the client and the server as network latency can harm performance. The buffer size that works best for clients and servers that are closer in proximity (low latency) need not be the buffer size that works best when clients and servers are not near (high latency). Figure 10 illustrates how an INSERT statement is sent using multiple buffers to the DVM server.

Figure 10 - JDBC driver sending multiple buffers for INSERT

When you execute a large SQL INSERT statement or a batch of INSERT statements, the statements are divided into multiple data buffers that are no larger than the size you specify for MaximumBufferSize. The buffers are then sent to the server.

When you execute a large SQL INSERT statement or a batch of INSERT statements, the statements are divided into multiple data buffers of buffer size negotiated with the DVM server. The buffers are then sent to the server. Figure 11 shows the server sending the client rows of data in a single buffer:

Figure 11.

DVM server returns one buffer at a time

After running an SQL SELECT statement, the server divides and returns the requested rows of data in buffers that has been sized, one buffer at a time. The client can call next() on the result set until all rows are read. When the next() call no longer has a row to read from the buffer, the driver issues a request to the server for another row buffer. This cycle continues until the client reads all rows from all buffers. The following code provides a sample implementation:

Connection con = DriverManager.getConnection

(“jdbc:rs:dv://host:port;Key=Value;Key=Value;”,username,password);

Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery(“SELECT a, b, c FROM Table1”);
while (rs.next())
{

int x = rs.getInt(“a”);
  String s = rs.getString(“b”);
  loat f = rs.getFloat(“c”);
  }

The client can experience a pause during the next() API calls when all rows in the current buffer are read and the driver fetches another buffer of data. This pause can be avoided by enabling the parallel I/O.

JDBC and Parallel I/O

When Parallel I/O is enabled, the JDBC driver creates multiple and separate threads using DVM server buffers to pre-fetch, read, and insert rows into a queue. The JDBC driver works to ensure this queue is fully used to ensure there is always at least one buffer ready for the next() API call to use. Pauses are eliminated so the client can continue to call the next() buffer in the results set. The process iterates until all rows in all buffers are read and queued.

The following illustration shows row data being sent to the client using both the main and the parallel thread. The ParallelIoBufferCount (PIOBC) property determines the number of buffers in use in Figure 12.

Figure 12.

Parallel IO pre-fetches buffers

JDBC and MapReduce

MapReduce is a JDBC driver-controlled feature that can be used to improve SQL query performance by reading results with multiple connections between the driver and server. Thus, there are two types of MapReduce.

Server-controlled MapReduce – MapReduce is performed on the server using the JDBC driver.
Client-controlled MapReduce – MapReduce is performed on the client and the JDBC driver manages single connections, whereas the client manages other connections and aggregate the results from each connection.

If you are using MapReduce with RDBMS or IMS, you must complete metadata repository configuration requirements.

Server-controlled MapReduce

The data mapping step creates threads that execute a query using more than one DVM server connection. Each thread maps one connection to one server. The reduce step reads results on each thread in parallel for each connection, and then transparently presents the data to the client in Figure 13.

Figure 13. Server controlled MapReduce with a single DVM server

Configure a single server connection by setting the MapReduceClient JDBC property:

Example

MapReduceClient = (Hostname, Port, TaskCount)

MapReduceClient= (dvs.example.com, 9999, 7)

MapReduce can also be configured to use multiple server connections. Figure 14 shows a client connected to two servers with MapReduce.

Figure 14. Server controlled MapReduce with two DVM servers

Configure multiple server connections by setting the MapReduceClient property:

Example

MapReduceClient = (Hostname1, Port, TaskCount1), (Hostname2, Port, TaskCount2)

MapReduceClient = (dvs1.example.com, 9999, 4), (dvs2.example.com, 1111, 3)

MapReduce can also be controlled to use a specific range of clients:

Example (clients 1-4 and 5-7)

MapReduceClient = (Hostname, Port, maxClientNo, startClientNo, endClientNo)

MapReduceClient = (host1, 1200, 7, 1, 4), (host2, 1200, 7, 5, 7)

Client-controlled MapReduce

In some use cases, using Apache Spark, the client application can perform the MapReduce process and still benefit from the server’s MapReduce feature. The JDBC driver supports this use case by specifying a single connection as the JDBC MapReduce connection. This connection is then available for use by a group of specified connections. The JDBC driver manages single connections. The client application must aggregate the results from each connection managed in Figure 15.

Figure 15. Client controlled MapReduce with seven parallel connections

Client-controlled MapReduce can be enabled by setting the JDBC driver properties MapReduceClientCount and MapReduceClientNumber as follows:

The MapReduceClientCount property is used to specify the total number of connections associated with the group of client connections.
The MapReduceClientNumber property defines a specific client connection within the group is managed by the JDBC driver and has a value between the number 1 and the number specified for the MapReduceClientCount property.
The JDBC driver executes queries using the single MapReduce connection for the client connection specified in the MapReduceClientNumber property.
Ddata for the specified connection are returned, as opposed to using MapReduceClientCount over one or more connections to get all rows of data.
To configure client-side MapReduce, set the JDBC driver MapReduceClientNumber and MapReduceClientCount parameters as follows:

Example
MapReduceClientNumber, MapReduceClientCount
MapReduceClientNumber=2; MapReduceClientCount=7;

The JDBC driver creates a single connection, indicated as connection 2 out of 7 available connections. Using a framework like Apache Spark, you must create and manage all remaining connections and aggregate the results from each of those connections.

ODBC Performance

The ODBC drivers are used by non-Java applications and tools to access data that is made available through the DVM server. The ODBC driver implements the ODBC Direct network protocol that is used to connect to the Data Virtualization Manager server and uses the ODBC API to execute SQL queries. The ODBC driver for the Windows platform is bundled in the driver installation package of DVM. For Linux/Unix platforms, the ODBC driver is not included in the driver installation package and a third-party driver manager can be installed.

The performance of ODBC drivers with DVM can be optimized with Connection Pooling and Optimized Fetch. The connection pooling in the Windows platform can be configured through the ODBC Data Source Administrator. For Unix/Linux platforms, connection pooling is managed by the ODBC Driver Manager. When optimized fetch is enabled, rows ahead of the current row are asynchronously extracted before the client application requests them. This data is then returned to the client application in blocks that can be as large as 32 KB. Enabling optimized fetch helps to minimize network traffic and speeds subsequent fetches as the requested data is likely already in a returned block.

Optimized fetch is enabled either by including the RO=YES connection property in a connection string (a connection string can be used with a DSN) or by appending the FOR FETCH ONLY clause to a SELECT statement.

The performance of ODBC connected applications can also be improved by the appropriate usage of catalog functions, retrieve the required data, select functions that optimize performance, and manage connections efficiently. For example, catalog functions are usually expensive, and hence minimize the use of catalog functions and avoid searching patterns in catalog functions can improve performance. Similarly, using bound columns (for example SQLBindCol instead of SQLGetData) and using SQLExtendedFetch instead of SQLFetch can help to improve the performance

Integrated Data Facility (IDF) and DS Client API

Data Virtualization Manager for z/OS also provides an interface to access virtualized data from within more traditional mainframe languages such as COBOL, Natural or PL/I. With Db2

Integrated DRDA Facility (IDF), the traditional mainframe applications can use standard EXEC SQL statements to access legacy data sources like VSAM, IMS, and ADABAS. The DS Client high-level API allows an application running on z/OS to use a call-level interface to communicate with Data Virtualization Manager to process SQL requests and to retrieve result sets.

The performance of Db2 IDF and DS Client APIs is similar for single user address space applications. However, Db2 IDF performed much better than DS Client API for multiuser environments like CICS and IMS. Also, the applications can use the popular API with Db2 IDF compared to specific high-level API with DS Client. Hence, it is recommended to use Db2 IDF to access data sources from traditional mainframe languages when Db2 is available in the customer environment. For those customer environments where Db2 is not available, DS Client APIs can be used.

The performance of DS Client API can be optimized with data buffering and MapReduce features similar to JDBC. The number of active client interface servers to process client requests can also be optimized by tuning the parameter ACIDVCLIENTMIN. The maximum number of MapReduce tasks that starts to process a request from a DS Client application can be specified using ACIMAPREDUCECLIMAX.

Similarly, the default size of the buffer, the maximum and minimum buffer size, and the maximum number of buffers can also be specified with DSCLIENTBUFFERSIZE, DSCLIENTBUFFERSIZEMAX, DSCLIENTBUFFERSIZEMIN, and DSCLIENTBUFFERNUMMAX respectively.

To configure the DS Client API for optimized performance, refer to the DS Client high-level API chapter of IBM documentation.

Query Optimization and Performance

Data Virtualization Manager for z/OS provides default values that are part of the standard installation in the IN00 configuration file that serves as standard level optimization. These defaults are fine-tuned over years of customer experience.

QL Best Practices
Though increasing resources is an easy option, it is worthwhile to invest time into writing efficient SQL statements for the DVM server to consume. The DVM server is an SQL engine working over 35+ different legacy data sources, which don't conform to relational database rules. A best practice is to conform to the most compliant SQL that matches the source data system. This includes not only SQL, but also functions, routines, and data type.

Examples:

Select only the required fields
Understand how a Large and Small table is being JOINED
Avoid scalar, arithmetic process into the SQL’s

Some best practices:

Simple predicate rewrites cause the optimizer to generate significantly different query plans and force a different access path, which result in getting the best out of the SQL
Avoid tablespace scans when your goal is to fetch a much smaller subset of data
Sort only on the columns needed
Minimize the number of times cursors are opened and closed

Designing SQL is an art form. While many different statements result in the same output, there are wild variances in the execution time based on how the query is constructed. Table 3 provides some guidance for creating queries.

Table 3 - Best practices for query design

Using Apache JMeter with the DVM server

The Apache JMeter™ application is open source software, a 100% pure Java application designed to load test functional behavior and measure performance. Customers can use JMeter to measure the performance of the services and queries run on the DVM server. Apache JMeter can be download from the following Url: https://jmeter.apache.org/

Copy the DVM for z/OS client JDBC driver into the lib directory of the Apache JMeter installation directory. The JDBC driver for DVM is available with the driver installation member AVZBIN2. From the mainframe, transfer the driver installation member has.SAVZBIN(AVZBIN2) to the client machine using the File Transfer Protocol (FTP) in binary mode. Rename the file to JDBCdriver.zip and extract its content into the lib directory of the Apache JMeter installation directory.

Add a Thread Group

Example

Name: AVZ1 JDBC Thread Group Number of Threads (users): 1 Ramp-Up Period (in seconds): 1 Loop Count: 1000

Add a JDBC Connection Configuration

Add JDBC Request

Example

Name: JDBC Request Variable Name Bound to Pool: AVZ1 Server Query Type: Select Statement Query: “SELECT * FROM STAFFVS”

Add a Summary Report

The summary report provides the number of samples, average elapsed time, maximum elapsed time, minimum elapsed time, Throughput, and much more information to evaluate the performance of the JDBC query. Now the configuration is completed and ready to start the performance test.

Run a Performance Test

Once the Thread Group, JDBC Connection Configuration, JDBC Request, and Summary Report are configured, a performance test can be run as shown in Figure 16.

Figure 16 - Apache JMETER

Based on the number of samples set, JMeter runs the JDBC query for the number of iterations defined and provide the average, minimum, and maximum elapsed time and throughput for the performance evaluation. The performance evaluation can be carried out further by testing with different query types and the number of connections.

Performance testing using the Command Line Tester (CLT)

You can use the Command Line Tester (CLT) for evaluating the performance of specific queries, understand the number of rows, bytes read, elapsed time, etc. CLT was developed by IBM specifically for DVM for z/OS and works to simplify development, troubleshooting, and diagnosing problem queries. This tool also is leveraged through RESTful API calls.

No limit on the number of rows processed  Reports summarized execution time for data transfer
Isolates performance issues from application requesters
Uses ANSI SQL  Flexible OUTPUT options
Command line driven
Validated through a RESTful API

This is the preferred approach because it is more of a natural measure for the actual query execution and can be requested from IBM Support and downloaded for your use. It is not currently included in the DVM for z/OS packaged software or as part of PTF maintenance releases. CLT is not a host application that runs on the mainframe, but rather a Java-based application that executes from a command prompt on MS-Windows or a terminal on Linux in Figure 17.

Figure 17. CLT output reading a VSAM file with 4,000,000 records

It is encouraged to avoid using the DVM Studio for query performance evaluation. The DVM Studio has limits for the number of rows that can be displayed, as well as the amount of HEAP space available.

Setup

This package comprises the test harness jar files and includes log4j2. The easiest approach is to copy the JDBC driver jar (for example, dv-jdbc-3.1.xxx.jar) from your production area to the same directory as these tester jars.

Basic Usage

Run the tool using Java from a command prompt.

Example:

Windows: java -cp .;* com.rs.jdbc.dv.CommandLineTester Unix/Linux: java -cp .:* com.rs.jdbc.dv.CommandLineTester <options>

Options

Use the --help option to display the usage options.

Example (run a query and display the elapsed time):

java -cp .;* com.rs.jdbc.dv.CommandLineTester –help

java -cp .;* com.rs.jdbc.dv.CommandLineTester --url "jdbc:rs:dv..." --sql "SELECT * FROM VT"

Multi-threaded MapReduce client test using MRCC/MRCN

Use the CommandLineMrcnTester class to run a multi-threaded MapReduce client test using MRCC/MRCN:

Example:

java -cp .;* com.rs.jdbc.dv.CommandLineMrcnTester --url "jdbc:rs:dv://host:1234; MRCC=4; ..." --sql "SELECT ..." --verbose

The URL must include the MapReduceClientCount/MRCC property, and then the tool adds the appropriate MapReduceClientNumber/MRCN setting for each separate connection. The --verbose flag displays the per-thread fetched row count and other details.

[{"Type":"SW","Line of Business":{"code":"LOB10","label":"Data and AI"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SS4NKG","label":"IBM Data Virtualization Manager for z\/OS"},"ARM Category":[{"code":"a8m0z000000cxAOAAY","label":"SQL Engine"}],"Platform":[{"code":"PF035","label":"z\/OS"}],"Version":"All Version(s)"}]

Product Synonym

DVM

Was this topic helpful?

Document Information

More support for:
IBM Data Virtualization Manager for z/OS

Component:
SQL Engine

Software version:
All Version(s)

Operating system(s):
z/OS

Document number:
6452637

Modified date:
01 September 2021

UID

ibm16452637

IBM Support

Tips

Performance Tuning and Query Optimization

White Papers

Abstract

Content

Product Synonym

Was this topic helpful?

Document Information

UID

Share your feedback

Need support?