Project Survey

White Papers

Abstract

This white paper captures information about the customer environment and workloads, and about the processing, storage, and memory impact of running Data Virtualization Manager for z/OS in that environment.

Content

The program delivers a virtualization architecture for accessing both z/OS and non-z/OS data sources. It transforms hierarchical data formats into a relational format that SQL-based and modern applications can access. The program generates Virtual Tables and Virtual Views that logically map to files, segments, and data records residing on an IBM Z LPAR or on distributed database servers.
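As a rough illustration of the hierarchical-to-relational mapping described above, the sketch below flattens one parent record with repeating child segments into flat rows, one per child. It is not how DVM implements Virtual Tables; the function name `flatten_segments` and the sample customer record are hypothetical and exist only for this example.

```python
from typing import Any


def flatten_segments(parent: dict[str, Any], child_key: str) -> list[dict[str, Any]]:
    """Turn one hierarchical record into relational rows, one row per child segment.

    Hypothetical helper for illustration only; it does not reflect DVM internals.
    """
    # Columns that belong to the parent are repeated on every output row.
    parent_columns = {k: v for k, v in parent.items() if k != child_key}
    children = parent.get(child_key, [])
    if not children:
        # A parent without children still produces a single row.
        return [parent_columns]
    return [{**parent_columns, **child} for child in children]


# Hypothetical hierarchical record: a customer with repeating order segments.
customer = {
    "cust_id": 1001,
    "name": "ACME",
    "orders": [
        {"order_id": 1, "amount": 250.0},
        {"order_id": 2, "amount": 75.5},
    ],
}

rows = flatten_segments(customer, "orders")
for row in rows:
    print(row)
```

Each output row carries the parent columns alongside one child segment, which is the shape a SQL client would expect from a relational table.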

Business Drivers

Foremost, this data virtualization service provides shared access to data residing on IBM Z that is difficult to reach and work with. DVM for z/OS dramatically reduces the work associated with ETL operations by allowing direct, real-time access to persisted data. This data represents the Source of Record for many critical business applications. Data Virtualization Manager for z/OS (DVM) supports both z/OS and non-z/OS data sources and delivers scalability, reliability, and extreme performance through its machine-specific optimization. DVM also supports READ and WRITE SQL operations in parallel by using MapReduce. DVM is zIIP-eligible, which offloads MIPS from General Processors and in turn reduces the cost of running production applications. DVM lowers TCO and provides high ROI and fast TTM, which are critical for driving business agility.

Service level agreements or objectives also influence infrastructure, topology, and overall design decisions for data virtualization needs. In many instances, downtime is an inhibitor to introducing new technology to an existing technology stack.

One of the strengths of this solution is its simple and flexible deployment, which is resident on IBM Z. The program has High Availability capabilities that provide resiliency and improved fault tolerance when network, software, or hardware failures occur, as well as the ability to update or refresh software functionality with zero downtime for software refreshes or fixes.

Questions

What are the primary pain points for the business when it comes to accessing z/OS data?
Is an outage acceptable for the initial setup?
What are business requirements for application response times?
What are business requirements around resiliency or availability of the solution if a given LPAR ceases to function?
What are business requirements around User, Group, and Role level security?

Virtualization Topology

The solution offers strengths that add simplicity and flexibility across your IBM-driven solutions for hybrid cloud infrastructure. Installation is Z-resident and straightforward, with setup and installation taking hours rather than days. DVM is Z-resident IBM software that must be installed on a designated LPAR. Configurations that need High Availability require multiple installations and corresponding licenses to support failover and failback when an outage or disaster occurs.

The solution can run multiple DVM Servers concurrently to aid overall performance. File system requirements therefore need to be sized to support the number of DVM Servers, based on transactions per hour and the amount of user concurrency.

*If you have an existing or planned DVM architecture and are able to share it, we can take it into account for the current and future design and be in a better position to meet your business requirements.

Distance between the DVM Server(s) and data sources

We are also looking for information about interconnectivity, security, site-level HA requirements, and specific characteristics of existing and planned IBM systems that might participate in data virtualization.

Questions

What is the distance between the IBM Z systems in a Sysplex that participate in data virtualization?
If there are non-Local data sources, what is the distance between the DVM server and the target database server?
What is the distance between database servers planned to support HA for the data virtualization environment?
What is the current cross-site network bandwidth between these servers, and is it dedicated or shared with multiple applications?
(Insufficient network bandwidth can increase latency, which increases both response and execution time.)
What type of network configuration is in place between existing systems targeted for data virtualization, and are there specific security requirements, such as SSL?
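To make the bandwidth and distance questions above concrete, the following sketch gives a back-of-the-envelope estimate of how long a result set takes to cross a link. It is a simplified model that assumes decimal megabytes, a fixed number of request/response round trips, and no protocol overhead or contention. The function name and the sample figures are hypothetical and are not measurements of any DVM environment.

```python
def estimated_transfer_seconds(
    payload_mb: float,
    bandwidth_mbps: float,
    rtt_ms: float,
    round_trips: int = 1,
) -> float:
    """Estimate transfer time as transmission time plus round-trip latency.

    Simplified model for illustration: ignores protocol overhead,
    congestion, and any processing time on either end.
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    transmission = payload_mb * 8 / bandwidth_mbps  # megabytes -> megabits
    latency = round_trips * rtt_ms / 1000           # milliseconds -> seconds
    return transmission + latency


# Hypothetical case: a 500 MB result set, 20 ms round-trip time, 10 round trips.
dedicated = estimated_transfer_seconds(500, bandwidth_mbps=1000, rtt_ms=20, round_trips=10)
shared = estimated_transfer_seconds(500, bandwidth_mbps=100, rtt_ms=20, round_trips=10)

print(f"1 Gbps link:   {dedicated:.1f} s")  # 4.2 s
print(f"100 Mbps link: {shared:.1f} s")     # 40.2 s
```

Under these assumptions, a tenfold drop in available bandwidth, for example on a link shared with other applications, raises the estimate by almost the same factor, which is why the survey asks whether the link is dedicated or shared.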

Primary use cases

Our data virtualization solution supports a range of use cases, including shared access, centralized management, and high availability with disaster recovery support, provided through continuous delivery since our initial release. Use cases continue to surface as the data landscape changes in both the market and customer environments. The following are some of the primary use cases in use today.

Real-time access to disparate data across z/OS and non-z/OS data sources
Modernize existing mainframe applications to enable web-based and mobile applications for access
Centralize access with control to drive governance for a trusted view of all enterprise data
Low-level data integration enables the copying of data by way of SQL-based operations
Leverage data virtualization as a methodology for application development/incubation of production-level prototypes
Establish a single View for purposes of driving operational and business analytics
Forge new business models by employing Machine Learning algorithms and Artificial Intelligence over shared data

Questions

Do you plan to use data virtualization for workload balancing to offload queries to 1 or more sources?
Do you anticipate using data virtualization as part of your build/test/deployment for new applications through a logical Data Model targeted for development, test, or quality assurance?
Do you plan to use data virtualization to reduce business interruption or outages due to planned or unplanned maintenance windows? If so, what are your SLA/SLO ranges across multi-tier business applications?

Physical Storage or Memory

To support data virtualization processing, DVM can use more physical storage or physical memory to achieve optimal query runtime or write performance. As data transformations, pushdown operations, JOINs, UNIONs, and Functions are executed, more resources can be required for both the pre-processing and post-processing of a normal operation.

Questions

What is the range of daily transactional volume for virtualized data across systems targeted for replication?
Are you open to leveraging more zIIP processing to help in achieving your performance goals?
Would you like to have flexibility around the number of active DVM servers processing read/write operations to drive improved parallelism for generalized workloads beyond local processing occurring independently of the DVM Server?
Would you like this to be configurable within the user interface at initial configuration, or as part of managing an active data virtualization environment in production? Is a managed outage or failover to a secondary acceptable to perform this operation?

Environment

Hardware Configuration

Specify the memory requirements as Real Storage assigned to an LPAR, in gigabytes. If the product is installed on more than two LPARs, specify the largest and smallest configurable memory. Table 1 can be used as a template for capturing your development, test, and production environments.

LPAR	Model	No. of GPPs	No. of zIIPs	Memory (GB)	Comments

Table 1. Hardware configuration
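If the Table 1 entries are collected programmatically, a small structure such as the following sketch can hold them and report the smallest and largest memory configuration. The class name `LparConfig`, the helper `memory_range`, and all sample values are hypothetical placeholders, not recommended sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LparConfig:
    """One row of Table 1 (field names are illustrative)."""
    lpar: str
    model: str
    gpps: int          # number of General Purpose Processors
    ziips: int         # number of zIIPs
    memory_gb: float   # Real Storage assigned to the LPAR, in GB
    comments: str = ""


def memory_range(lpars: list[LparConfig]) -> tuple[float, float]:
    """Return (smallest, largest) memory in GB across the given LPARs."""
    if not lpars:
        raise ValueError("at least one LPAR is required")
    sizes = [lpar.memory_gb for lpar in lpars]
    return min(sizes), max(sizes)


# Placeholder inventory; replace with your own environment details.
inventory = [
    LparConfig("DEV1", "model-A", gpps=2, ziips=1, memory_gb=32, comments="development"),
    LparConfig("TST1", "model-A", gpps=4, ziips=2, memory_gb=64, comments="test"),
    LparConfig("PRD1", "model-B", gpps=8, ziips=4, memory_gb=128, comments="production"),
]

smallest, largest = memory_range(inventory)
print(f"Memory range: {smallest} GB to {largest} GB")
```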

z/OS Environment

Specify support software programs targeted for use with data virtualization. Add more configurations by using Table 2.

Configuration	Y/N	Details
Security
ENQ Manager
VSAM RLS
Innovation IAM
CDC for Db2
CDC for VSAM
z/OS Connect Enterprise Edition
CICS
IBM Cloud Pak for Data
Other(s)

Table 2. z/OS environment information

Access to Data Sources

Client connections to DVM for z/OS
Specify applications or client tools accessing the DVM server for SQL-based read and write operations by using Table 3.

Product	Product Version	OS	OS Version	REST	JDBC	ODBC
IBM DataStage
IBM Query Management Facility
Informatica PowerCenter
Microsoft Excel
Microsoft Power BI
Tableau
Other(s)

Table 3. Client connection information

Data Sources

Specify the data sources that need to be virtualized and available to the mainframe, Java, ETL, analytic, and reporting tools by using Table 4.

Product	Version	Platform	Data Source	Workload	Platform
Adabas			Oracle Cloud
Db2 distributed			Netezza
Db2 for z/OS			Postgres
Hadoop (Apache, HDP, CDH, GP)			SQL Server
IDMS			Azure
IMS			AWS Redshift
MongoDB			Sequential Files
MQ Series			SMF
MySQL			SYSLOG
Oracle			Teradata
Oracle Exadata			VSAM or CICS/VSAM

Table 4. Data sources

Application Workloads

Workloads are usually separated into online workloads and batch workloads. Online workloads can be further divided by line of business. An Active/Active workload has a stricter definition. The active workload is a business-related definition. It is the aggregation of these items:

Software is a user-written application and the middleware runtime environment
Data is a set of related objects that have transactional consistency maintained, and optionally, referential integrity constraints preserved
Network connectivity is one or more TCP/IP addresses or hostnames and ports (for example, 10.10.10.1:80)

This definition is intended to preserve the transactional consistency of the data. Data Virtualization supports all CRUD operations supported by the underlying data sources. Support for underlying data sources includes data type mapping, function mapping, and optimized transformations into Virtual Tables and Virtual Views. The DVM solution receives client requests and performs costing calculations and parsing operations to optimize the round-trip response and execution time for a workload.

Applications can select subsets of data or a complete data set through a subset of Virtual Tables or Virtual Views within or across schemas. Processed workloads can have multiple query plans that include transform, pushdown, and JOIN operations, each occurring at a particular daily frequency and volume. These applications are critical to business operations, and this solution works to optimize query execution and response time for requesting applications.

Questions

What types of application workloads are active in your environment (batch, transactional, ISPF, and so on)?
Do you have a maintenance window for deploying application upgrades to the production system during which you take the server offline?
Do you perform DDL operations for specific workloads or have write-intensive applications?
How frequently do you delete data (and how much)? Truncate? Entire table? Subset with a WHERE clause? How do you typically load data or perform bulk operations (for example, external table)?
Do you leverage indexes or primary keys for your data models?

Complete the table for the best representation of critical workload types. A sample entry is provided in the first row of Table 5.

Workload Name	Volume of Data	Transaction Rate	Workload Characteristics	Response Objectives	Execution Objectives	Description
Workload1 {sample}	Daily volume: MB/TB % Annual Growth	24-hr period 8-hr workday Batch Load Frequency Timeframe	Avg. TXN size Total # of Tables % Inserts % updates % deletes	Seconds Minutes Hours Days	Milliseconds Seconds Minutes Hours	Describe workload Concurrency Type of queries Columns per Table
Workload2
Workload3
Workload4
Other(s)

Table 5. Workload information
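When the Table 5 figures are available, a few derived numbers help compare workloads. The sketch below turns a daily transaction count and the insert, update, and delete percentages into an average transaction rate and a read/write split. It assumes that any share not counted as an insert, update, or delete is a read, which is a simplification. The class `WorkloadProfile` and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    """A subset of one Table 5 row (field names are illustrative)."""
    name: str
    transactions_per_day: int
    pct_inserts: float
    pct_updates: float
    pct_deletes: float

    def __post_init__(self) -> None:
        parts = (self.pct_inserts, self.pct_updates, self.pct_deletes)
        if any(p < 0 for p in parts) or sum(parts) > 100:
            raise ValueError("write percentages must be non-negative and sum to at most 100")

    @property
    def pct_writes(self) -> float:
        return self.pct_inserts + self.pct_updates + self.pct_deletes

    @property
    def pct_reads(self) -> float:
        # Assumption: everything that is not a write is a read.
        return 100 - self.pct_writes

    def average_tps(self, window_hours: float = 24.0) -> float:
        """Average transactions per second if the daily volume arrives within the window."""
        if window_hours <= 0:
            raise ValueError("window_hours must be positive")
        return self.transactions_per_day / (window_hours * 3600)


# Placeholder workload; replace with figures from your own environment.
sample = WorkloadProfile("Workload1", 864_000, pct_inserts=10, pct_updates=15, pct_deletes=5)

print(f"{sample.name}: {sample.pct_reads}% reads, {sample.pct_writes}% writes")
print(f"Average rate over 24 hours: {sample.average_tps():.1f} TPS")
print(f"Average rate over an 8-hour workday: {sample.average_tps(8):.1f} TPS")
```

The same daily volume produces a three times higher average rate when it is concentrated in an 8-hour workday, which is why the table asks for the time period as well as the volume.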

Product Synonym

DVM

Document Information

Modified date:
15 July 2022

UID

ibm16447764
