White Papers
Abstract
This white paper captures information about the customer environment and workloads, focusing on the processing, storage, and memory impacts of running Data Virtualization Manager for z/OS.
Content
The program delivers a virtualization architecture for accessing both z/OS and non-z/OS data sources. It performs transformations that translate hierarchical data formats into a relational format that SQL-based and modern applications can access. The program generates Virtual Tables and Virtual Views that logically map to files, segments, and data records residing in an IBM Z LPAR environment or on distributed database servers.
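As a rough illustration of what mapping hierarchical data to a relational shape means, the following Python sketch flattens one parent record with repeating child segments into flat rows. It is not how DVM is implemented; the record layout, the field names, and the `flatten` helper are all hypothetical and exist only for this example.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical hierarchical record: a parent customer segment with
# repeating child order segments (all names invented for illustration).
customer_record: Dict[str, Any] = {
    "cust_id": 1001,
    "name": "ACME",
    "orders": [
        {"order_id": 1, "amount": 250.0},
        {"order_id": 2, "amount": 75.5},
    ],
}


def flatten(record: Dict[str, Any], child_key: str) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per child segment, repeating the parent fields."""
    parent = {k: v for k, v in record.items() if k != child_key}
    for child in record.get(child_key, []):
        # Merge parent and child fields into a single relational-style row.
        yield {**parent, **child}


rows: List[Dict[str, Any]] = list(flatten(customer_record, "orders"))
for row in rows:
    print(row)
```

Each resulting row carries the parent fields alongside one child segment, which is the shape a SQL client expects when it queries a table.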
Business Drivers
Foremost, this data virtualization service provides shared access to data on IBM Z that is otherwise difficult to reach and work with. Data Virtualization Manager for z/OS (DVM) dramatically reduces the work associated with ETL operations by allowing direct, real-time access to persisted data. This data represents the Source of Record for many critical business applications. DVM supports both z/OS and non-z/OS data sources and delivers scalability, reliability, and extreme performance through its machine-specific optimization. DVM also supports READ and WRITE SQL operations in parallel by using MapReduce. DVM is zIIP-eligible, which offloads MIPS from General Processors and in turn reduces the cost of running production applications. DVM lowers TCO, provides high ROI, and enables fast TTM, all of which are critical for driving business agility.
Service level agreements or objectives also influence infrastructure, topology, and overall design decisions for data virtualization needs. In many instances, downtime is an inhibitor to introducing new technology to an existing technology stack.
One of the strengths of this solution is its simple and flexible deployment, which is resident on IBM Z. The program has High Availability capabilities that provide resiliency and improved fault tolerance when network, software, or hardware failures occur. It can also update or refresh software functionality with zero downtime for software refreshes or fixes.
Questions
- What are the primary pain points for the business when it comes to accessing z/OS data?
- Is an outage acceptable for the initial setup?
- What are business requirements for application response times?
- What are business requirements around resiliency or availability of the solution if a given LPAR ceases to function?
- What are business requirements around User, Group, and Role level security?
Virtualization Topology
The solution offers strengths that add simplicity and flexibility across your IBM-driven solutions for hybrid cloud infrastructure. Installation is Z-resident and straightforward, with setup and installation taking on the order of hours, not days. DVM is Z-resident IBM software that must be installed on a designated LPAR. Configurations that need High Availability require multiple installations and corresponding licenses to support failover and failback when an outage or disaster occurs.
The solution can run multiple DVM servers concurrently to improve overall performance. File system requirements must be sized to support the planned number of DVM servers, based on transactions per hour and the level of user concurrency.
If you have an existing or planned DVM architecture that you are able to share, we can consider it for current and future design and be in a better position to meet your business requirements.
Distance between the DVM Server(s) and data sources
This section gathers details about interconnectivity, security, site-level HA requirements, and specific characteristics of existing and planned IBM systems that could participate in data virtualization.
Questions
- What is the distance between the IBM Z systems in a Sysplex that participate in data virtualization?
- If there are non-local data sources, what is the distance between the DVM server and the target database server?
- What is the distance between database servers planned to support HA for the data virtualization environment?
- What is the current cross-site network bandwidth between these servers today and is that shared with multiple applications or dedicated?
(insufficient network bandwidth can lead to an increase in latency, increasing both response and execution time)
- What type of network configuration is in place between existing systems targeted for data virtualization and are there specific requirements for security, such as SSL?
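The note above about bandwidth and latency can be made concrete with a back-of-the-envelope estimate. The following Python sketch is an assumption-laden approximation rather than a DVM sizing formula: it treats transfer time as payload size divided by effective bandwidth, plus a fixed latency per round trip. The function name, parameters, and sample values are hypothetical.

```python
def estimated_transfer_seconds(payload_mb: float, bandwidth_mbps: float,
                               round_trip_ms: float, round_trips: int = 1) -> float:
    """Rough time to move a result set: wire time plus round-trip latency."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    wire_time = (payload_mb * 8) / bandwidth_mbps      # megabytes -> megabits
    latency = (round_trip_ms / 1000.0) * round_trips   # milliseconds -> seconds
    return wire_time + latency


# Hypothetical comparison: a 500 MB result set over a dedicated 1000 Mbps
# link versus a shared link where only 100 Mbps is effectively available.
dedicated = estimated_transfer_seconds(500, 1000, round_trip_ms=10)
shared = estimated_transfer_seconds(500, 100, round_trip_ms=10)
print(f"dedicated link: {dedicated:.2f} s")
print(f"shared link:    {shared:.2f} s")
```

Real networks add protocol overhead, encryption cost, and contention, so figures like these are only a starting point for the bandwidth discussion.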
Primary use cases
Our data virtualization solution supports a range of use cases, from shared access and centralized management to high availability with disaster recovery support, added through continuous delivery since the initial release. Use cases continue to surface as the data landscape changes in both the market and customer environments. The following are some of the primary use cases in use today.
- Real-time access to disparate data across z/OS and non-z/OS data sources
- Modernize existing mainframe applications to enable web-based and mobile applications for access
- Centralize access with control to drive governance for a trusted view of all enterprise data
- Low-level data integration enables the copying of data by way of SQL-based operations
- Leverage data virtualization as a methodology for application development/incubation of production-level prototypes
- Establish a single View for purposes of driving operational and business analytics
- Forge new business models by employing Machine Learning algorithms and Artificial Intelligence over shared data
Questions
- Do you plan to use data virtualization for workload balancing to offload queries to 1 or more sources?
- Do you anticipate using data virtualization as part of your build/test/deployment for new applications through a logical Data Model targeted for development, test, or quality assurance?
- Do you plan to use data virtualization to reduce business interruption or outages due to planned or unplanned maintenance windows? If so, what are your SLA/SLO ranges across multi-tier business applications?
Physical Storage or Memory
To support data virtualization processing, DVM can use additional physical storage or physical memory to achieve optimal query runtime or write performance. As data transformations, pushdown operations, JOINs, UNIONs, and functions are executed, more resources can be required for both the “pre” and “post” processing of a normal operation.
Questions
- What is the range of daily transactional volume for virtualized data across systems targeted for replication?
- Are you open to leveraging more zIIP processing to help in achieving your performance goals?
- Would you like to have flexibility around the number of active DVM servers processing read/write operations to drive improved parallelism for generalized workloads beyond local processing occurring independently of the DVM Server?
- Would you like this to be configurable within the user interface upon initial configuration or as part of the management of an active data virtualization environment system while in a production state? Is a managed outage or failover to secondary acceptable to perform this operation?
Environment
Hardware Configuration
Specify the memory requirements as Real Storage assigned to an LPAR in gigabytes. If the product is installed on more than two LPARs, specify the largest and smallest configurable memory. Table 1 can be used as a template for capturing your development, test, and production environments.
| LPAR | Model | No. of GPPs | No. of zIIPs | Memory (GB) | Comments |
|---|---|---|---|---|---|
Table 1. Hardware configuration
z/OS Environment
Specify support software programs targeted for use with data virtualization. Add more configurations by using Table 2.
| Configuration | Y/N | Details |
|---|---|---|
| Security | ||
| ENQ Manager | ||
| VSAM RLS | ||
| Innovation IAM | ||
| CDC for Db2 | ||
| CDC for VSAM | ||
| z/OS Connect Enterprise Edition | ||
| CICS | ||
| IBM Cloud Pak for Data | ||
| Other(s) | ||
Table 2. z/OS environment information
Access to Data Sources
Client connections to DVM for z/OS
Specify applications or client tools accessing the DVM server for SQL-based read and write operations by using Table 3.
| Product | Product Version | OS | OS Version | REST | JDBC | ODBC |
|---|---|---|---|---|---|---|
| IBM DataStage | ||||||
| IBM Query Management Facility | ||||||
| Informatica PowerCenter | ||||||
| Microsoft Excel | ||||||
| Microsoft Power BI | ||||||
| Tableau | ||||||
| Other(s) | ||||||
Table 3. Client connection information
Data Sources
Specify the data sources that need to be virtualized and available to the mainframe, Java, ETL, analytic, and reporting tools by using Table 4.
| Product / Version | Version | Platform | Data Source | Workload | Platform |
|---|---|---|---|---|---|
| Adabas | | | Oracle Cloud | | |
| Db2 distributed | | | Netezza | | |
| Db2 for z/OS | | | Postgres | | |
| Hadoop (Apache, HDP, CDH, GP) | | | SQL Server | | |
| IDMS | | | Azure | | |
| IMS | | | AWS Redshift | | |
| MongoDB | | | Sequential Files | | |
| MQ Series | | | SMF | | |
| MySQL | | | SYSLOG | | |
| Oracle | | | Teradata | | |
| Oracle Exadata | | | VSAM or CICS/VSAM | | |
Table 4. Data sources
Application Workloads
Workloads are usually separated into online workloads and batch workloads. Online workloads can be further divided by line of business. An Active/Active workload has a stricter definition. The active workload is a business-related definition. It is the aggregation of these items:
- Software is a user-written application and the middleware runtime environment
- Data is a set of related objects that have transactional consistency maintained, and optionally, referential integrity constraints preserved
- Network connectivity is one or more TCP/IP addresses or hostnames and ports (for example, 10.10.10.1:80)
This definition is intended to preserve the transaction consistency of the data. Data Virtualization supports all CRUD operations supported by the underlying data sources. Support for underlying data sources includes data type mapping, function mapping, and optimized transformations into Virtual Tables and Virtual Views. The DVM solution receives client requests and performs costing calculations and parsing operations to optimize the round-trip response and execution time for a workload.
Applications can select subsets of data or a complete data set through a subset of Virtual Tables or Virtual Views within or across schemas. Processed workloads can have multiple query plans that include transform, pushdown, and JOIN operations, each with its own daily frequency and volume. These applications are critical to business operations, and this solution works to optimize query execution and response time for requesting applications.
Questions
- What types of application workloads are active in your environment (batch, transactional, ISPF, and so on)?
- Do you have a maintenance window for deploying application upgrades to the production system during which you take the server offline?
- Do you perform DDL operations for specific workloads or have write-intensive applications?
- How frequently do you delete data (and how much)? Truncate? Entire table? Subset with a where clause? How do you typically load data or perform bulk operations? (for example, external table)
- Do you leverage indexes or primary keys for your data models?
Complete the table for the best representation of critical workload types. A sample entry is provided in the first row of Table 5.
| Workload Name | Volume of Data | Transaction Rate | Workload Characteristics | Response Objectives | Execution Objectives | Description |
|---|---|---|---|---|---|---|
| Workload1 {sample} | Daily volume: MB/TB; % Annual Growth | 24-hr period; 8-hr workday; Batch Load | | | | |
| Workload2 | ||||||
| Workload3 | ||||||
| Workload4 | ||||||
| Other(s) | ||||||
Table 5. Workload information
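Table 5 asks for daily volume, annual growth, and transaction rate over a 24-hour period or an 8-hour workday. The following Python sketch shows one way to turn those inputs into planning figures, assuming simple compound growth and an evenly spread transaction load. The function names and sample values are hypothetical and are not part of DVM.

```python
def projected_daily_volume(current_volume: float, annual_growth_pct: float,
                           years: int) -> float:
    """Compound the current daily volume by a fixed annual growth percentage."""
    return current_volume * (1 + annual_growth_pct / 100.0) ** years


def average_rate_per_second(transactions: int, window_hours: float) -> float:
    """Average transactions per second if the load is spread evenly over the window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return transactions / (window_hours * 3600.0)


# Hypothetical sample values for a single workload row.
daily_volume_gb = 200.0
growth_pct = 15.0
daily_transactions = 1_000_000

for year in range(4):
    volume = projected_daily_volume(daily_volume_gb, growth_pct, year)
    print(f"year {year}: {volume:.1f} GB/day")

for hours in (24, 8):
    rate = average_rate_per_second(daily_transactions, hours)
    print(f"{hours}-hr window: {rate:.2f} transactions/s")
```

The same daily transaction count implies a rate three times higher when it is concentrated in an 8-hour workday, which is why the table asks for the window as well as the volume. Real workloads are rarely evenly spread, so peak rates should be captured separately where they are known.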
Product Synonym
DVM
Document Information
Modified date:
15 July 2022
UID
ibm16447764