What's new and changed in Data Virtualization
Data Virtualization updates can include new features, bug fixes, and security updates. Updates are listed in reverse chronological order so that the latest release is at the beginning of the topic.
You can see a list of the new features for the platform and all of the services at What's new in IBM® Cloud Pak for Data.
Installing or upgrading Data Virtualization
Ready to install or upgrade Data Virtualization?
Cloud Pak for Data Version 4.5.3
A new version of Data Virtualization was released in October 2022 with Cloud Pak for Data 4.5.3.
Operand version: 1.8.3
This release includes the following changes:
- New features
The 1.8.3 release of Data Virtualization includes the following features and updates:
- Sharing your virtualized objects is quicker and easier
When you virtualize objects, you can assign the objects to multiple projects or data requests, and you can publish the objects to a catalog, all in one step.
- A Data Virtualization connection is now available in the Platform assets catalog by default
You can add a Data Virtualization connection to projects without manually populating the connection details.
- Customer-reported issues fixed in this release
- Security issues fixed in this release
This release includes fixes for the following security issues:
CVE-2022-1012, CVE-2022-1586, CVE-2022-1652, CVE-2022-1705, CVE-2022-1729, CVE-2022-1962, CVE-2022-1976, CVE-2022-2257, CVE-2022-2380, CVE-2022-21540, CVE-2022-21541, CVE-2022-23648, CVE-2022-23772, CVE-2022-23773, CVE-2022-23806, CVE-2022-24675, CVE-2022-24921, CVE-2022-25313, CVE-2022-25314, CVE-2022-25375, CVE-2022-28131, CVE-2022-28327, CVE-2022-28356, CVE-2022-29154, CVE-2022-29217, CVE-2022-30580, CVE-2022-30629, CVE-2022-30630, CVE-2022-30631, CVE-2022-30632, CVE-2022-30633, CVE-2022-30635, CVE-2022-31030, CVE-2022-32148, CVE-2022-32250, CVE-2022-33068, CVE-2022-33099, CVE-2022-33980, CVE-2022-34169, CVE-2022-34494
CVE-2021-3408, CVE-2021-3583, CVE-2021-3701, CVE-2021-3702, CVE-2021-4041, CVE-2021-29923, CVE-2021-30465, CVE-2021-33098, CVE-2021-33150, CVE-2021-39711, CVE-2021-39715, CVE-2021-40528, CVE-2021-41092, CVE-2021-46822
CVE-2020-1734, CVE-2020-1737, CVE-2020-12401
CVE-2019-16884
CVE-2016-9962
Cloud Pak for Data Version 4.5.1
A new version of Data Virtualization was released in July 2022 with Cloud Pak for Data 4.5.1.
Operand version: 1.8.1
This release includes the following changes.
- New features
The 1.8.1 release of Data Virtualization includes the following features and updates.
- Use Cognos Authentication Method (CAM) credentials to connect to Planning Analytics data sources
- You can now use CAM credentials as an authentication method when you create a connection to a Planning Analytics data source in Data Virtualization. For more information, see IBM Planning Analytics connection.
- Use Watson™ Knowledge Catalog policies to filter rows in virtualized tables
You might have a data source that has tables with government, enterprise, and retail client data combined. For example, a billing table might have data for all the customers, where some of the rows are for government clients and some are for nongovernment clients. The type of the client is not indicated in the billing table. Now, you can filter the list of client records by using one of the following techniques.
- You can use a separate table to identify customers that are government clients. The IDs from this table can be used to filter out rows from the billing table. When you filter out rows, the masked table does not contain the rows with data of government clients.
- You can use a table of blocked customer identifiers as a reference table. Any rows in the billing table whose customer identifier is included in the blocked customer set are filtered out of the result set.
For more information, see Filtering rows in data protection rules.
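The reference-table technique can be illustrated with a minimal Python sketch. It does not show how Watson Knowledge Catalog or Data Virtualization evaluate rules internally; the `BillingRow` type, the `filter_blocked` helper, and the sample data are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillingRow:
    """Hypothetical billing record; note that it carries no client-type column."""
    customer_id: int
    amount: float


def filter_blocked(billing_rows, blocked_ids):
    """Return only the rows whose customer_id is absent from the blocked set."""
    blocked = set(blocked_ids)  # the reference table, reduced to its ID column
    return [row for row in billing_rows if row.customer_id not in blocked]


# Illustrative data: customer 2 stands in for a government client.
billing = [
    BillingRow(1, 120.0),
    BillingRow(2, 75.5),
    BillingRow(3, 310.0),
    BillingRow(2, 20.0),
]
government_ids = [2]

visible = filter_blocked(billing, government_ids)
```

In this sketch, every billing row for customer 2 is dropped from `visible`, even though the billing table itself does not record which customers are government clients.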
- Security fixes
This release includes fixes for the following security issues.
CVE-2022-0778, CVE-2022-26691, CVE-2022-28733, CVE-2022-28734, CVE-2022-28735, CVE-2022-28736, CVE-2022-28737, CVE-2022-29244, CVE-2022-31129
CVE-2021-3695, CVE-2021-3696, CVE-2021-3697, CVE-2021-34429
Cloud Pak for Data Version 4.5.0
A new version of Data Virtualization was released in June 2022 with Cloud Pak for Data 4.5.0.
Operand version: 1.8.0
This release includes the following changes.
- New features
Version 1.8.0 of the Data Virtualization service includes the following features and updates.
- Upgrade to Cloud Pak for Data version 4.5 (Data Virtualization 1.8.0)
- You can upgrade Data Virtualization from the following Cloud Pak for Data versions to Cloud Pak for Data version 4.5.
- Back up and restore Data Virtualization
- You can use the Cloud Pak for Data backup and restore utilities to take frequent online backups of Data Virtualization without sacrificing productivity. Or you can put Cloud Pak for Data in quiesce mode to consistently back up Data Virtualization while your cluster is offline.
- Quickly find and virtualize tables with the Explore tab
- You can now quickly find the tables that you want to virtualize. On the Virtualize page, you can use the Explore tab to browse through databases, schemas, and available tables in a connected data source. The List tab displays all of the available tables in all of your connected data sources. On the Data sources page, you can filter your data sources to quickly load the reduced list of available tables in the List tab.
For more information, see Creating virtual objects in Data Virtualization.
- Improve statistics collection for virtualized tables by using data sampling
Data sampling improves statistics collection by reducing the resources that you need to collect statistics. When you collect statistics by selecting the Remote query collection method in the web client, a default sampling rate of 20% is used. To optimize statistics collection, select Enable table sampling and choose a sampling rate between 1% and 99%.
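The idea behind sampled statistics collection can be sketched in Python as follows. This is an illustration only, not the Data Virtualization implementation: the `sample_rows` and `estimate_stats` helpers are hypothetical, and simple per-row random sampling is an assumption. Only the 20% default and the 1% to 99% range come from the text above.

```python
import random

DEFAULT_SAMPLING_RATE = 20  # percent; the default rate stated above


def sample_rows(rows, rate_percent=DEFAULT_SAMPLING_RATE, seed=None):
    """Keep each row independently with probability rate_percent / 100."""
    if not 1 <= rate_percent <= 99:
        raise ValueError("sampling rate must be between 1 and 99 percent")
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < rate_percent / 100]


def estimate_stats(sample, rate_percent):
    """Estimate table statistics from a sample instead of a full scan."""
    if not sample:
        return {"estimated_rows": 0, "min": None, "max": None}
    return {
        # Scale the sample size back up to approximate the full row count.
        "estimated_rows": round(len(sample) * 100 / rate_percent),
        # Minimum and maximum are taken from the sampled values only.
        "min": min(sample),
        "max": max(sample),
    }


values = list(range(10_000))
sample = sample_rows(values, rate_percent=20, seed=1)
stats = estimate_stats(sample, rate_percent=20)
```

A lower rate reads fewer rows but yields rougher estimates, which is the trade-off behind choosing a sampling rate.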
- Virtualize files with column headers in cloud object storage
- You can now virtualize flat files in cloud object storage that contain column headers.
- Manage access for multiple groups if you are an Admin
- As a Data Virtualization Admin, you can now grant and revoke access for multiple users, groups, and roles at the same time.
For more information, see Managing access to virtual objects in Data Virtualization.
- Filter rows in virtualized data based on data protection rules in Watson Knowledge Catalog
- Data Virtualization supports masking columns in virtualized data based on data protection rules that are defined in Watson Knowledge Catalog. Now, you can create data protection rules to include or exclude rows in your virtualized data to avoid exposing sensitive data.
For more information, see Governing virtual data with data protection rules in Data Virtualization and Designing data protection rules.
- Improve query performance and enforcement of data protection rules
- Data Virtualization now stores and caches data protection rules from Watson Knowledge Catalog in a policy enforcement point cache to avoid evaluating rules every time an object is queried. This cache improves the performance of previously executed queries by reducing the number of calls to Watson Knowledge Catalog to fetch the rules. However, you might notice a delay of up to 10 minutes before newly added or updated data protection rules are applied to queries.
For more information, see Enabling enforcement of data protection rules in Data Virtualization.
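The trade-off between fewer rule lookups and delayed rule updates can be illustrated with a small time-based cache in Python. This is a sketch under assumptions, not the actual policy enforcement point cache: the `RuleCache` class, the `fetch_rules` callback, and the 600-second expiry (chosen to mirror the delay of up to 10 minutes mentioned above) are all hypothetical.

```python
import time


class RuleCache:
    """Cache rules per object and refetch them only after they expire."""

    def __init__(self, fetch_rules, ttl_seconds=600, clock=time.monotonic):
        self._fetch_rules = fetch_rules  # hypothetical call to the rule service
        self._ttl = ttl_seconds          # assumed expiry of 10 minutes
        self._clock = clock              # injectable so the sketch is testable
        self._entries = {}               # object name -> (expiry time, rules)

    def get(self, object_name):
        now = self._clock()
        entry = self._entries.get(object_name)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit: no call to the rule service
        rules = self._fetch_rules(object_name)
        self._entries[object_name] = (now + self._ttl, rules)
        return rules


# Usage with a fake clock and a fake rule service that counts its calls.
calls = []
fake_now = [0.0]


def fake_fetch(object_name):
    calls.append(object_name)
    return [f"rule-v{len(calls)}"]


cache = RuleCache(fake_fetch, ttl_seconds=600, clock=lambda: fake_now[0])
first = cache.get("BILLING")   # fetched from the rule service
fake_now[0] = 599.0
second = cache.get("BILLING")  # served from the cache, possibly stale
fake_now[0] = 600.0
third = cache.get("BILLING")   # expired, so the rules are fetched again
```

Repeated queries within the expiry window reuse the cached rules, which is why an updated rule might not take effect until the entry expires.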
- Manage metadata for Data Virtualization assets with metadata enrichment
- Metadata enrichment helps you find data faster, trust your data, and protect your data. Metadata includes terms that define the meaning of the data, rules that document ownership, and quality standards.
For more information, see Managing metadata enrichment.
- Support for predicate pushdown on more data sources
- Predicate pushdown is an optimization that reduces query times and memory usage. The following data sources now support pushdown of predicates: MySQL (MySQL Community Edition and MySQL Enterprise Edition), Cloudera Impala, and Data Virtualization Manager for z/OS®.
The following enhanced pushdown capabilities have also been implemented on more SQL patterns to improve query performance.
- SQL statements with LIKE predicates are now pushed down for: Db2®, SAP HANA, Oracle, PostgreSQL, Apache Hive, MySQL, Microsoft SQL Server, Snowflake, Netezza® Performance Server, and Teradata.
- SQL statements with Fetch clauses are now pushed down for: Db2, Db2 for z/OS, Apache Derby, Oracle, Amazon Redshift, Google BigQuery, and Salesforce.com data sources.
- SQL statements with a string comparison filter are now pushed down for: Db2, Microsoft SQL Server, Teradata, Netezza Performance Server, and Apache Derby data sources.
- SQL statements with OLAP functions are now pushed down for: Db2 and Netezza Performance Server data sources.
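To show why pushdown reduces query time and memory usage, the following Python sketch compares filtering after transfer with filtering at the source. It uses an in-memory SQLite database as a stand-in for a remote data source, which is an assumption made only for illustration. The table, the data, and the helper names are hypothetical, and the sketch does not reflect how Data Virtualization plans queries.

```python
import sqlite3


def make_source():
    """Create a stand-in 'remote' data source with a small customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme Corp"), (2, "Globex"), (3, "Acme Labs"), (4, "Initech")],
    )
    return conn


def without_pushdown(conn, prefix):
    """Fetch every row, then apply the filter locally."""
    rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    matches = [row for row in rows if row[1].startswith(prefix)]
    return matches, len(rows)  # second value: rows transferred from the source


def with_pushdown(conn, prefix):
    """Send the LIKE predicate to the source so that only matches come back."""
    # LIKE semantics vary by engine; SQLite's LIKE ignores ASCII case by default.
    rows = conn.execute(
        "SELECT id, name FROM customers WHERE name LIKE ? ORDER BY id",
        (prefix + "%",),
    ).fetchall()
    return rows, len(rows)


conn = make_source()
local_matches, rows_moved_all = without_pushdown(conn, "Acme")
pushed_matches, rows_moved_few = with_pushdown(conn, "Acme")
```

Both helpers return the same matching rows for this data, but the pushdown variant transfers only the matches instead of the whole table.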
- Customer-reported issues fixed in this release
- DT127089: Data Virtualization fails to connect to MS SQL Server with an INSTANCE name
- DT128265: Duplicate Virtualized Asset
- DT129875: dv-extension-translations-job has 2 same label: job_name and job-name
- DT130521: Data Virtualization Showing Incorrect Number of Users on Instance
- DT130572: User Must Have at Least One Prior Sign-In to be Granted Permissions in Data Virtualization
- Security fixes
This release includes fixes for the following security issues:
CVE-2022-1154, CVE-2022-21426, CVE-2022-21434, CVE-2022-21443, CVE-2022-21476, CVE-2022-21496, CVE-2022-29078
CVE-2021-3634, CVE-2021-3807, CVE-2021-4189, CVE-2021-25219, CVE-2021-41617, CVE-2021-43138, CVE-2021-43818
CVE-2020-19131, CVE-2020-35492
CVE-2018-25032
- Bug fixes
This release includes the following fixes:
- Issue: Persistent volume on Data Virtualization head node becomes full.
Resolution: The persistent volume (PV) on the Data Virtualization head node no longer becomes full because transaction logs in the embedded Db2 database are archived.
- Issue: Minute selector of the cache refresh rate can be incremented beyond maximum and cannot be reset.
Resolution: To set a cache refresh rate, you can select an Hourly frequency and then choose the minute of the hour when the cache refresh is run. You cannot increase this frequency beyond 59 minutes.
- Issue: You must refresh the SSL certificate that is used by Data Virtualization after the Cloud Pak for Data self-signed certificate is updated.
Resolution: The certificate manager regenerates a new certificate and re-creates the secret for that certificate. For more information, see Securing the Data Virtualization environment.