Data Virtualization

Data Virtualization integrates data sources across multiple types and locations and presents them as one logical data view. This virtual data lake makes it easier to get value out of your data.

Benefits

By creating connections to your data sources, you can quickly view data from across your organization. This virtual data platform enables real-time analytics without data movement, duplication, ETL jobs, or additional storage requirements, which shortens processing times. As a result, decision-making applications and analysts receive real-time results more quickly and dependably than with existing methods.

After you provision the Data Virtualization add-on, you can manage users, connect to multiple data sources, create and govern virtual assets, and then consume the virtualized data.
Connect, Join, Create Views, and Consume are the main actions in Data Virtualization.
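
The four actions can be illustrated with a minimal local sketch. It does not use the Data Virtualization service or its API: Python's built-in sqlite3 module stands in for the virtualization layer, two attached in-memory databases stand in for separate data sources, and all table, view, and column names (sales_src, crm_src, orders, customers, customer_totals) are hypothetical.

```python
import sqlite3

# Connect: one connection plays the role of the virtualization layer.
# Two attached in-memory databases stand in for independent data sources
# (an assumption made for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales_src")
conn.execute("ATTACH DATABASE ':memory:' AS crm_src")

# Populate the stand-in sources with sample data.
conn.execute("CREATE TABLE sales_src.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm_src.customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sales_src.orders VALUES (?, ?)",
    [(1, 120.0), (2, 75.5), (1, 30.0)],
)
conn.executemany(
    "INSERT INTO crm_src.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# Join and Create Views: a temporary view spans both sources, so the
# underlying rows stay where they are and nothing is copied.
conn.execute(
    """
    CREATE TEMP VIEW customer_totals AS
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM crm_src.customers AS c
    JOIN sales_src.orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    """
)

# Consume: query the view like any single table.
rows = conn.execute(
    "SELECT name, total FROM customer_totals ORDER BY name"
).fetchall()
print(rows)
```

The consumer only sees customer_totals. Which source each column comes from is hidden behind the view, which is the idea behind a single logical data view.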

The Virtualized data menu provides quick access to driver package downloads and connection information. For more information, see Provisioning Data Virtualization.

Security

Centralized authentication and authorization are enforced so that platform users access data sources in a trusted environment. The Data Virtualization Admin, Data Virtualization Engineer, Data Virtualization Steward, and Data Virtualization User roles provide granular access management for the virtualized assets. All communication within the environment, and back to the application, is encrypted with IBM technology and with SSL/TLS by using standard protocols.

By default, IBM Cloud Pak for Data user records are stored in the platform's internal repository database or in an external LDAP server. Users who are authenticated by Cloud Pak for Data have access to Data Virtualization. Cloud Pak for Data users who need to use Data Virtualization functions must be assigned specific Data Virtualization roles based on their job description.

For more information about user authentication, authorization, and roles, see Giving users access to Data Virtualization.

Platform Support

Data Virtualization supports queries in standard SQL through common interfaces, including R, Spark, Python, and Jupyter Notebooks, as well as most common analytics application tools, including IBM Watson Studio (formerly IBM Data Science Experience) and Cognos Analytics.
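
As a rough sketch of what querying with standard SQL from Python can look like, the helper below runs a parameterized query through a Python DB-API 2.0 connection and returns each row as a dictionary. The helper name fetch_rows and the table virtual_orders are hypothetical. The example uses the built-in sqlite3 module as a stand-in, so it runs without a Data Virtualization instance. A real connection would use the driver packages and connection information from your deployment, and the parameter placeholder style can differ between drivers.

```python
import sqlite3


def fetch_rows(conn, sql, params=()):
    """Run a SQL query on a DB-API 2.0 connection and return a list of dicts."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        # cursor.description holds one entry per result column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


# Stand-in data: an in-memory SQLite table plays the role of a virtualized table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE virtual_orders (region TEXT, amount REAL)")
demo.executemany(
    "INSERT INTO virtual_orders VALUES (?, ?)",
    [("EMEA", 10.0), ("APAC", 20.0), ("EMEA", 5.0)],
)

# The "?" placeholder is sqlite3's parameter style (qmark).
result = fetch_rows(
    demo,
    "SELECT region, SUM(amount) AS total FROM virtual_orders "
    "WHERE amount > ? GROUP BY region ORDER BY region",
    (1.0,),
)
print(result)
```

Because the helper relies only on the DB-API cursor interface, the same function could in principle be reused with another DB-API driver by changing how the connection is created.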

For more information, see the Data Virtualization whitepaper.