Data Virtualization on Cloud Pak for Data

Version 1.5.0


Data Virtualization integrates data sources across multiple types and locations and presents all of this data as one logical data view. This virtual data lake makes it easier to get value out of your data.


After you create connections to your data sources, you can quickly view data from across your organization. This virtual data platform enables real-time analytics without data movement, duplication, ETL processes, or additional storage, which shortens processing times. You can deliver real-time results to decision-making applications or analysts more quickly and dependably than with methods that don’t use virtualization.


Centralized authentication and authorization are enforced so that platform users access data sources in a trusted environment. The Data Virtualization Admin, Data Virtualization Engineer, Data Virtualization Steward, and Data Virtualization User roles provide granular access management for the virtualized assets. Cloud Pak for Data users who need to use Data Virtualization functions must be assigned specific Data Virtualization roles based on their job description.

All communication between the environment and the application is encrypted by using IBM technology and standard SSL/TLS protocols.

Platform support

Data Virtualization supports standard SQL queries through common interfaces, including R, Spark, Python, and Jupyter Notebooks. Queries are also supported by the most common analytics tools, including IBM Watson Studio and Cognos Analytics.
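
As an illustration of the Python interface mentioned above, the following minimal sketch runs a standard SQL query through a Python DB-API 2.0 connection and returns the rows as dictionaries. It makes several assumptions: the helper name fetch_rows, the sales table, and its columns are hypothetical, and an in-memory sqlite3 database stands in for a Data Virtualization connection so that the example runs anywhere. With a real instance, you would obtain the connection from a DB-API-compatible driver for your environment, and the parameter marker style might differ from the one that sqlite3 uses.

```python
import sqlite3


def fetch_rows(conn, sql, params=()):
    """Run a SQL query on a DB-API 2.0 connection; return rows as dicts."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        # cursor.description holds one entry per result column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


# Stand-in for a virtualized table (hypothetical name and columns).
# An in-memory SQLite database is used only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 120.0), ("APAC", 80.0), ("EMEA", 30.0)],
)

# A standard SQL aggregate, as an application or notebook might issue it.
totals = fetch_rows(
    conn,
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region",
)
```

Because the helper depends only on the DB-API cursor interface, the same function could be reused from a script or a Jupyter notebook cell after the stand-in connection is replaced with a real one.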