Overview of IBM watsonx.data Premium

IBM watsonx.data Premium is a hybrid, generative AI data lakehouse designed to power AI and analytics across complex, distributed data environments. It provides capabilities to unlock insights from both structured and unstructured data from diverse sources. It supports a variety of workloads and integrates with IBM’s AI ecosystem, which suits enterprises that aim to scale AI and analytics initiatives. It serves the different personas in the data management, analytics, and AI lifecycle:

  • Data engineers can use it to store, query, and analyze data.
  • Data scientists can extract insights from the data for informed business decisions.
  • Data stewards can ensure that all data governance and quality requirements are addressed.
  • AI app developers can use curated and enriched data to build models and AI applications.

The following illustration shows the IBM watsonx.data Premium components.

  IBM watsonx.data Premium components

Platform architecture

The watsonx.data Premium experience is part of the IBM watsonx platform. Within an IBM Cloud account, multiple integrated experiences on the IBM watsonx platform share services and workspaces. An experience provides focused access to the tools for specific tasks. The IBM watsonx platform includes these integrated experiences:

  • watsonx.data Premium, which contains watsonx.data intelligence, IBM watsonx.data integration, watsonx.ai Studio, and Watson Machine Learning for data management, analytics, and AI across distributed data environments.
  • watsonx, which contains the watsonx.ai Studio, Watson Machine Learning, and IBM watsonx.governance services for building and governing AI solutions.
  • Data Fabric, which contains the watsonx.data intelligence service for preparing and sharing high-quality, trusted data products and the watsonx.data integration service for transforming, integrating, and observing data.

watsonx.data Premium

watsonx.data Premium is purpose-built to handle structured and unstructured data at scale, which makes it a strong foundation for AI workloads. Key capabilities of watsonx.data Premium include:
  • Hybrid deployment options – Software, SaaS, Customer VPC (Bring your own cloud (BYOC)), and Developer edition.
  • AI-ready architecture – Native support for unstructured and structured data processing, open-source vector database integration (Milvus), first-class integration with the watsonx.ai portfolio, and automated data ingestion and curation capabilities.
  • Multi-engine support – Presto (Java, C++), Spark (Java, C++), and Milvus, plus integration with Db2 Warehouse and Netezza.
  • Data source connectivity – Connectors for both structured and unstructured data sources.
  • Unified governance and metadata – Integration with IBM Knowledge Catalog (IKC) and Apache Ranger, end-to-end access control, and access control list (ACL) support for unstructured data.
  • Open and flexible – Support for open-source formats (Iceberg), separation of compute, metadata, and storage, and coexistence of open-source and proprietary tools.
  • Optimized for performance and cost – Object storage across hybrid and multi-cloud environments, reduced data duplication and storage costs, and high-performance query engines for large-scale analytics.

Usage resources

Depending on your service plans, you might have a set amount of usage resources per month, or you might be billed for the resources that you consume. When you run tools on watsonx.data Premium, you consume the following types of resources:

Compute usage: When you run jobs or deployments, your compute resource usage is calculated from the rate for the runtime environment and the length of time that the environment is active. Compute resources include the appropriate hardware and software that are specific to the workload. Compute usage is measured in capacity unit hours (CUH).

Text extraction: When you use text extraction to convert document files into an AI model-friendly JSON file format, you are charged per page.
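The two metering rules above can be sketched as simple arithmetic. The following Python example is only an illustration: the `ComputeRun` record, the rates, and the per-page price are hypothetical placeholders, not values or types from an IBM API or price list. It assumes that compute usage is the environment rate multiplied by active hours, and that text extraction is a flat price multiplied by the page count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeRun:
    """One job or deployment run (hypothetical record, not an IBM API type)."""
    cuh_rate: float      # assumed: capacity units consumed per active hour
    active_hours: float  # how long the runtime environment was active


def compute_usage_cuh(runs):
    """Total compute usage in capacity unit hours: rate x active duration."""
    return sum(run.cuh_rate * run.active_hours for run in runs)


def text_extraction_charge(pages, price_per_page):
    """Per-page charge for text extraction; the price is a placeholder."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * price_per_page


# Placeholder numbers chosen only to make the arithmetic easy to follow.
runs = [
    ComputeRun(cuh_rate=2.0, active_hours=1.5),  # 3.0 CUH
    ComputeRun(cuh_rate=0.5, active_hours=4.0),  # 2.0 CUH
]
total_cuh = compute_usage_cuh(runs)
extraction_cost = text_extraction_charge(pages=120, price_per_page=0.25)

print(f"compute usage: {total_cuh} CUH")
print(f"text extraction charge: {extraction_cost}")
```

The actual rates for each runtime environment and the per-page price depend on your service plan, so treat the numbers here as stand-ins.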

Shared functionality

watsonx includes the following functionality that is shared between services and experiences for secure and scalable collaboration:

  • Connectivity
  • Administration
  • Storage
  • Workspaces

Connectivity

You can create connections to remote data sources and import connected data. You can configure connections with personal or shared credentials. For a list of supported connectors, see Connectors.

You can share connections with others across the platform in the Platform assets catalog.

Administration

The following administration features provide security and flexibility:

Software and hardware: watsonx.data Premium is fully managed on IBM Cloud. Software updates are automatic. Scaling of usage resources and storage is automatic.

Security, compliance, and isolation: The data security, network security, security standards compliance, and isolation of watsonx.data Premium are managed by IBM Cloud. Data is encrypted at rest and in motion. You can set up extra security and encryption options.

Your work on watsonx.data Premium, including your data and the models that you create, is private to your account. Your data and models will never be accessible or used by IBM or any other person or organization.

Services provisioning: You can add services from the IBM Cloud services catalog. You access some services in other experiences. For example, if you add the watsonx.data intelligence service, you must switch to the Data Fabric experience to use it. You can add data source services and create connections to them from watsonx experiences, but you manage data source services from the IBM Cloud console.

User management: You add users and user groups and manage their IBM Cloud account roles and permissions with IBM Cloud Identity and Access Management. You assign roles within each collaborative workspace across the platform.

Storage

Lite plan: When you sign up for a watsonx.data Premium Lite plan, an IBM Cloud Object Storage service instance is automatically provisioned to provide storage for the assets that you create or add to workspaces. Information that is stored in IBM Cloud Object Storage is encrypted and resilient. Each workspace has its own dedicated bucket. See Object storage for workspaces. For information about the Lite plan, see Lite plan.

Enterprise plan: When you sign up for a watsonx.data Premium Enterprise plan, you must provision your own IBM Cloud Object Storage service instance. For information about the Enterprise plan, see Enterprise plan.

Workspaces

watsonx.data Premium is organized as a set of collaborative workspaces where you can work with your team or organization. Each workspace has a set of members with roles that provide permissions to perform actions.
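As a rough mental model of how workspace membership works, the following Python sketch maps each member to a role and each role to a set of permitted actions. The role names, action names, and the `Workspace` class are assumptions made for illustration; they are not the platform's actual roles or API.

```python
from dataclasses import dataclass, field

# Assumed role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "Admin": {"view", "edit", "manage_members"},
    "Editor": {"view", "edit"},
    "Viewer": {"view"},
}


@dataclass
class Workspace:
    """Hypothetical model of a collaborative workspace and its members."""
    name: str
    members: dict = field(default_factory=dict)  # user -> role name

    def add_member(self, user, role):
        """Add a user with a role, rejecting roles that are not defined."""
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self.members[user] = role

    def can(self, user, action):
        """Return True if the user's role in this workspace allows the action."""
        role = self.members.get(user)
        return role is not None and action in ROLE_PERMISSIONS[role]


project = Workspace("sales-analytics")  # hypothetical project name
project.add_member("ana", "Admin")
project.add_member("raj", "Viewer")

print(project.can("ana", "manage_members"))
print(project.can("raj", "edit"))
```

Because roles are assigned per workspace, the same user can hold different roles in different workspaces.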

Most users work with assets, which are items that are created or added to workspaces by users. Assets can represent data, models, or other types of code or information. Data assets contain metadata that represents data. Assets that you create in tools, such as models, run code to work with data. You can also create assets that contain information about other assets, such as model use cases that contain metadata, history, and reports about models. See Asset types and properties.

In the watsonx.data as a Service experience, you can work in the following types of workspaces:

  • Projects
  • Platform connections

You can search for assets across all workspaces that you belong to.

Projects are where your data engineers, data scientists, data stewards, and AI app developers work with data. In a project, they create data assets and unstructured data flows, submit Spark applications, and work with prompts to get insights into the data.

Your projects are shared across the integrated experiences. However, you can view and run only those assets that are valid in the current experience. The following image shows what the Overview page of a project might look like.

Overview page for a project

Platform connections is a view of the Platform assets catalog that lists connection assets. You can access platform connections in any project or deployment space. The Platform assets catalog is shared across integrated experiences. The following image shows what the Connections page of Platform connections might look like.

Connections page for Platform connections