Frequently asked questions
Find answers to frequently asked questions about IBM® watsonx.data as a Service on Amazon Web Services (AWS).
General
- What is IBM watsonx.data?
-
IBM watsonx.data is an open, hybrid, and governed fit-for-purpose data store optimized to scale all data, analytics, and AI workloads to get greater value from your analytics ecosystem. It is a data management solution for collecting, storing, querying, and analyzing all your enterprise data (structured, semi-structured, and unstructured) with a single unified data platform. It provides a flexible and reliable platform that is optimized to work on open data formats.
- What can I do with IBM watsonx.data?
-
You can use IBM watsonx.data to collect, store, query, and analyze all your enterprise data with a single unified data platform. You can connect to data in multiple locations and get started in minutes with built-in governance, security, and automation. You can use multiple query engines to run analytics and AI workloads, reducing your data warehouse costs by up to 50%.
- What are the key features of IBM watsonx.data?
-
The key features of IBM watsonx.data are:
- An architecture that fully separates compute, metadata, and storage to offer ultimate flexibility.
- Multiple engines such as Presto and Spark that provide fast, reliable, and efficient processing of big data at scale.
- Open formats for analytic data sets, allowing different engines to access and share the data at the same time.
- Data sharing between IBM watsonx.data, Db2 Warehouse, and Netezza Performance Server or any other data management solution through common Iceberg table format support, connectors, and a shareable metadata store.
- Built-in governance that is compatible with existing solutions, including IBM Knowledge Catalog.
- Cost-effective, simple object storage that is available across hybrid-cloud and multicloud environments.
- Integration with a robust ecosystem of IBM’s best-in-class solutions and third-party services to enable easy development and deployment of key use cases.
- Which data formats are supported in IBM watsonx.data?
-
The following data formats are supported in IBM watsonx.data:
- Ingestion: Data ingestion in IBM watsonx.data supports .CSV and .Parquet data file formats.
- Create Table: Create table in IBM watsonx.data supports .CSV, .Parquet, .JSON, and .TXT data file formats.
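The format rules above can be expressed as a small lookup, for example to check a file before choosing between ingestion and table creation. This is only a sketch: `SUPPORTED_FORMATS` and `is_supported` are hypothetical names for illustration, not part of any IBM watsonx.data API, and the extension sets mirror the list above.

```python
from pathlib import Path

# Extensions per operation, copied from the list above (illustrative only;
# these names are hypothetical and not part of any watsonx.data API).
SUPPORTED_FORMATS = {
    "ingestion": {".csv", ".parquet"},
    "create_table": {".csv", ".parquet", ".json", ".txt"},
}


def is_supported(filename: str, operation: str) -> bool:
    """Return True if the file's extension is listed for the operation."""
    extension = Path(filename).suffix.lower()  # compare case-insensitively
    return extension in SUPPORTED_FORMATS[operation]
```

For instance, `is_supported("events.json", "ingestion")` is false under these rules, while the same file passes for `"create_table"`.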
- What is the maximum size of the default IBM managed bucket?
-
The IBM-managed bucket is a default 10 GB bucket.
Presto
- What is Presto?
-
Presto is a distributed SQL query engine, with the capability to query vast data sets located in different data sources, thus solving data problems at scale.
- What are the Presto server types?
-
A Presto installation includes three server types: Coordinator, Worker, and Resource manager.
- What SQL statements are supported in IBM watsonx.data?
-
For information on supported SQL statements, see Supported SQL statements.
Metastore
- What is Hive Metastore (HMS)?
-
Hive Metastore (HMS) is a service that stores metadata that is related to Presto and other services in a backend Relational Database Management System (RDBMS) or Hadoop Distributed File System (HDFS).
Setup
- How can I configure an engine?
-
From the IBM watsonx.data web console, go to Infrastructure manager to configure an engine. For more information, see Provisioning a Presto engine.
- How can I configure a catalog or metastore?
-
To configure a catalog with an engine, see Associating a catalog with an engine.
- How can I configure a bucket?
-
From the IBM watsonx.data web console, go to Infrastructure manager to configure a bucket. For more information, see Adding a bucket-catalog pair.
Access
- How can I manage IAM access for IBM watsonx.data?
-
Controlling access to the engines and other components is a critical requirement for many enterprises. To ensure that the resource usage is under control, IBM watsonx.data provides the ability to manage access controls on these resources. A user with admin privileges on the resources can grant access to other users.
Note: To add users, delete users, or manage access of users to a SaaS account, see the Managing SaaS accounts section of Getting started with the IBM SaaS Console with accounts.
Presto Engine
- How can I create an engine?
-
To create an engine, see Provisioning a Presto engine.
- How can I delete an engine?
-
To delete an engine, see Deleting an engine.
- How can I run SQL queries?
-
You can use the Query workspace interface in IBM watsonx.data to run SQL queries and scripts against your data. For more information, see Running SQL queries.
Databases and Connectors
- How can I add a database?
-
To add a database, see Adding a database-catalog pair.
- How can I remove a database?
-
To remove a database, see Deleting a database-catalog pair.
- What data sources does IBM watsonx.data currently support?
-
IBM watsonx.data currently supports the following data sources:
- IBM Db2
- IBM Netezza
- Apache Kafka
- MongoDB
- MySQL
- PostgreSQL
- SQL Server
- Custom
- Teradata
- SAP HANA
- Elasticsearch
- SingleStore
- Snowflake
- IBM Data Virtualization Manager for z/OS
- How can I load the data into IBM watsonx.data?
-
You can load the data into IBM watsonx.data in the following ways:
- Web console: You can use the Ingestion jobs tab from the Data manager page to securely and easily load data into IBM watsonx.data. For more information, see Ingesting data by using web console.
- Command Line Interface: You can load data into IBM watsonx.data through the CLI. For more information, see Ingesting data by using command line interface (CLI).
- Creating tables: You can load or ingest local data files to create tables by using the CREATE TABLE option. For more information, see Creating tables.
- How can I create tables?
-
You can create tables through the Data manager page by using the web console. For more information, see Creating tables.
- How can I create a schema?
-
You can create a schema through the Data manager page by using the web console. For more information, see Creating schemas.
- How can I query the loaded data?
-
You can use the Query workspace interface in IBM watsonx.data to run SQL queries and scripts against your data. For more information, see Running SQL queries.
Ingestion
- What are the storage bucket options available?
-
The storage bucket options available are IBM Storage Ceph, IBM Cloud Object Storage (COS), AWS S3, and MinIO object storage.
- What type of data files can be ingested?
-
Only .Parquet and .CSV data files can be ingested.
- Can a folder of multiple files be ingested together?
-
Yes, a folder of multiple data files can be ingested. An S3 folder must be created with data files in it for ingesting. The source folder must contain either all Parquet files or all CSV files. For detailed information on S3 folder creation, see Preparing for ingesting data.
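The all-Parquet-or-all-CSV rule can be checked before a job is submitted. The following is a minimal sketch that assumes you already have the list of object keys in the source folder; `folder_file_type` is a hypothetical helper for illustration, not a function of the ibm-lh tool or of any watsonx.data API.

```python
from pathlib import PurePosixPath


def folder_file_type(object_keys):
    """Return "parquet" or "csv" if every key has that extension, else None.

    Hypothetical pre-check: object_keys is any iterable of key names,
    such as the entries of an S3 folder that you listed yourself.
    """
    extensions = {PurePosixPath(key).suffix.lower() for key in object_keys}
    if extensions == {".parquet"}:
        return "parquet"
    if extensions == {".csv"}:
        return "csv"
    return None  # empty folder, mixed types, or unsupported file types
```

A result of `None` suggests that the folder needs to be split by file type before ingestion.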
- What commands are supported in the command-line interface during ingestion?
-
For commands supported in the command-line interface during ingestion, see Options and parameters supported in ibm-lh tool.
Pricing plan
- What are the available pricing plans for IBM watsonx.data as a Service?
- Enterprise plan is the only available plan for watsonx.data as a Service.
- What is included in the enterprise plan?
- The following are the key features:
- Ability to pause and resume Presto engine.
- Ability to connect to an IBM Cloud-provided Cloud Object Storage (COS) bucket and provide credentials to your own COS or S3 bucket.
- Ability to delete Presto, Milvus, and connections to your own bucket.
- You pay by the hour for each infrastructure resource that you add. Start with support services, then build the engines and services that you want. The hourly rate is computed in Resource Units, which map to your payment method, either ‘Pay as You Go’ or ‘Subscription’.
- Presto and external Spark engine and Milvus service.
- Hive metastore and Iceberg catalog.
- Infrastructure manager and query editor.
- Db2 Warehouse and Netezza integration.
- Ability to scale (increase and decrease) node sizes for Presto engines.
- What are the different payment plans under the enterprise plan?
- The payment plans under the enterprise plan are ‘Subscription’ and ‘Pay as You Go’.
- Is the cost for services like Milvus included in the enterprise plan?
- Yes, the Milvus service is included in the enterprise plan.