wx-data commands and usage
The wx-data command has several subcommands that you can use to perform
operations specific to watsonx.data. This topic lists the commands with a
brief description of the tasks that each one performs.
watsonx.data on IBM Software Hub
watsonx.data on IBM Cloud®
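The wx-data command performs operations such as ingesting data and managing
engines, storage, and data sources in watsonx.data.
    ./cpdctl wx-data [command] [options]
The wx-data command supports the following commands: ingestion, engine, bucket, database, sparkjob, tablemaint, service, component, and access-control.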
How to use wx-data command --help (-h)
- To list all the commands in the wx-data plugin:
    ./cpdctl wx-data --help
- To get details of all options and their descriptions for a specific command in the wx-data plugin:
    ./cpdctl wx-data [command] --help
  For example:
    ./cpdctl wx-data ingestion -h
    NAME:
      ingestion - Commands for Ingestion resource.
    USAGE:
      cpdctl wx-data ingestion [action]
    COMMANDS:
      list      List ingestion jobs.
      create    Create an ingestion job.
      get       Get ingestion job details.
    GLOBAL OPTIONS:
      --cpd-config string   Configuration file path
      --cpdconfig string    [Deprecated] Use --cpd-config instead
      -h, --help            Show help
      --profile string      Name of the configuration profile to use
      -q, --quiet           Suppresses verbose messages.
      --raw-output          If set to true, single values in JSON output mode are not surrounded by quotes
    Use "cpdctl wx-data ingestion service-command --help" for more information about a command.
- To get the details of all available options and arguments in the wx-data commands to execute an operation:
    ./cpdctl wx-data [command] [options] --help
- To use the wx-data plugin to execute an operation:
    ./cpdctl wx-data [command] [options]
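The help patterns above can be combined into a small loop that prints the help text for every command covered in this topic. This is a minimal sketch, not an official script: the `CPDCTL_BIN` and `DRY_RUN` variables and the `wx_help` function are illustrative names that are not part of cpdctl, and the script defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch: show the help text for each wx-data command listed in this topic.
# CPDCTL_BIN and DRY_RUN are hypothetical helper variables, not cpdctl settings.
CPDCTL_BIN="${CPDCTL_BIN:-./cpdctl}"   # assumed location of the cpdctl binary
DRY_RUN="${DRY_RUN:-1}"                # 1 = print the command only, 0 = run it

wx_help() {
  # $1 is a wx-data command name, for example "ingestion".
  if [ "$DRY_RUN" = "1" ]; then
    echo "$CPDCTL_BIN wx-data $1 --help"
  else
    "$CPDCTL_BIN" wx-data "$1" --help
  fi
}

for cmd in ingestion engine bucket database sparkjob tablemaint service component access-control; do
  wx_help "$cmd"
done
```

Set `DRY_RUN=0` to run the commands against an installed cpdctl instead of printing them.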
ingestion
The ingestion command runs ingestion operations in watsonx.data.
    ./cpdctl wx-data ingestion [options]
The ingestion command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data ingestion list | Lists the ingestion jobs executed in the watsonx.data instance. |
| ./cpdctl wx-data ingestion create | Create an ingestion job in the watsonx.data instance. |
| ./cpdctl wx-data ingestion get | Get the details of an ingestion job executed in the watsonx.data instance. |
engine
The engine command runs engine-related operations in watsonx.data.
    ./cpdctl wx-data engine [options]
The engine command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data engine list | Lists all the engines available in watsonx.data instance. |
| ./cpdctl wx-data engine create | Create or register an engine in watsonx.data instance. |
| ./cpdctl wx-data engine delete | Delete an engine from watsonx.data instance. |
| ./cpdctl wx-data engine attach | Associate catalogs to a Presto engine in watsonx.data instance. |
| ./cpdctl wx-data engine detach | Disassociate the catalogs associated with a Presto engine in watsonx.data instance. |
- create and delete commands are used only to create and delete a Presto (Java) engine.
- create and delete commands are used to create all available engines in watsonx.data.
bucket
The bucket command runs storage-related operations in watsonx.data.
    ./cpdctl wx-data bucket [options]
The bucket command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data bucket list | Lists all the storages available in the watsonx.data instance. |
| ./cpdctl wx-data bucket create | Register a storage in the watsonx.data instance. The create option supports the use of secrets from an external vault (HashiCorp). Custom S3 storage creation is supported in CPDCTL version v1.8.219 and later. |
| ./cpdctl wx-data bucket get | Get the details of a registered storage in the watsonx.data instance. |
| ./cpdctl wx-data bucket delete | Delete a storage from the watsonx.data instance. |
| ./cpdctl wx-data bucket activate | Activate a storage bucket. Applies to watsonx.data on IBM Cloud instances only. |
| ./cpdctl wx-data bucket list-objects | List all objects in the bucket. |
| ./cpdctl wx-data bucket upload | Upload a file from the local file system to a watsonx.data object storage bucket. |
- When using the list-objects command, buckets with a large number of objects might not list all objects because of API timeouts.
- When using the --paginated parameter with the list-objects command, only top-level objects are listed. Nested objects are not expanded by default.
- Listing the objects by using list-objects is currently not supported in ADLS and GCS buckets.
database
The database command runs data source related operations in watsonx.data.
    ./cpdctl wx-data database [options]
The database command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data database list | Lists all the data sources available in the watsonx.data instance. |
| ./cpdctl wx-data database create | Create or add a data source in the watsonx.data instance. The create option supports the use of secrets from an external vault (HashiCorp). |
| ./cpdctl wx-data database get | Get the details of a registered data source in the watsonx.data instance. |
| ./cpdctl wx-data database delete | Delete a data source from the watsonx.data instance. |
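The list commands of the engine, bucket, and database groups can be run together to get a quick inventory of an instance. The sketch below assumes that the `--profile` global option shown in the ingestion help output is also accepted by these commands; `my-profile` is a hypothetical profile name, and `CPDCTL_BIN`, `WXD_PROFILE`, `DRY_RUN`, and `wx_list` are illustrative names that are not part of cpdctl.

```shell
#!/bin/sh
# Sketch: list engines, storages, and data sources in one pass.
# Assumption: --profile is a global option for these commands as well.
CPDCTL_BIN="${CPDCTL_BIN:-./cpdctl}"         # assumed location of the cpdctl binary
WXD_PROFILE="${WXD_PROFILE:-my-profile}"     # hypothetical configuration profile name
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print the command only, 0 = run it

wx_list() {
  # $1 is one of: engine, bucket, database
  if [ "$DRY_RUN" = "1" ]; then
    echo "$CPDCTL_BIN wx-data $1 list --profile $WXD_PROFILE"
  else
    "$CPDCTL_BIN" wx-data "$1" list --profile "$WXD_PROFILE"
  fi
}

for resource in engine bucket database; do
  echo "== $resource =="
  wx_list "$resource"
done
```

Run `./cpdctl wx-data [command] list --help` first to confirm which options each list command accepts in your CPDCTL version.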
sparkjob
The sparkjob command runs Spark-related operations in watsonx.data, such as
submitting a Spark application, listing all applications, and getting the
status of a Spark application.
    ./cpdctl wx-data sparkjob [options]
The sparkjob command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data sparkjob list | List all applications available in a Spark engine. |
| ./cpdctl wx-data sparkjob create | Submit a Spark application. |
| ./cpdctl wx-data sparkjob get | Get the status of a Spark application. |
For more information about cpdctl in watsonx.data on IBM Software Hub, see Submitting Spark application by using IBM cpdctl.
tablemaint
The tablemaint command runs Iceberg table maintenance operations in watsonx.data.
    ./cpdctl wx-data tablemaint [options]
The tablemaint command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data tablemaint rollback-to-snapshot | Roll back or restore the table to a specific snapshot ID. |
| ./cpdctl wx-data tablemaint rollback-to-timestamp | Roll back a table to the snapshot at a specific timestamp. |
| ./cpdctl wx-data tablemaint set-current-snapshot | Sets the current snapshot ID for a table. |
| ./cpdctl wx-data tablemaint cherrypick-snapshot | Cherry-picks changes from a snapshot into the current table state. Cherry-picking creates a new snapshot from an existing snapshot without altering or removing the original. |
| ./cpdctl wx-data tablemaint expire-snapshot | Remove older snapshots and their files that are no longer needed. |
| ./cpdctl wx-data tablemaint remove-orphan | Remove files that are not referenced in any metadata files of an Iceberg table and can thus be considered "orphaned". |
| ./cpdctl wx-data tablemaint rewrite-data | Rewrites the data files. |
| ./cpdctl wx-data tablemaint rewrite-manifests | Rewrite manifests for a table to optimize scan planning. |
| ./cpdctl wx-data tablemaint register-table | Creates a table. |
- Force : If the value is set to TRUE, the SQL query that you are going to run will not be printed.
- Debug : If the value is set to TRUE, a copy of the Spark application file is stored to your computer.
For more information about cpdctl in watsonx.data on IBM Software Hub, see Spark table maintenance by using IBM cpdctl.
service
The service command runs serviceability-related operations in watsonx.data.
    ./cpdctl wx-data service [options]
The service command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data service list-tables | Lists all table names of Hive or Iceberg connectors in the watsonx.data instance. |
| ./cpdctl wx-data service get-qhmm-config | Get the name of the QHMM-enabled bucket in the watsonx.data instance. |
| ./cpdctl wx-data service monitor | Run stats and QHMM-related queries in the watsonx.data instance. |
| ./cpdctl wx-data service generate-engine-dump | Generate a heap or thread dump for a Presto worker or coordinator in the watsonx.data instance. |
component
The component command gets the configuration and status of various
components in watsonx.data.
    ./cpdctl wx-data component [options]
The component command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data component get-mds-status | Get configuration for Metadata Service (MDS) in watsonx.data instance. |
| ./cpdctl wx-data component get-ces-status | Get CES status in watsonx.data instance. |
| ./cpdctl wx-data component get-cas-cpg-endpoint | Get CPG and CAS endpoints in watsonx.data instance. |
| ./cpdctl wx-data component get-hms-status | List all HMS meta stores in watsonx.data. |
| ./cpdctl wx-data component get-console-status | Check console status of watsonx.data instance. |
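The component commands in the table lend themselves to a simple health sweep. The sketch below runs each one in turn and counts the failures. It assumes that the commands need no additional required options and that cpdctl returns a non-zero exit code on failure, neither of which this topic confirms. The `run_component_checks` function and the `CPDCTL_BIN` variable are illustrative names, and `echo` stands in for the real CLI so that the script can be tried without a watsonx.data instance.

```shell
#!/bin/sh
# Sketch: run every component command once and report how many failed.
# Assumption: a failing cpdctl call exits with a non-zero status.
CPDCTL_BIN="${CPDCTL_BIN:-./cpdctl}"   # assumed location of the cpdctl binary

run_component_checks() {
  # Usage: run_component_checks PREFIX...
  # PREFIX is the command line to put in front of each component command.
  failed=0
  for check in get-mds-status get-ces-status get-cas-cpg-endpoint get-hms-status get-console-status; do
    if "$@" "$check" >/dev/null 2>&1; then
      echo "ok   $check"
    else
      echo "FAIL $check"
      failed=$((failed + 1))
    fi
  done
  echo "$failed check(s) failed"
  # Return success only when every check passed.
  [ "$failed" -eq 0 ]
}

# Dry run: "echo" replaces the real call, so every check reports ok.
run_component_checks echo "$CPDCTL_BIN" wx-data component
```

To run against a real instance, drop the `echo` and call `run_component_checks "$CPDCTL_BIN" wx-data component`.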
access-control
The access-control command manages access policies for resources. It is
available from watsonx.data version 2.2.2 and CPDCTL version 1.8.33.
    ./cpdctl wx-data access-control [options]
The access-control command supports the following commands:
| Options | Description |
|---|---|
| ./cpdctl wx-data access-control list-users-groups | Get users and groups who have access to watsonx.data instance. |
| ./cpdctl wx-data access-control list-access | List resource access policies. |
| ./cpdctl wx-data access-control update-access | Update resource access policies. |
| ./cpdctl wx-data access-control revoke-access | Revoke resource access policies. |
The revoke-access command is supported only from the watsonx.data 2.3.0 release.
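Because some features in this topic depend on a minimum CPDCTL version (1.8.33 for access-control, v1.8.219 for custom S3 storage creation), a script can check the version before it calls them. The following is a minimal sketch of such a check. The `version_ge` function and the `INSTALLED_CPDCTL_VERSION` variable are illustrative names, the default value is only a placeholder, and the comparison relies on `sort -V`, which is available in GNU coreutils and recent BSD and macOS versions of sort. How you obtain the installed version is left out.

```shell
#!/bin/sh
# Sketch: gate feature use on the minimum CPDCTL versions stated in this topic.

version_ge() {
  # Succeeds when version $1 is greater than or equal to version $2.
  # sort -V orders dotted version strings numerically, field by field.
  lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
  [ "$lowest" = "$2" ]
}

ACCESS_CONTROL_MIN="1.8.33"   # minimum CPDCTL version for access-control
CUSTOM_S3_MIN="1.8.219"       # minimum CPDCTL version for custom S3 storage creation

# Placeholder: set INSTALLED_CPDCTL_VERSION to the version that you have installed.
installed="${INSTALLED_CPDCTL_VERSION:-1.8.33}"

if version_ge "$installed" "$ACCESS_CONTROL_MIN"; then
  echo "access-control: available"
else
  echo "access-control: needs CPDCTL $ACCESS_CONTROL_MIN or later"
fi

if version_ge "$installed" "$CUSTOM_S3_MIN"; then
  echo "custom S3 storage: available"
else
  echo "custom S3 storage: needs CPDCTL $CUSTOM_S3_MIN or later"
fi
```

A plain string comparison would rank 1.8.33 above 1.8.219, which is why the sketch uses a version-aware sort.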