Installing and using dbt-watsonx-presto
This section covers the steps to install and use dbt-watsonx-presto.
watsonx.data on IBM Software Hub
watsonx.data Developer edition
Procedure
- Run the following command on your system to install dbt-watsonx-presto:
pip install dbt-watsonx-presto
- Run the following command to verify the dbt version.
dbt --version
- Run the following command to create a dbt project.
dbt init <project_name>
  - Select a Presto number and enter it. For example, for [1] presto, enter 1.
  - If you already have a project with the same name, you must confirm whether to overwrite profiles.yml. Enter Y to confirm or N to discard.
- Set up the profiles.yml file. For more information, see Configuration (setting up your profile).
- To test the connection, run:
cd <project_name>
dbt debug
- Create a CSV file inside the seeds folder to seed the data into watsonx.data. For example:
id,value
1,100
2,200
3,300
4,400
Note: You might encounter errors when executing seed files because dbt cannot handle all the data types based on the data in the connector. To resolve this, you can explicitly define the data types that dbt should use. Go to <project_name>/dbt_project.yml and add:
seeds:
  <project_name>:
    <seed_file_name>:
      +column_types:
        <col_1>: datatype
        <col_2>: datatype
For example:
seeds:
  demo:
    sample:
      +column_types:
        value: VARCHAR(20)
Column names that are specified here must match the column names in the CSV file.
Important: Do not use extra spaces in CSV files for seeding. If you include extra spaces, you must use the same number of spaces when you query that data in the models to avoid errors.
- Run the seeds by using the following command to create a table and insert the data.
cd <project_name>
dbt seed
- In <project_name>/models, you have the models that perform the operations. By default, dbt materializes the models as views. You can create tables or views by using one of the following methods:
  - Specify inside the model (applicable for that model only), setting materialized to either table or view:
{{ config(materialized='table/view') }}
Note: If this statement is commented out by using (--), dbt still uses the configuration. To disable it, remove it entirely or comment it out in Jinja style ({# … #}).
  - Specify in dbt_project.yml (applicable for all models):
models:
  <project_name>:
    <model_folders>:
      +materialized: table/view
Example:
models:
  demo:
    example:
      +materialized: table
Note: Only select statements are supported within models.
Important: The semicolon (;) character is restricted in models.
- Run the models by using the following command to create the tables or views.
cd <project_name>
dbt run
You can also specify the tests that you want:
models:
  - name: <model_name>
    description: "some description"
    columns:
      - name: <col_name>
        description: "some description"
        data_tests:
          - <test_name_1>
          - <test_name_2>
For example:
models:
  - name: my_first_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        data_tests:
          - unique
          - not_null
Important: Connectors must support Create Table as Select (CTAS) for dbt runs to work.
- To generate the documentation about the actions performed, run:
cd <project_name>
dbt docs generate
dbt docs serve
Note: By default, the documentation server runs on localhost:8080. To change the port, run:
dbt docs serve --port <port_number>
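The seeding step warns against extra spaces in CSV files. As an optional aid, the following minimal Python sketch shows one way to check a seed file for such spaces before you run dbt seed. It is not part of dbt or dbt-watsonx-presto. The function name find_padded_cells and the sample data are hypothetical and used only for illustration.

```python
import csv
import io


def find_padded_cells(csv_text):
    """Return (row_number, column_number, raw_value) for every cell that has
    leading or trailing whitespace. Row and column numbers start at 1.

    Hypothetical helper for illustration; not part of dbt.
    """
    padded = []
    # csv.reader keeps spaces after delimiters by default, so they stay visible here.
    reader = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column_number, cell in enumerate(row, start=1):
            if cell != cell.strip():
                padded.append((row_number, column_number, cell))
    return padded


if __name__ == "__main__":
    # Sample seed content with one extra space in the header row.
    sample = "id, value\n1,100\n2,200\n"
    for row_number, column_number, cell in find_padded_cells(sample):
        print(f"row {row_number}, column {column_number}: {cell!r}")
```

To check a real seed file, read its contents (for example, from a file in the seeds folder) and pass the text to the function. An empty result means that no cell has leading or trailing spaces.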