Data Build Tool (dbt) integration
IBM® watsonx.data integrates with Data Build Tool (dbt), a data analytics tool that helps transform the data in watsonx.data into a simpler, more accessible form for business users. It allows analysts and scientists to build data pipelines by using different models and to produce curated data for decision making. You can run SQL queries by using dbt and analyze the data available in watsonx.data.
dbt helps analysts and scientists with data-related tasks such as the following:
- Manage complex workflows for data transformation, with support for features like version control, modular code, and continuous integration.
- Prepare data for reporting and analysis by transforming raw data into a structured format, making it easier to create insights.
- Create layered, reusable models that represent different stages of data transformation.
- Ensure reliability of the transformations by identifying issues in the process.
- Generate clear and easy-to-understand documentation for the models and provide visualization of data lineage to track how data moves through the pipeline.
- Handle dependencies between models, ensure that the transformations run in the correct sequence, and integrate with larger data workflows.
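To make the dependency-handling task concrete, the following minimal sketch shows how a run order can be derived from model dependencies. It does not call dbt and is not dbt's implementation; dbt works out this order itself from the references between models. The model names and the `build_order` helper are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical models, each mapped to the set of models it depends on.
MODEL_DEPENDENCIES = {
    "stg_orders": set(),
    "stg_customers": set(),
    "customer_orders": {"stg_orders", "stg_customers"},
    "revenue_report": {"customer_orders"},
}


def build_order(dependencies):
    """Return model names so that each model follows the models it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(MODEL_DEPENDENCIES)
print(order)
```

In this sketch the two staging models come first in either order, followed by `customer_orders` and then `revenue_report`, which mirrors the layered structure described in the list above.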
dbt is supported in watsonx.data for Spark and Presto engines. dbt uses the following adapters to connect dbt Core with the Spark and Presto engines. The adapters help to build, test, and document data models.
- dbt-watsonx-presto to connect to Presto
- dbt-watsonx-spark to connect to Apache Spark
Basic dbt commands
The following are some basic dbt commands that you can use with both dbt-watsonx-presto and dbt-watsonx-spark:
- Initialize a dbt project: Set up a new dbt project.
  dbt init my_project
- Debug dbt connection: Test your dbt profile and connection.
  dbt debug
- Seed data: Load seed data into your database/datasource.
  dbt seed
- Run dbt models: Build and run your models.
  dbt run
- Test dbt models: Run tests on your models.
  dbt test
- Generate documentation: Create and serve documentation for your dbt project.
  dbt docs generate
  dbt docs serve
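The commands above are often run one after another. The following minimal sketch chains them in the order that they are listed and stops at the first failure. It assumes that the `dbt` executable is on the PATH and that it is run from inside an already initialized project, so `dbt init` is left out. `dbt docs serve` is also left out because it starts a local web server and does not return. The names `DBT_STEPS`, `run_dbt_steps`, and the `runner` parameter are hypothetical helpers, not part of dbt.

```python
import subprocess
from typing import Callable, List, Sequence

# Steps in the order they appear in the list above.
DBT_STEPS: List[List[str]] = [
    ["dbt", "debug"],
    ["dbt", "seed"],
    ["dbt", "run"],
    ["dbt", "test"],
    ["dbt", "docs", "generate"],
]


def _run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_dbt_steps(
    steps: Sequence[Sequence[str]] = DBT_STEPS,
    runner: Callable[[Sequence[str]], int] = _run_command,
) -> List[List[str]]:
    """Run each step in order and stop at the first non-zero exit code.

    Returns the steps that completed successfully.
    """
    completed: List[List[str]] = []
    for cmd in steps:
        if runner(cmd) != 0:
            break
        completed.append(list(cmd))
    return completed
```

Calling `run_dbt_steps()` from the project directory runs the five steps in sequence. The `runner` parameter is there so that the sequence can be exercised without a dbt installation, for example by passing a function that only records the commands.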
For more information about dbt commands, see dbt command reference.
The following topics provide more information about the integration process.