Project settings for data quality rules
For consistent setup of data quality rules, you can configure default settings that can be applied to any data quality rule in the project.
These project settings are available if the data quality feature is enabled for watsonx.data intelligence.
Required permissions
To configure data quality default settings, you must have the Admin role in the project. Any project collaborator can view the settings.
To access the default settings, go to the project's Manage page and select Tools > Data quality.
Output tables
For consistent setup of output tables for data quality rules, you can define a default configuration.
Output type and location
Define a new output table or select an existing table to write rule output to. For supported database types, see Supported data sources for curation and data quality.
When you define a new table, the table name can be a user-defined name, a parameter for dynamically creating a name, a combination of user-defined name and parameter, or a combination of parameters.
User-defined table names must follow this convention:
- The first character of the name must be an alphabetic character.
- The rest of the name can consist of alphabetic characters, numeric characters, or underscores.
- The name must not contain spaces.
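The naming convention above can be checked before a configuration is saved. The following is a minimal sketch, not part of the product: the function name is hypothetical, and it assumes that "alphabetic" means ASCII letters only, which the product might interpret more broadly.

```python
import re

# One leading letter, followed by any number of letters, digits, or underscores.
# Assumption: only ASCII letters count as alphabetic characters.
TABLE_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_table_name(name: str) -> bool:
    """Return True if a user-defined table name follows the convention."""
    # fullmatch rejects spaces and any other character outside the pattern.
    return TABLE_NAME_PATTERN.fullmatch(name) is not None
```

With this sketch, a name such as dq_output_1 passes, while 1table (leading digit) and dq output (contains a space) are rejected.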
For dynamic name creation, you can use these parameters:
- #execution_id#
- #rule_id#
- #rule_name#
- #project_id#
- #job_id#
- #job_run_id#
For the parameters with changing values, a new table might be created:
- For #job_run_id#, for each rule run
- For #execution_id#, if the rule is run from the data quality rules UI or through an API call
When you configure the table name, keep in mind that target databases might have length limits for table names. Individual parameters or a combination of parameters can generate table names that might exceed the allowed length. For example, a rule name can have up to 256 characters, but the target database might not support names of that length.
Also, you must make sure that the output table names are unique in the data source. Dynamically created names in particular can't be checked for name collisions in advance.
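To estimate whether a name pattern can exceed a database limit, you can expand the pattern with representative values before you save it. The following is an illustrative sketch only: the function names, the sample values, and the limit of 128 characters are assumptions, and the product's own substitution logic might differ.

```python
import re

# Parameter names that can be used for dynamic name creation.
KNOWN_PARAMETERS = {
    "execution_id", "rule_id", "rule_name",
    "project_id", "job_id", "job_run_id",
}

# A parameter is written as a name enclosed in hash characters, e.g. #rule_id#.
PARAMETER_PATTERN = re.compile(r"#([a-z_]+)#")


def expand_table_name(template: str, values: dict[str, str]) -> str:
    """Replace each #parameter# in the template with its sample value."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in KNOWN_PARAMETERS:
            raise ValueError(f"Unknown parameter: #{key}#")
        return values[key]

    return PARAMETER_PATTERN.sub(substitute, template)


def fits_limit(name: str, max_length: int) -> bool:
    """Check an expanded name against the target database's length limit."""
    return len(name) <= max_length


# Hypothetical sample values; real IDs are typically longer.
name = expand_table_name(
    "dq_#rule_name#_#job_run_id#",
    {"rule_name": "check_age", "job_run_id": "42"},
)
```

In this sketch, name is dq_check_age_42. Testing the same pattern with a rule name of the maximum length of 256 characters shows whether the result still fits the limit of your target database.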
Additionally, you can select these options:
- Create table only when issues are found
  This option prevents empty tables from being created when a rule doesn't produce output records. However, if a table with that name already exists because it was generated for an earlier rule run, that table remains unchanged.
- Import generated output table as project asset
  To enable easy access to the rule output, add new rule output tables as data assets to the project. Instead of running a database query, you can view the data by opening the data asset from the Assets page in your project or from the rule's run history.
  This option is enabled by default.
To make this configuration available for use in data quality rules, save it. The configured table is then shown as Current.
You can update this configuration at any time. These updates are then applied to new rules and new runs of existing rules that are configured to inherit project settings.
Explain data quality rules with AI
This feature is available in deployments where generative AI is enabled. For more information, see Preparing to install IBM watsonx.data intelligence in the IBM Software Hub documentation.
Use AI to generate plain-English descriptions and explanations for data quality rules and the rule expressions that they use. These descriptions help users understand the purpose of a rule without having to understand complex SQL statements or technical language.
This capability is turned on by default, but takes effect only if the project is enabled for use of generative AI in watsonx.data intelligence.
Generated descriptions are automatically updated whenever the rule logic (expressions, bindings, SQL statements) changes.
You can disable generation of rule and rule expression descriptions at any time. However, if you already have AI-generated descriptions, these descriptions are no longer automatically updated for changes to rule expressions or bindings.
For projects that were created before this capability was available or if you reenable the option, you can automatically update the existing data quality rules with AI-generated descriptions and expression explanations. However, if you manually added or updated descriptions, these descriptions remain unchanged.