Understanding data contract fields
Technical and data
The technical and data field is automatically populated based on the details of the items that are added to the data product. It describes the schema information of the assets in the data product. The schema supports both a business representation of your data and a physical implementation.
| Field Name | Description |
|---|---|
| Name | The name of the item or data asset. |
| Physical name | The physical name of the item. |
| Logical Type | The logical type of the item (for example: Table). |
| Column name | The name of the property in a schema. |
| Data type | The data type of the property (for example: string). |
| Quality Rules | The producer can add data quality rules, as described in Define data quality rules. |
Define data quality rules
Producers can define data quality rules by clicking the “data quality” icon. Data quality rules can be added at the schema level and at the property level. Data Product Hub supports the following three types of data quality rules:
- Text: Text that describes the quality of the data.
- Custom: Quality attributes that are vendor-specific, such as Soda, Great Expectations, dbt tests, or Monte Carlo monitors.
- SQL: A single SQL query that returns either a numeric or a Boolean value for evaluation. This query forms the basis for checking data quality conditions against a specified threshold.
Creating SQL rules
You have two options for creating SQL rules:
- Manual Entry: Write the SQL query manually based on your data quality requirement in the “query” field.
- Generative AI Assistance: Generate the SQL query automatically by entering your requirement in natural language. The system uses the IBM Text2SQL component to convert the description into an executable SQL statement.
Note: Generative AI assistance is only available when Data Product Hub is onboarded to Generative AI. For more details, see How to onboard Data Product Hub to generative AI.
Defining thresholds
Each SQL-based data quality rule can include:
- A threshold operator selected from the dropdown.
- A threshold value entered by the user.
When the contract is used for data contract testing, the SQL query result is compared against the threshold value by using the selected operator.
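
The comparison of a query result against a threshold can be sketched in Python as follows. This is only an illustration of the idea, not the Data Product Hub implementation: the operator labels, the `evaluate_sql_rule` helper, and the `orders` table are assumptions made for the example, and SQLite stands in for the actual data source.

```python
import operator
import sqlite3

# Hypothetical mapping from operator labels to comparisons; the labels
# offered in the product's dropdown may differ.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def evaluate_sql_rule(conn, query, threshold_operator, threshold_value):
    """Run a single-value SQL query and compare the result to the threshold."""
    row = conn.execute(query).fetchone()
    if row is None or len(row) != 1:
        raise ValueError("The query must return exactly one value")
    compare = OPERATORS[threshold_operator]
    return compare(row[0], threshold_value)


# Illustrative in-memory table; the table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, None), (3, 12), (4, 13)],
)

# Rule: the number of rows with a missing customer_id must be at most 1.
null_count_query = "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL"
rule_passed = evaluate_sql_rule(conn, null_count_query, "<=", 1)
print(rule_passed)
```

In this sketch the query returns a numeric value (the count of rows with a missing `customer_id`), and the rule passes only if that value satisfies the chosen operator and threshold.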
Infrastructure and servers
The servers element describes where the data protected by this data contract is physically located.
| Field Name | Description |
|---|---|
| Server Identifier | A unique, friendly name for this specific data connection instance (for example: my-postgres-prod). |
| Description | A detailed explanation of what this server connection provides. |
| Environment | The deployment environment for this server (for example: Production, Staging, Development). |
| Type | The underlying technology that hosts the data (for example: postgres, snowflake, s3, kafka). |
| Custom Properties | Allows defining nonstandard, tooling-specific, or organization-specific server properties. |
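
As a rough illustration of how these fields fit together, the following Python sketch models a server entry and checks that server identifiers are unique. The `ServerEntry` class, its attribute names, and the `find_duplicate_identifiers` helper are hypothetical and only mirror the table above; they are not the storage format that Data Product Hub uses.

```python
from dataclasses import dataclass, field


@dataclass
class ServerEntry:
    """One server entry; attribute names are hypothetical and mirror the table."""

    server_identifier: str
    type: str
    environment: str = ""
    description: str = ""
    custom_properties: dict = field(default_factory=dict)


def find_duplicate_identifiers(servers):
    """Return identifiers that appear more than once, because each should be unique."""
    seen, duplicates = set(), set()
    for server in servers:
        if server.server_identifier in seen:
            duplicates.add(server.server_identifier)
        seen.add(server.server_identifier)
    return sorted(duplicates)


# Example values are illustrative only.
servers = [
    ServerEntry(
        server_identifier="my-postgres-prod",
        type="postgres",
        environment="Production",
        description="Primary database for order data",
    ),
    ServerEntry(
        server_identifier="my-postgres-dev",
        type="postgres",
        environment="Development",
    ),
]
duplicates = find_duplicate_identifiers(servers)
print(duplicates)
```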
Team
The team section lists team members and the history of their relation with this data contract.
| Field Name | Description |
|---|---|
| Username | A list of individuals or teams that are involved with the contract. |
| Role | The role of the team member, which defines their access level (for example: admin, editor, read, write). |
Pricing
The pricing section describes the price that applies when you bill your customers for using this data product.
| Field Name | Description |
|---|---|
| Amount | The numerical value of the price (for example: 9.95). |
| Currency | The currency in which the price is stated (for example: USD). |
| Unit | The unit of measure the price is based on (for example: megabyte, per hour). |
Service-level agreement (SLA)
Use the object.element notation to indicate the element to run the checks on. For example, tab1.txn_ref_dt refers to the column that is returned by SELECT txn_ref_dt FROM tab1.
Separate multiple object.element entries with commas. For example: table1.col1, table2.col1, table1.col2.
If only one object is in the contract, the object name is not required.
| Field Name | Description |
|---|---|
| Default Element | The element, in element path notation, to run the checks on. |
| Property | The specific guarantee being defined (for example: latency, retention, frequency). |
| Value | Agreement value for the property. |
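
The object.element notation for the default element can be illustrated with a short Python sketch that splits a comma-separated value into object and element pairs. The `parse_default_elements` function and its `single_object` parameter are assumptions made for this example; they are not part of Data Product Hub.

```python
def parse_default_elements(value, single_object=None):
    """Split a comma-separated element list into (object, element) pairs.

    `single_object` is the name of the only object in the contract. It is
    used when an entry omits the object name.
    """
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if "." in entry:
            obj, element = entry.split(".", 1)
        elif single_object is not None:
            obj, element = single_object, entry
        else:
            raise ValueError(f"Object name is required for element {entry!r}")
        pairs.append((obj, element))
    return pairs


elements = parse_default_elements("table1.col1, table2.col1, table1.col2")
print(elements)
# [('table1', 'col1'), ('table2', 'col1'), ('table1', 'col2')]

# With a single object in the contract, the object name can be omitted.
single = parse_default_elements("txn_ref_dt", single_object="tab1")
print(single)
# [('tab1', 'txn_ref_dt')]
```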
Support and communication
Support and communication channels help consumers find help regarding their use of the data contract.
| Field Name | Description |
|---|---|
| Channel Name | The name or address of the support channel (for example: #product-help, datacontract-ann). |
| Channel URL | The direct link to access the support channel (for example: a Slack URL or mailto: address). |
Custom and other properties
This section lists the custom properties that a user can add to a data contract.
| Field Name | Description |
|---|---|
| Custom Properties - Property | The name of the nonstandard, tooling-specific, or organization-specific attribute (for example: refRulesetName). |
| Custom Properties - Value | The data value associated with the custom property (for example: gcsc.ruleset.name). |