Understanding data contract fields

Technical and data

The technical and data field is populated automatically from the details of the items that are added to the data product. It describes the schema of the assets in the data product. The schema supports both a business representation of your data and its physical implementation.

Field Name | Description
Name | The name of the item or data asset.
Physical name | The physical name of the item.
Logical Type | The logical type of the item (for example: table).
Column name | The name of the property in a schema.
Data type | The data type of the property (for example: string).
Quality Rules | Producers can add data quality rules, as described in Define data quality rules.
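To make the relationship between these fields concrete, the following Python sketch models one schema item and its properties. It is only an illustration: the class names, attribute names, and sample values (such as crm_customers_v2) are hypothetical and are not part of Data Product Hub.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaProperty:
    """One property (column) of a schema item. Names here are hypothetical."""
    column_name: str   # "Column name" field
    data_type: str     # "Data type" field, for example "string"
    quality_rules: List[str] = field(default_factory=list)  # property-level rules


@dataclass
class SchemaItem:
    """One item or data asset in the data product. Names here are hypothetical."""
    name: str            # "Name" field (business representation)
    physical_name: str   # "Physical name" field (physical implementation)
    logical_type: str    # "Logical Type" field, for example "table"
    properties: List[SchemaProperty] = field(default_factory=list)
    quality_rules: List[str] = field(default_factory=list)  # schema-level rules


# Illustrative values only.
customers = SchemaItem(
    name="Customers",
    physical_name="crm_customers_v2",
    logical_type="table",
    properties=[
        SchemaProperty("customer_id", "integer"),
        SchemaProperty("email", "string", ["Email must not be empty."]),
    ],
)
print([p.column_name for p in customers.properties])
```

The sketch keeps quality rules on both the item and the property because rules can be attached at either level.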

Define data quality rules

Producers can define data quality rules by clicking the “data quality” icon. Data quality rules can be added at the schema level and at the property level. Data Product Hub supports the following three types of data quality rules:

  • Text: Text that describes the quality of the data.
  • Custom: Quality attributes that are vendor-specific, such as Soda, Great Expectations, dbt tests, or Monte Carlo monitors.
  • SQL: A single SQL query that returns either a numeric or a Boolean value. The query result is the basis for checking a data quality condition against a specified threshold.

Creating SQL Rules

You have two options for creating SQL rules:

  1. Manual Entry: Write the SQL query manually based on your data quality requirement in the “query” field.
  2. Generative AI Assistance: Alternatively, you can generate SQL queries automatically by entering your requirement in natural language. The system uses the IBM Text2SQL component to convert the description into an executable SQL statement.

Note: Generative AI assistance is only available when Data Product Hub is onboarded to Generative AI. For more details, see How to onboard Data Product Hub to generative AI.

Defining Thresholds

Each SQL-based data quality rule can include:

  • A threshold operator selected from the dropdown.
  • A threshold value entered by the user.

When the contract is used for data contract testing, the SQL query result is compared against the threshold value by using the selected operator.
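The threshold comparison can be sketched in Python as follows. This is a minimal illustration and not Data Product Hub's implementation: the operator labels, the evaluate_sql_rule helper, and the orders table are assumptions made for the example, and an in-memory SQLite database stands in for the real data source.

```python
import operator
import sqlite3

# Hypothetical operator labels; the values in the product dropdown may differ.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def evaluate_sql_rule(conn, query, threshold_operator, threshold_value):
    """Run a single-value SQL query and compare the result with the threshold."""
    row = conn.execute(query).fetchone()
    if row is None or len(row) != 1:
        raise ValueError("The rule query must return exactly one value.")
    return OPERATORS[threshold_operator](row[0], threshold_value)


# Illustrative data: three orders, one of which has no customer_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, None), (3, 12)],
)

# Rule: at most one row may have a missing customer_id.
null_count_query = "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL"
rule_passed = evaluate_sql_rule(conn, null_count_query, "<=", 1)
print(rule_passed)
```

In this sketch the query returns a count, so the rule passes when the count satisfies the operator and threshold. A query that returns a Boolean value could be compared in the same way with the "=" operator.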

Infrastructure and servers

The servers element describes where the data that is protected by this data contract is physically located.

Field Name | Description
Server Identifier | A unique, friendly name for this specific data connection instance (for example: my-postgres-prod).
Description | A detailed explanation of what this server connection provides.
Environment | The deployment environment for this server (for example: Production, Staging, Development).
Type | The underlying technology that hosts the data (for example: postgres, snowflake, s3, kafka).
Custom Properties | Nonstandard, tooling-specific, or organization-specific server properties.

Team

The team section lists the team members and the history of their relationship with this data contract.

Field Name | Description
Username | A list of individuals or teams that are involved with the contract.
Role | The role of each team member, which determines their access level (for example: admin, editor, read, write).

Pricing

The pricing section describes the price that you bill your customers for using this data product.

Field Name | Description
Amount | The numerical value of the price (for example: 9.95).
Currency | The currency in which the price is stated (for example: USD).
Unit | The unit of measure the price is based on (for example: megabyte, per hour).

Service-level agreement (SLA)

Use the object.element notation to indicate the element to run the checks on. For example, tab1.txn_ref_dt refers to the column that is returned by SELECT txn_ref_dt FROM tab1.

Separate multiple object.element values with commas. For example: table1.col1, table2.col1, table1.col2.

If only one object is in the contract, the object name is not required.

Field Name | Description
Default Element | The element, in element path notation, to run the checks on.
Property | The specific guarantee being defined (for example: latency, retention, frequency).
Value | The agreed value for the property.
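The element path notation described in this section can be parsed with a few lines of Python. The sketch below is an assumption about how a tool might read the value, not the product's own parser. The parse_element_paths function and its default_object parameter are hypothetical names introduced for the example.

```python
from typing import List, Optional, Tuple


def parse_element_paths(
    value: str, default_object: Optional[str] = None
) -> List[Tuple[Optional[str], str]]:
    """Split a comma-separated element list into (object, element) pairs."""
    pairs = []
    for raw in value.split(","):
        path = raw.strip()
        if not path:
            continue  # ignore empty entries such as a trailing comma
        if "." in path:
            # Split on the last dot so the element is always the final segment.
            obj, element = path.rsplit(".", 1)
        else:
            # No object prefix: allowed when the contract has only one object.
            obj, element = default_object, path
        pairs.append((obj, element))
    return pairs


print(parse_element_paths("table1.col1, table2.col1, table1.col2"))
print(parse_element_paths("txn_ref_dt", default_object="tab1"))
```

The second call shows the single-object case, where the object name is omitted from the path and supplied separately.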

Support and communication

Support and communication channels tell consumers where to get help with their use of the data contract.

Field Name | Description
Channel Name | The name or address of the support channel (for example: #product-help, datacontract-ann).
Channel URL | The direct link to access the support channel (for example: a Slack URL or mailto: address).

Custom and other properties

Users can add custom properties to a data contract to capture information that the standard fields do not cover.

Field Name | Description
Custom Properties - Property | The name of the nonstandard, tooling-specific, or organization-specific attribute (for example: refRulesetName).
Custom Properties - Value | The data value associated with the custom property (for example: gcsc.ruleset.name).

Learn more

Managing data contracts