Azure Data Factory
With Databand, you can track the execution of your Azure Data Factory (ADF) pipelines. Tracking is done by a monitor that scans your ADF factories every few seconds and reports metadata that is collected from runs of published pipelines (debug runs are excluded). With this metadata, you can enable powerful alerting to notify your data team about the health of your pipeline runs.
Collected metadata
Databand monitors the following metadata types:
Pipeline metadata
- Pipeline state and duration
- Activity state and duration
- Pipeline and activity source code
- Activity input and output JSON
Dataset metadata
- Paths, schemas, and record counts for all reads and writes
- Various metrics that are calculated by ADF, such as `copyDuration`, `throughput`, `usedDataIntegrationUnits`, and more
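
Databand's monitor gathers this metadata automatically, so no code is required on your side. Purely as an illustration of where the metadata lives in ADF, the following sketch uses the Azure SDK for Python (azure-identity and azure-mgmt-datafactory) to query pipeline and activity runs; the activity output JSON is where metrics such as `copyDuration` and `usedDataIntegrationUnits` appear. All credentials and resource names are placeholders.

```python
# Illustration only: Databand's monitor performs this collection for you.
# All tenant, subscription, and factory values below are placeholders.
from datetime import datetime, timedelta, timezone

from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)
adf = DataFactoryManagementClient(credential, "<subscription-id>")

# Look at runs that were updated in the last hour
window = RunFilterParameters(
    last_updated_after=datetime.now(timezone.utc) - timedelta(hours=1),
    last_updated_before=datetime.now(timezone.utc),
)

runs = adf.pipeline_runs.query_by_factory("<resource-group>", "<factory-name>", window)
for run in runs.value:
    # Pipeline state and duration
    print(run.pipeline_name, run.status, run.duration_in_ms)

    activities = adf.activity_runs.query_by_pipeline_run(
        "<resource-group>", "<factory-name>", run.run_id, window
    )
    for activity in activities.value:
        # Activity state plus the output JSON that carries metrics such as
        # copyDuration, throughput, and usedDataIntegrationUnits
        print(" ", activity.activity_name, activity.status, activity.output)
```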
Alerting capabilities
Check the table to see what alerting functions are supported.
Table 1. Alerting functions that are supported by the ADF integration.

| Alert type | Supported | Notes |
|---|---|---|
| Pipelines | | |
| Pipeline state | Yes | |
| Pipeline duration | Yes | |
| dbt test | Yes | |
| Schema change | Yes | |
| Task state | Yes | |
| Task duration | Yes | |
| Custom task metric | Yes | Databand collects various metrics from ADF, such as `copyDuration`, `throughput`, `usedDataIntegrationUnits`, and more. You can alert on any of these metrics. |
| Datasets | | |
| Missing operation | Yes | |
| Tables data quality | Yes | |
| Operations data quality | Yes | |
| Data delay | Yes | |
Integrating ADF with Databand
Integrating ADF with Databand consists of the following steps:
You can also edit an ADF integration in Databand.
Prerequisites
Before you begin the integration process, you need to have:
- An active Azure subscription
- Access to the Azure portal
- Permissions to manage Azure resources, including the ability to register an app with Microsoft Entra ID
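
To confirm these prerequisites, you can optionally check that the app you registered with Microsoft Entra ID can authenticate and read your data factory. The following sketch uses the Azure SDK for Python with placeholder values; it is not a required part of the integration.

```python
# Optional preflight check with placeholder values: verifies that the
# registered app (service principal) can authenticate and read the factory.
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<application-client-id>",
    client_secret="<client-secret>",
)
adf = DataFactoryManagementClient(credential, "<subscription-id>")

# Succeeds only if the service principal can read the data factory
factory = adf.factories.get("<resource-group>", "<factory-name>")
print(factory.name, factory.location)
```

If the call fails with an authorization error, review the role assignment for the service principal on the data factory before you continue.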
Known issues
- Databand monitors only runs of pipelines that were published and triggered either manually or on a schedule. Debug runs that are triggered through the authoring interface are not monitored.
- Currently, calculating operational lineage across your ADF pipelines is supported only for the following connectors:
  - Amazon Redshift
  - Amazon S3
  - Azure Data Lake Storage Gen2
  - Snowflake

  Additional connectors are onboarded over the integration's lifecycle. If you have an immediate need for a specific connector in operational lineage calculations, contact your IBM account representative.
- For Databand to fully monitor data flow operations, the data flow activity in your pipeline must have the Logging level option set to Verbose, as shown in the sketch after this list.
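
If you keep your factory definitions in a Git repository or ARM export, one quick way to spot data flow activities that are not set to Verbose is to scan the pipeline JSON. The helper below is a hypothetical example: the export folder layout is an assumption, and the Verbose level is assumed to be stored as the traceLevel value "Fine" in the activity's typeProperties, so verify both against your own export.

```python
# Hypothetical helper: flag Execute Data Flow activities whose logging level
# is not Verbose. Assumes a Git/ARM export layout and that the UI's "Verbose"
# setting is stored as traceLevel "Fine"; verify both in your own export.
import json
from pathlib import Path

PIPELINES_DIR = Path("factory_export/pipeline")  # assumed export location


def data_flow_activities(activities):
    """Yield Execute Data Flow activities, recursing into typeProperties.activities."""
    for activity in activities:
        if activity.get("type") == "ExecuteDataFlow":
            yield activity
        # Covers container activities that nest under typeProperties.activities
        nested = activity.get("typeProperties", {}).get("activities", [])
        yield from data_flow_activities(nested)


for path in PIPELINES_DIR.glob("*.json"):
    pipeline = json.loads(path.read_text())
    activities = pipeline.get("properties", {}).get("activities", [])
    for activity in data_flow_activities(activities):
        trace_level = activity.get("typeProperties", {}).get("traceLevel")
        if trace_level != "Fine":
            print(
                f"{path.name}: activity '{activity['name']}' has "
                f"traceLevel={trace_level!r}; set the Logging level to Verbose."
            )
```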