Monitoring Azure Data Factory

Instana offers comprehensive monitoring of your Azure Data Factory, providing end-to-end visibility into your environment. After you install the Instana host agent, the Azure Data Factory sensor is automatically deployed and installed. You can view infrastructure metrics that are related to the Azure Data Factory in the Instana UI.

For more information about other supported Azure services, see the Azure documentation.

Supported versions

Instana supports Azure Data Factory version 2.

Configuring the Azure Data Factory sensor

To monitor your Azure Data Factory, first enable the Azure sensor in the agent configuration file <agentinstall_dir>/etc/instana/configuration.yaml as follows. For more information, see Installation.

com.instana.plugin.azure:
  enabled: true
  subscription: "[Your-Subscription-Id]"
  tenant: "[Your-Tenant-Id]"
  principals:
    - id: "[Your-Service-Principal-Account-Id]"
      secret: "[Your-Service-Principal-Secret]"

To configure the Azure Data Factory sensor through the agent configuration file <agentinstall_dir>/etc/instana/configuration.yaml, use the following configuration:

com.instana.plugin.azure.datafactory:
  enabled: false # Valid values: true, false. Enabled (true) by default 
  include_tags: # Comma separated list of tags in key:value format (e.g. env:prod,env:staging)
  exclude_tags: # Comma separated list of tags in key:value format (e.g. env:dev,env:test)
  include_resource_groups: # Comma separated list of resource groups (e.g. rg_prod,rg_staging)
  exclude_resource_groups: # Comma separated list of resource groups (e.g. rg_dev,rg_test)

You can disable the Azure Data Factory sensor, or filter the services that it discovers by tags and resource groups.

Disabling the sensor

To disable monitoring of the Azure Data Factory services, use the following configuration:

com.instana.plugin.azure.datafactory:
  enabled: false

Filtering by defining tags and resource groups

You can define multiple tags and resource groups, separated by commas. Each tag must be defined as a key-value pair separated by a colon (:). Define the tags and resource groups that you want to include in discovery or exclude from discovery. If a tag or resource group appears in both the include list and the exclude list, the exclude list takes priority. If you don't want to filter services, leave these settings undefined. You don't need to define all values to enable filtering.
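As a minimal sketch of the priority rule, the following configuration lists the same tag in both the include list and the exclude list. The tag values (env:prod, env:staging) are hypothetical and used for illustration only; the option names are the ones described on this page.

```yaml
com.instana.plugin.azure.datafactory:
  # Hypothetical tag values, for illustration only
  include_tags: "env:prod,env:staging"
  # env:staging is in both lists, so the exclude entry takes priority
  exclude_tags: "env:staging"
```

With this configuration, factories tagged env:staging are expected to be left out of discovery, even though the same tag also appears in the include list.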

To include services in discovery by tags, use the following configuration:

com.instana.plugin.azure.datafactory:
  include_tags: # Comma separated list of tags in key:value format (e.g. env:prod,env:staging)

To exclude services by tags from discovery, use the following configuration:

com.instana.plugin.azure.datafactory:
  exclude_tags: # Comma separated list of tags in key:value format (e.g. env:dev,env:test)

To include services in discovery by resource groups, use the following configuration:

com.instana.plugin.azure.datafactory:
  include_resource_groups: # Comma separated list of resource groups (e.g. rg_prod,rg_staging)

To exclude services by resource groups from discovery, use the following configuration:

com.instana.plugin.azure.datafactory:
  exclude_resource_groups: # Comma separated list of resource groups (e.g. rg_dev,rg_test)

Discovery filtering can also be configured at the global level for all Azure services. Filters that you define for the Azure Data Factory service override the global filters. For more information about the global discovery filtering of Azure services, see Azure Configuration.
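The following sketch shows how a service-level filter might sit alongside a global one. It assumes that the global Azure sensor accepts the same filter option names (for example, include_tags) as the Azure Data Factory sensor; confirm the exact option names in Azure Configuration before you use them. The tag values are hypothetical.

```yaml
com.instana.plugin.azure:
  enabled: true
  # subscription, tenant, and principals omitted for brevity
  # Assumed global filter option: applies to all Azure services
  include_tags: "env:prod"

com.instana.plugin.azure.datafactory:
  # Service-level filter: overrides the global filter for Azure Data Factory
  include_tags: "env:prod,env:staging"
```

Under these assumptions, Azure Data Factory discovery follows the service-level list, while other Azure services continue to use the global filter.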

Viewing metrics

To view the metrics, complete the following steps:

  1. In the sidebar of the Instana UI, select Infrastructure.
  2. Click a specific monitored host.

You can see a host dashboard with all the collected metrics and monitored processes.

Metrics are pulled every minute, which is the resolution that Azure provides for the monitoring of these services.

Configuration data

| Factory details | Description |
|---|---|
| Name | Factory name |
| Resource Group | Resource group of the factory |
| Location | Factory location |
| Subscription Id | Subscription ID of the factory |
| Type | Type of the resource |
| State | State of the factory |
| Version | Version of the factory |
| List of pipelines | List of all factory pipelines |

Performance metrics

| Metric | Name | Unit | Aggregation | Description |
|---|---|---|---|---|
| Pipelines Succeeded Count | pipelineSucceededRuns | Count | Total | The total number of pipeline runs that succeeded within a minute window. |
| Pipelines Succeeded Percentage | pipelineSucceededRunsPercentage | Percent | Average | The percentage of succeeded pipeline runs within a minute window. |
| Pipelines Failed Count | pipelineFailedRuns | Count | Total | The total number of pipeline runs that failed within a minute window. |
| Pipelines Failed Percentage | pipelineFailedRunsPercentage | Percent | Average | The percentage of failed pipeline runs within a minute window. |
| Pipelines Canceled Count | pipelineCancelledRuns | Count | Total | The total number of pipeline runs that were canceled within a minute window. |
| Pipelines Canceled Percentage | pipelineCancelledRunsPercentage | Percent | Average | The percentage of canceled pipeline runs within a minute window. |
| Pipelines Total Count | pipelineTotalRuns | Count | Total | The total number of pipeline runs calculated as a sum of succeeded, failed, and canceled runs within a minute window. |
| Pipelines Elapsed Time | pipelineElapsedTimeRuns | Count | Total | The number of times, within a minute window, that a pipeline runs longer than the user-defined expected duration. |
| Activities Succeeded Count | activitySucceededRuns | Count | Total | The total number of activity runs that succeeded within a minute window. |
| Activities Succeeded Percentage | activitySucceededRunsPercentage | Percent | Average | The percentage of succeeded activity runs within a minute window. |
| Activities Failed Count | activityFailedRuns | Count | Total | The total number of activity runs that failed within a minute window. |
| Activities Failed Percentage | activityFailedRunsPercentage | Percent | Average | The percentage of failed activity runs within a minute window. |
| Activities Canceled Count | activityCancelledRuns | Count | Total | The total number of activity runs that were canceled within a minute window. |
| Activities Canceled Percentage | activityCancelledRunsPercentage | Percent | Average | The percentage of canceled activity runs within a minute window. |
| Activities Total Count | activityTotalRuns | Count | Total | The total number of activity runs calculated as a sum of succeeded, failed, and canceled runs within a minute window. |
| Triggers Succeeded Count | triggerSucceededRuns | Count | Total | The total number of trigger runs that succeeded within a minute window. |
| Triggers Succeeded Percentage | triggerSucceededRunsPercentage | Percent | Average | The percentage of succeeded trigger runs within a minute window. |
| Triggers Failed Count | triggerFailedRuns | Count | Total | The total number of trigger runs that failed within a minute window. |
| Triggers Failed Percentage | triggerFailedRunsPercentage | Percent | Average | The percentage of failed trigger runs within a minute window. |
| Triggers Canceled Count | triggerCancelledRuns | Count | Total | The total number of trigger runs that were canceled within a minute window. |
| Triggers Canceled Percentage | triggerCancelledRunsPercentage | Percent | Average | The percentage of canceled trigger runs within a minute window. |
| Triggers Total Count | triggerTotalRuns | Count | Total | The total number of trigger runs calculated as a sum of succeeded, failed, and canceled runs within a minute window. |
| Runtime Memory Available | integrationRuntimeAvailableMemory | Bytes | Total | The total number of bytes of memory available for the self-hosted integration runtime within a minute window. |
| Runtime CPU Percentage | integrationRuntimeCpuPercentage | Percent | Total | The percentage of CPU usage for the self-hosted integration runtime within a minute window. |
| Runtime Queue Length | integrationRuntimeQueueLength | Count | Total | The total queue length for the self-hosted integration runtime within a minute window. |
| Airflow CPU Percentage | airflowIntegrationRuntimeCpuPercentage | Percent | Average | The percentage of CPU usage for the Airflow integration runtime within a minute window. |
| Airflow Memory Percentage | airflowIntegrationRuntimeMemoryPercentage | Percent | Average | The percentage of memory available for the Airflow integration runtime within a minute window. |
| Airflow Tasks Running | airflowIntegrationRuntimeSchedulerTasksRunning | Count | Total | The total number of running scheduled tasks in the Airflow integration runtime executor within a minute window. |
| Airflow Tasks Queued | airflowIntegrationRuntimeExecutorQueuedTasks | Count | Total | The total number of queued scheduled tasks in the Airflow integration runtime executor within a minute window. |
| Airflow Operators Succeeded | airflowIntegrationRuntimeOperatorSuccesses | Count | Total | The total number of Airflow integration runtime operator successes within a minute window. |
| Airflow Operators Failed | airflowIntegrationRuntimeOperatorFailures | Count | Total | The total number of Airflow integration runtime operator failures within a minute window. |
| Airflow Triggers Succeeded | airflowIntegrationRuntimeTriggersSucceeded | Count | Total | The total number of Airflow integration runtime triggers that succeeded within a minute window. |
| Airflow Triggers Running | airflowIntegrationRuntimeTriggersRunning | Count | Total | The total number of Airflow integration runtime triggers that are running within a minute window. |
| Airflow Triggers Failed | airflowIntegrationRuntimeTriggersFailed | Count | Total | The total number of Airflow integration runtime triggers that failed within a minute window. |
| Airflow Jobs Succeeded | airflowIntegrationRuntimeJobStart | Count | Total | The total number of Airflow integration runtime jobs that succeeded within a minute window. |
| Airflow Jobs Running | airflowIntegrationRuntimeJobEnd | Count | Total | The total number of Airflow integration runtime jobs that are running within a minute window. |
| Airflow Jobs Failed | airflowIntegrationRuntimeJobHeartbeatFailure | Count | Total | The total number of Airflow integration runtime job heartbeat failures within a minute window. |
| Airflow DAG Processing Last Duration | airflowIntegrationRuntimeDAGProcessingLastDuration | Milliseconds | Average | The average duration of the last DAG processing in the Airflow integration runtime within a minute window. |