Integrating with managed Airflow providers

To provide observability over your Airflow DAGs, Databand integrates with self-managed Airflow and with several managed Airflow providers. Before you start the integration, make sure that Apache Airflow can reach the Databand server over the network. Connections are opened from Apache Airflow to the Databand server.

  • Self-managed Airflow
  • Astronomer
  • Amazon Managed Workflows for Apache Airflow (MWAA)
  • Google Cloud Composer
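
As a quick way to check the connectivity requirement above, the following Python sketch attempts a TCP connection to a server URL and reports whether it succeeded. It is meant to be run on the Airflow host. The function name can_reach is illustrative and not part of Databand. A successful TCP connection only shows that the host and port are reachable; it does not confirm that proxies, TLS, or credentials are set up correctly.

```python
import socket
from urllib.parse import urlparse


def can_reach(url: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    if parsed.hostname is None:
        raise ValueError(f"URL has no host name: {url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, calling can_reach("https://databand.example.com") from the Airflow host returns False if the server cannot be reached; the URL shown here is a placeholder for your own Databand server address.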

Instructions for all managed Airflow providers - prerequisites

Before you integrate with managed Airflow providers:

  1. For Airflow 2 and later, disable lazy loading of plugins by setting lazy_load_plugins = False in the [core] section of the airflow.cfg file. Alternatively, set the environment variable AIRFLOW__CORE__LAZY_LOAD_PLUGINS=False. For more information, see the plugins section of the Airflow documentation.
  2. If the HTTP type is not available as an Airflow Connection Type when you create the dbnd_config connection, install the apache-airflow-providers-http package.
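
The first prerequisite can be verified with a short script. The sketch below is one possible check, not a Databand or Airflow utility: the function name lazy_load_plugins_disabled is hypothetical, and the parsing of the environment variable is simplified to a comparison with "false". It looks at the environment variable first, because Airflow gives environment variables precedence over airflow.cfg, and then falls back to the [core] section of the configuration text.

```python
import configparser
import os


def lazy_load_plugins_disabled(cfg_text: str, environ=None) -> bool:
    """Report whether lazy plugin loading is turned off.

    cfg_text is the content of airflow.cfg; environ defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    env_value = environ.get("AIRFLOW__CORE__LAZY_LOAD_PLUGINS")
    if env_value is not None:
        # Simplified: only the literal "false" (any case) counts as disabled.
        return env_value.strip().lower() == "false"
    # Disable interpolation so '%' characters in airflow.cfg are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # Airflow defaults to lazy_load_plugins = True, so a missing key
    # means lazy loading is still enabled.
    return not parser.getboolean("core", "lazy_load_plugins", fallback=True)
```

To use it against a real installation, read the airflow.cfg file into a string and pass it as cfg_text; the location of that file depends on your AIRFLOW_HOME setting.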

Installing the dbnd-airflow-auto-tracking package on a self-managed Airflow

Complete the following steps to integrate with a self-managed Airflow.

Installing new Python packages on managed Airflow environments triggers an automatic restart of the Airflow scheduler.

Install the Databand Airflow Python package on your self-managed Airflow. To prevent an unintended upgrade when a new version of the package is released, pin the package version in one of the following ways:

  • Run the following pip command. To find the most recent version number, see the dbnd-airflow-auto-tracking page in the Python package repository.
pip install dbnd-airflow-auto-tracking==<version number>
  • Add the package, with the same ==<version number> pin, to your Dockerfile or requirements.txt file.
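
After installation, you might want to confirm that the environment contains the version you pinned. The following sketch uses the standard-library importlib.metadata module for that purpose. The helper names installed_version and matches_pin are illustrative, and the version string you compare against is whatever you chose in the step above.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package: str, pinned: str) -> bool:
    """Return True only if the package is installed at exactly the pinned version."""
    return installed_version(package) == pinned
```

For example, matches_pin("dbnd-airflow-auto-tracking", "<version number>") returns False when the package is missing or when a different version is installed. The same installed_version helper can be used to check whether apache-airflow-providers-http is present.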

After you install the Databand Python package on your self-managed Airflow, continue with the second step of the integration: creating the Databand Airflow monitor DAG. For more information, see Creating the Databand Airflow monitor DAG.