Integrating with managed Airflow providers
To provide observability over your Airflow DAGs, Databand integrates with different managed Airflow providers. Before you proceed with the integration, make sure you have network connectivity between Apache Airflow and Databand (from Apache Airflow to the Databand Server).
- Self-managed Airflow
- Astronomer
- Amazon Managed Workflows for Apache Airflow
- Google Cloud Composer
Instructions for all managed Airflow providers - prerequisites
Before you integrate with managed Airflow providers:
- Disable lazily loaded plugins for Airflow 2+ by setting core.lazy_load_plugins=False in the airflow.cfg file. You can also do this by setting the environment variable AIRFLOW__CORE__LAZY_LOAD_PLUGINS=False. For more information, see the Airflow plugins documentation.
- When you create dbnd_config and you don't have the HTTP type in your Airflow Connection Type, install the apache-airflow-providers-http package.
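To confirm the first prerequisite before you continue, you can check the effective setting from Python. The following is a minimal sketch and not part of Databand or Airflow: the helper name lazy_load_plugins_disabled and the path you pass to it are assumptions for illustration. It mirrors Airflow's precedence, in which the AIRFLOW__CORE__LAZY_LOAD_PLUGINS environment variable overrides the [core] entry in airflow.cfg, and it treats a missing setting as the Airflow default (lazy loading enabled).

```python
import configparser
import os


def lazy_load_plugins_disabled(cfg_path, environ=None):
    """Return True if lazy plugin loading is turned off.

    Hypothetical helper: checks the environment variable first, then the
    [core] section of the airflow.cfg file at cfg_path.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("AIRFLOW__CORE__LAZY_LOAD_PLUGINS")
    if value is None:
        # interpolation=None so that '%' characters in airflow.cfg are read literally
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(cfg_path)  # a missing file is silently ignored
        value = parser.get("core", "lazy_load_plugins", fallback="True")
    # Airflow reads "false", "f", and "0" (case-insensitive) as False
    return value.strip().lower() in ("false", "f", "0")
```

For example, calling lazy_load_plugins_disabled with the path to your airflow.cfg returns True only when one of the two settings described above is in effect.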
Installing the dbnd-airflow-auto-tracking package on a self-managed Airflow
Complete the following steps to integrate with a self-managed Airflow.
Installing new Python packages on managed Airflow environments triggers an automatic restart of the Airflow scheduler.
Install the Databand Airflow Python package on your self-managed Airflow. To prevent the installation from automatically moving to a different version when the package is updated, pin the version by performing one of the following steps:
- Run the following pip command. To find the most recent version of the package, see dbnd-airflow-auto-tracking in the Python package repository.
pip install dbnd-airflow-auto-tracking==<version number>
- Add the package, with its pinned version, to your Dockerfile or requirements.txt.
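After the installation, you can verify which version ended up in the environment. This is a minimal sketch that uses only the Python standard library (importlib.metadata, available in Python 3.8 and later); the helper name installed_version is hypothetical and is not provided by Databand.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="dbnd-airflow-auto-tracking"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "dbnd-airflow-auto-tracking is not installed")
```

Run the script with the same Python interpreter that Airflow uses, and compare the printed value with the version that you pinned.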
After you install the Databand Python package on your self-managed Airflow, you can move to the second step of creating an integration, which is creating the Databand Airflow Monitor DAG. For more information, see Creating the Databand Airflow monitor DAG.