Special considerations for managed Airflow providers
You can install the dbnd-airflow-auto-tracking Python package on other managed Airflow providers: Astronomer, Amazon Managed Workflows, or Google Cloud Composer. To integrate them with Databand, you also need to provide their URL in the Databand UI in the last step of the integration process; see Adding and configuring an Airflow integration.
Installing the Python package on Astronomer
With Astronomer, you can build, run, and manage data pipelines as code at enterprise scale.
Redeploying the Airflow image triggers a restart of your Airflow scheduler.
- Install the dbnd-airflow-auto-tracking library by customizing the Astronomer Docker image, rebuilding it, and deploying it; see Customize your image on Astronomer Software.
- In your Astronomer folder, add the following line to your requirements.txt file:
dbnd-airflow-auto-tracking==REPLACE_WITH_DATABAND_VERSION
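As a sketch, the change amounts to appending the pin to your project's requirements.txt and redeploying; the placeholder version is the one from the step above, and the astro deploy command assumes you use the Astronomer CLI.

```shell
# Append the tracking package pin to the Astronomer project's requirements.txt.
# Keep the placeholder until you know the dbnd version matching your Databand release.
echo 'dbnd-airflow-auto-tracking==REPLACE_WITH_DATABAND_VERSION' >> requirements.txt

# Rebuild and deploy the image (assumes the Astronomer CLI is installed):
# astro deploy
```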
Getting the Astronomer Airflow URL
To get the Astronomer Airflow URL:
- Go to the Astronomer control page and select the Airflow deployment.
- In the Settings tab, click Open Airflow and copy the URL without the /home suffix. The URL is provided in the following format:
http://deployments.{your_domain}.com/{deployment-name}/airflow
The Astronomer UI shows your {deployment-name} as Release Name.
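A quick way to trim the suffix from the URL you copied out of the browser; the deployment URL below is a made-up example.

```shell
# Strip the trailing "/home" that the browser address bar shows, keeping the
# base Airflow URL that Databand expects. The URL here is a hypothetical example.
copied_url='http://deployments.example.com/galactic-sun-1234/airflow/home'
airflow_url="${copied_url%/home}"   # shell suffix removal
echo "$airflow_url"
```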
Installing the Python package on Amazon Managed Workflows
Amazon Managed Workflows is a managed Apache Airflow service that lets you set up and operate end-to-end data pipelines in the AWS cloud at scale.
Saving the following change to your MWAA environment configuration triggers a restart of your Airflow scheduler.
To integrate with Amazon Managed Workflows:
- Go to AWS MWAA > Environments > {mwaa_env_name} > DAG code in Amazon S3 > S3 Bucket.
- In MWAA’s S3 bucket, update your requirements.txt file.
- Install the package by adding the following line:
dbnd-airflow-auto-tracking==REPLACE_WITH_DATABAND_VERSION
For more information on integration, see Installing Python dependencies - Amazon Managed Workflows for Apache Airflow.
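If you prefer the command line over the console, the same change can be sketched with the AWS CLI; the bucket and environment names below are placeholders, and the commands assume you have AWS credentials configured.

```shell
# Upload the updated requirements file to the bucket MWAA reads from
# (bucket name is a placeholder).
aws s3 cp requirements.txt s3://my-mwaa-bucket/requirements.txt

# Update the environment so the scheduler restarts and reinstalls dependencies
# (environment name is a placeholder).
aws mwaa update-environment \
    --name my-mwaa-env \
    --requirements-s3-path requirements.txt
```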
Getting the MWAA URL
The Airflow URL is located in the AWS Console. To get the URL:
- Go to AWS MWAA > Environments > {mwaa_env_name} > Details > Airflow UI.
- Copy the URL from the Airflow UI field. The URL is provided in the following format:
https://<guid>.<aws_region>.airflow.amazonaws.com
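The same URL can also be read with the AWS CLI, which may be handy for scripting; the environment name below is a placeholder.

```shell
# Print the webserver URL of an MWAA environment (name is a placeholder).
aws mwaa get-environment \
    --name my-mwaa-env \
    --query 'Environment.WebserverUrl' \
    --output text
```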
Installing the Python package on Google Cloud Composer
With Google Cloud Composer, which is a fully managed data workflow orchestration service, you can author, schedule, and monitor pipelines. For more information, see Google Cloud Composer.
Saving the following change to your Cloud Composer environment configuration triggers a restart of your Airflow scheduler.
To integrate with Google Cloud Composer:
- Go to Google Cloud > Environments > {your_env_name} and click the PyPI Packages tab.
- Click Edit and update your Cloud Composer environment's PyPI packages, replacing REPLACE_WITH_DBND_VERSION with the pinned version of the library to install (for example, 1.0.20.1):
dbnd-airflow-auto-tracking==REPLACE_WITH_DBND_VERSION
The updated line would look as follows:
dbnd-airflow-auto-tracking==1.0.20.1
For more information, see Installing a Python dependency from PyPI.
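The same package update can be sketched with the gcloud CLI instead of the console; the environment name and location below are placeholders.

```shell
# Install or bump the pinned package on an existing Cloud Composer environment
# (environment name and location are placeholders).
gcloud composer environments update my-composer-env \
    --location us-central1 \
    --update-pypi-package dbnd-airflow-auto-tracking==1.0.20.1
```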
Disabling lazily loaded plug-ins in Google Cloud Composer
For Google Cloud Composer, disable lazily loaded plug-ins as follows:
- Go to Composer page > {composer_env_name} > Airflow configuration variables and select your instance name.
- Set the [core] lazy_load_plugins option to False.
- Submit the entered data.
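The steps above can also be sketched with the gcloud CLI; the environment name and location are placeholders.

```shell
# Set [core] lazy_load_plugins to False via the CLI instead of the console
# (environment name and location are placeholders).
gcloud composer environments update my-composer-env \
    --location us-central1 \
    --update-airflow-configs core-lazy_load_plugins=False
```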
Getting the Cloud Composer URL
To get the Cloud Composer URL, go to the Google Cloud Console: Composer > {composer_env_name} > Environment Configuration > Airflow web UI.
The URL is provided in the following format:
https://<guid>.appspot.com
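The URL can also be read from the CLI, which is useful when wiring the value into automation; the environment name and location below are placeholders.

```shell
# Print the Airflow web UI URL of a Cloud Composer environment
# (environment name and location are placeholders).
gcloud composer environments describe my-composer-env \
    --location us-central1 \
    --format 'value(config.airflowUri)'
```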