Special considerations for managed Airflow providers

You can install the dbnd-airflow-auto-tracking Python package on other managed Airflow providers: Astronomer, Amazon Managed Workflows for Apache Airflow (MWAA), or Google Cloud Composer. To integrate them later with Databand, you will also need to provide the Airflow URL in the Databand UI in the last step of the integration process; see Adding and configuring an Airflow integration.

Installing the Python package on Astronomer

With Astronomer, you can build, run, and manage data pipelines as code at enterprise scale.

Redeploying the Airflow image triggers a restart of your Airflow scheduler.

  1. Install the dbnd-airflow-auto-tracking library by customizing the Astronomer Docker image, rebuilding it, and deploying it; see Customize your image on Astronomer Software.

  2. In your Astronomer project folder, add the following line to your requirements.txt file (a command-line sketch of the full flow follows):

dbnd-airflow-auto-tracking==REPLACE_WITH_DATABAND_VERSION
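
As a rough sketch of the full flow, assuming you use the Astro CLI and a project folder named my-astronomer-project (both are placeholders), the update and redeploy could look like this:

cd my-astronomer-project
# pin the package version; 1.0.20.1 is only an example
echo "dbnd-airflow-auto-tracking==1.0.20.1" >> requirements.txt
# rebuild the image and deploy it; this restarts the Airflow scheduler
astro deploy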

Getting the Astronomer Airflow URL

To get the Astronomer Airflow URL:

  1. Go to the Astronomer control page and select the Airflow deployment.
  2. In the Settings tab, click Open Airflow and copy the URL without the /home suffix. The URL is provided in the following format:

http://deployments.{your_domain}.com/{deployment-name}/airflow

The Astronomer UI shows your {deployment-name} as the Release Name.

Installing the Python package on Amazon Managed Workflows

Amazon Managed Workflows for Apache Airflow (MWAA) is a managed Apache Airflow service that lets you set up and operate end-to-end data pipelines in the AWS cloud at scale.

Saving the following change to your MWAA environment configuration triggers a restart of your Airflow scheduler.

To integrate with Amazon MWAA:

  1. Go to AWS MWAA > Environments > {mwaa_env_name} > DAG code in Amazon S3 > S3 Bucket.
  2. In MWAA’s S3 bucket, update your requirements.txt file.
  3. Install the package by adding the following line (a command-line alternative is sketched below):
dbnd-airflow-auto-tracking==REPLACE_WITH_DATABAND_VERSION

For more information on integration, see Installing Python dependencies - Amazon Managed Workflows for Apache Airflow.
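
If you prefer the command line, a minimal sketch with the AWS CLI (the bucket name, environment name, and object version below are placeholders) could look like this:

# upload the updated requirements file to the environment's S3 bucket
aws s3 cp requirements.txt s3://my-mwaa-bucket/requirements.txt
# point the environment at the new file version; the update restarts the scheduler
aws mwaa update-environment --name my-mwaa-env \
  --requirements-s3-path requirements.txt \
  --requirements-s3-object-version <version-id>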

Getting the MWAA URL

The Airflow URL is located in the AWS Console. To get the URL:

  1. Go to AWS MWAA > Environments > {mwaa_env_name} > Details > Airflow UI.

  2. Copy the URL from the Airflow UI field. The URL is provided in the following format: https://<guid>.<aws_region>.airflow.amazonaws.com
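
Alternatively, a sketch with the AWS CLI (the environment name is a placeholder); note that the returned host name may lack the https:// prefix:

aws mwaa get-environment --name my-mwaa-env \
  --query 'Environment.WebserverUrl' --output text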

Installing the Python package on Google Cloud Composer

With Google Cloud Composer, which is a fully managed data workflow orchestration service, you can author, schedule, and monitor pipelines. For more information, see Google Cloud Composer.

Saving the following change to your Cloud Composer environment configuration triggers a restart of your Airflow scheduler.

To integrate with Google Cloud Composer:

  1. Go to Google Cloud > Environments > {your_env_name} and click the PyPI Packages tab.
  2. Click Edit and update your Cloud Composer environment's PyPI packages by adding the following line, replacing REPLACE_WITH_DBND_VERSION with the pinned version of the library to install (for example, 1.0.20.1); a gcloud equivalent is sketched below:
dbnd-airflow-auto-tracking==REPLACE_WITH_DBND_VERSION

An updated line of code would look as follows:

dbnd-airflow-auto-tracking==1.0.20.1

For more information, see Installing a Python dependency from PyPI.
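
The same update can also be scripted with gcloud; the environment name and location below are placeholders, and, as with the console flow, the update restarts the scheduler:

gcloud composer environments update my-composer-env \
  --location us-central1 \
  --update-pypi-package dbnd-airflow-auto-tracking==1.0.20.1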

Disabling lazily loaded plug-ins in Google Cloud Composer

For Google Cloud Composer, disable lazily loaded plug-ins as follows (a gcloud equivalent is sketched after the steps):

  1. Go to the Composer page, select your environment instance ({composer_env_name}), and open Airflow configuration variables.
  2. Set the [core] lazy_load_plugins option to False.
  3. Save your changes.
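
As a gcloud sketch (the environment name and location are placeholders), note that the configuration section and option name are joined with a dash:

gcloud composer environments update my-composer-env \
  --location us-central1 \
  --update-airflow-configs=core-lazy_load_plugins=False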

Getting the Cloud Composer URL

To get the Cloud Composer URL, go to the Google Cloud Console: Composer > {composer_env_name} > Environment Configuration > Airflow web UI.

The URL is provided in the following format: https://<guid>.appspot.com
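
Alternatively, a sketch with gcloud (the environment name and location are placeholders) that prints only the web UI URL:

gcloud composer environments describe my-composer-env \
  --location us-central1 \
  --format="value(config.airflowUri)"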