Runs
Overview
The Runs page displays all runs across all pipelines that have been observed by Databand. You can switch between active and archived runs by using the tabs at the top of the page.
Runs can be filtered using the Add Filters button at the top of the page. Available filters include:
- Source - Your data source integrations. This list generally corresponds to the integrations shown on the Integrations page in your environment, though there are some exceptions.
- Project - Any project tags that have been parsed from your monitored pipelines.
- Pipeline - All pipelines that have been monitored by Databand. This can be used to restrict the runs list to only a specific pipeline. When clicking on a pipeline name from the Pipelines page, the Runs page will automatically be filtered to the pipeline that was clicked.
- Run State - The final run state reported to Databand by the integration. When debugging a failed run, it may be helpful to filter this list to only failed runs to determine whether any recurring issues might exist.
- Time Range - The window of time you want to use to limit the list of runs. By default, this page will present runs for the past 7 days.
Runs List
Run details are summarized in table format. The following fields are included:
- Run Name - The timestamp of the run, the pipeline/project/source for the run, and the user who triggered the run. Clicking the timestamp of the run will take you to the Run Overview page.
- End Time - The time the run completed.
- Duration - The total duration of the run.
- Fired Alerts - Displays the total number of outstanding (i.e. triggered or acknowledged but not resolved) alerts associated with the run. Mousing over this value will show the breakdown of alert totals by alert severity. Clicking this value will take you to a list of the outstanding alerts.
- State - The final state of the run as reported by the data source integration.
- Tasks Progress - A summary of the task states for the run. Mousing over a state will show all tasks that reported that state in the run. Clicking any of the task state counts will take you to the Run Overview page.
Archiving Runs
Runs can be archived in Databand by checking one or more boxes at the far left of the runs list and then clicking the Archive button that appears above the top left of the table. Archiving a run affects only that run's visibility in the Databand UI.
Archived runs can still be viewed at any time by switching to the Archived tab at the top of the Runs page. To unarchive a run, go to the Archived tab, check one or more boxes at the far left of the runs list, and then click the Unarchive button that appears above the top left of the table.
Run Overview
The Run Overview page shows all metadata collected for a single run of your pipeline and can be reached by clicking on the timestamp of a run from the runs list on the Runs page. This screen is divided into halves: a metadata summary on the left, and a graphical view of your pipeline on the right.
The metadata summary on the left is divided across multiple tabs:
- Overview - A high-level summary of your run, including a run duration trend graph, information about failed tasks, and access to in-depth metadata such as pipeline parameters.
- Metrics - A list of all metrics collected as part of the current pipeline run. Metrics include those natively collected by Databand, such as states, durations, and data set profiles, as well as custom metrics that you define within your code. Each metric belongs to a task in your pipeline. You can quickly create an alert on any metric from this list by clicking the bell icon to the right of the metric name.
- Logs - Any logs collected from your pipeline run. Most of the time, logs are specific to tasks.
- Code - The source code for your pipeline. In some cases, task-level code may be available. This can include things such as the Python code behind your Airflow DAG, the JSON definition of your ADF pipeline, or the command submitted to dbt.
- Data Interactions - A summary of all inputs and outputs logged during your pipeline run. Nearly any data set operation should report the data set type, path, schema, and record count at a minimum. In cases where dataframes have been logged in Spark or pandas, you can drill into the schema of a data set operation to see column-level statistics.
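The custom metrics surfaced on the Metrics tab are typically reported from your pipeline code. The sketch below is a minimal, hedged example assuming the open-source dbnd Python SDK and its log_metric function; the record data and the metric name null_amount_count are illustrative, and a stub is substituted if dbnd is not installed so the example still runs:

```python
# Sketch: reporting a custom data-quality metric to Databand.
# Assumes the open-source `dbnd` SDK is available; if it is not
# installed, a print-only stub stands in for log_metric.
try:
    from dbnd import log_metric
except ImportError:
    def log_metric(key, value):  # stub for illustration only
        print(f"[stub] metric {key}={value}")

# Illustrative records processed by a pipeline task.
records = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 7.5},
]

# A custom metric: how many records are missing an amount.
null_amounts = sum(1 for r in records if r["amount"] is None)
log_metric("null_amount_count", null_amounts)
```

When this runs inside a tracked pipeline, the metric appears on the Metrics tab under the task that reported it, where it can also be used as an alert condition.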
The graphical view of your pipeline on the right half of the screen will show the flow between your tasks as well as the state and duration of each task. Clicking on a specific task in the graph will filter all metadata presented on the left half of the screen to only the metadata collected during that task. For example, if you only want to see the data set operations that were logged as part of a single task in your pipeline, you can switch to the Data Interactions tab on the left and then click on the task name in the graph to the right.