Glossary

This glossary provides terms and definitions for Databand.

  • See refers you from a nonpreferred term to the preferred term or from an abbreviation to the spelled-out form.
  • See also refers you to a related or contrasting term.

A

agent
An installed component that runs operations on a computer in a customer environment.
aggregation
The process of collecting, interpreting, and sorting data from various locations into a single file.
alert
A notification that is triggered by specific conditions or events within a dataset, pipeline, or task execution.
alert condition
The statement of criteria that trigger an alert.
alert definition
The criteria that describe how to send an alert and who to send it to.
alert payload
A JSON object that transmits information about an alert to an alert receiver.
See also receiver.
annotate
To add metadata to an object to describe services and data.
anomaly
A deviation from the expected baseline behaviors.
See also anomaly alert.
anomaly alert
An alert that is triggered when a pipeline run or task has a value that is anomalous, as defined by previous task or pipeline runs.
See also anomaly.
anomaly detection
The process of monitoring and isolating activity that falls outside of normal patterns across time, location, and user and traffic behavior.
API
See application programming interface.
API call
A request for information from one API to the endpoint of another API.
API credential token
See personal access token.
API endpoint
The address of an API or service in an environment. An API exposes an endpoint and at the same time invokes the endpoints of other services.
API key
A unique code that is passed to an API to identify the calling application or user, and to send tracking data. An API key is used to track and control how the API is being used, for example, to prevent malicious use or abuse of the API.
app integration
The connection of different applications to transfer and synchronize data and processes.
application programming interface (API)
An interface that allows an application program that is written in a high-level language to use specific data or functions of the operating system or another program.
asset
A manageable object that is monitored within the Databand environment.
See also pipeline, dataset, source.
authentication
The process of validating the identity of a user or server.
AuthN
See authentication.

C

client
A software program or computer that requests services from a server.
collaborator
The person or group of people who are assigned to an alert. Assigning collaborators allows users to more easily find the alerts that they own or that are relevant to them.
column
The vertical component of a database table. A column has a name and a particular data type (for example, character, decimal, or integer).
See also schema.
component
An image, binary, or source code repository that is included in an application definition.
custom task metric alert
An alert based on a metric that the user defines, by using the log_metric function, in order to track any specific criterion.

D

DAG
See directed acyclic graph.
data at rest
Data that is stored and not moving from one location to another.
See also data in motion.
data count
The number of rows that are present in a dataset.
data delay alert
An alert based on a metric that detects if an update to a dataset completed by a predefined daily or hourly target.
data in motion
Data that is actively being transferred from one location to another.
See also data at rest.
data platform
A combination of tools that collect, organize, and analyze data.
data quality alert
An alert based on a metric that identifies the quality of a dataset, including the detection of null and duplicate values.
data quality query
An inquiry that uses SQL queries to identify anomalies or other issues in datasets.
data quality validation
In Databand, SQL-based checks and rules that are used to assess the quality of data within data pipelines. The checks are designed to validate and ensure that data meets specific quality standards before it moves further along the pipeline.
data quality validation rule
The specific SQL expression that is used in a query to check the quality of datasets.
dataset
A structured collection of data that serves as an input or output for tasks within a pipeline.
See also pipeline.
dataset metric
A quantitative value that is used to assess the performance, quality, or status of a dataset.
dataset operation
An action that is performed on a dataset, including reading, writing, and processing.
dataset path
The URI that is associated with the logged dataset.
dataset record count
A report of how many rows, also called records, are present in a specific dataset.
data stack
A combination of software and hardware tools that are used for collecting, storing, processing, analyzing, and visualizing data.
data warehouse
A large, centralized repository of data that is collected from various sources and used for reporting and data analysis. It primarily stores structured and semi-structured data, enabling businesses to make informed decisions.
data warehouse integration
In Databand, an integration that provides end-to-end observability of a data stack. This includes defining data quality alerts for data at rest, verifying that updates are happening on time, and monitoring both individual transaction volume and the total overall volume of tables over time.
directed acyclic graph (DAG)
A directed graph with no path that starts and ends at the same vertex. In Databand, it provides a visual representation of the sequence and dependencies of tasks within a single pipeline run. Each task in the pipeline is depicted as a node, and the edges (arrows) between them indicate the order and dependencies, showing the flow of execution.
See also pipeline.
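
The directed acyclic graph entry above can be illustrated with a minimal sketch. It uses Python's standard graphlib module, not any Databand API, and the task names (extract, transform, validate, load) are hypothetical. Each task is mapped to the tasks it depends on, and a topological sort yields a valid execution order or reports a cycle.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"validate"},
}


def execution_order(deps):
    """Return one valid task order; raises CycleError if deps contains a cycle."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the dependency graph is a DAG."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

Because this example is a simple chain, only one order is possible. A graph with branches can have several valid orders, and a graph in which two tasks depend on each other is rejected as cyclic.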

E

ELT
See extract, load, and transform.
ETL
See extract, transform, and load.
extract, load, and transform (ELT)
The process of extracting data from one or more sources, loading it directly into a relational database, and then using the database engine to run data transformations.
See also extract, transform, and load.
extract, transform, and load (ETL)
The process of collecting data from one or more sources, cleansing and transforming it, and then loading it into a database.
See also extract, load, and transform.

F

facet
In Databand, a metadata item that is based on the OpenLineage platform and can be added to a run, job, or dataset to add context.
filter dimension
A category by which results can be filtered. For example, filter dimensions for pipelines in Databand include project and source.

G

group
A logical organization of users whose membership allows them to perform the same activities or gives them the same authority to access resources.
See also role-based access control.

H

header
A part of an API call that includes metadata that is related to the request.

I

integration
A connection to an external source to enable the movement of metadata from an external service. In Databand, integrations come in the form of persistent monitors, SDK implementations, or APIs.

J

job
A specific pipeline definition that runs periodically, as determined by an external system.
See also orchestration.

L

library
A collection of model elements, including business items, processes, tasks, resources, and organizations.
lineage
The visualization of data flow and dependencies across various datasets within and outside of a pipeline. It provides a comprehensive map of how data moves through the system and shows the relationships between datasets.
live view
A page of the Databand UI that displays all currently running pipelines, and any anomalies that were detected on them.
log
A record that captures detailed information about events, operations, and interactions that occur within data pipelines and workflows.

M

mean time to detect (MTTD)
The average amount of time it takes to identify a failure.
mean time to resolve (MTTR)
The average amount of time from the start of a failure to when the failure is resolved and operations resume.
metadata
Data that describes the characteristics of data; descriptive data.
metadata store
A database that stores all metadata collected from the systems that the user integrates with Databand, which includes pipeline and dataset metrics, pipeline and task logs, and the code behind pipeline definitions.
metric
A numerical value that is used to assess the performance, quality, or status of datasets, tasks, and pipelines.
missing operation alert
An alert based on a metric that compares dataset runs to historical runs and detects whether the current run lacks a particular operation.
monitor
An entity that performs measurements to collect data pertaining to the performance, availability, reliability, or other attributes of applications or the systems on which the applications rely. These measurements can be compared to predefined thresholds. If a threshold is exceeded, administrators can be notified or predefined automated responses can be performed.
MTTD
See mean time to detect.
MTTR
See mean time to resolve.
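
As a rough illustration of the MTTD and MTTR entries above, the following sketch computes both values from a list of incident records. It assumes that both metrics are averages measured from the start of each failure. The record fields (started, detected, resolved) and the sample timestamps are hypothetical and do not come from Databand.

```python
from datetime import datetime, timedelta

# Hypothetical incident records with start, detection, and resolution times.
incidents = [
    {
        "started": datetime(2024, 1, 1, 8, 0),
        "detected": datetime(2024, 1, 1, 8, 10),
        "resolved": datetime(2024, 1, 1, 9, 0),
    },
    {
        "started": datetime(2024, 1, 2, 14, 0),
        "detected": datetime(2024, 1, 2, 14, 30),
        "resolved": datetime(2024, 1, 2, 16, 0),
    },
]


def mean_delta(records, start_key, end_key):
    """Average the elapsed time between two timestamps across all records."""
    total = sum((r[end_key] - r[start_key] for r in records), timedelta())
    return total / len(records)


# Detection took 10 and 30 minutes; resolution took 60 and 120 minutes.
mttd = mean_delta(incidents, "started", "detected")
mttr = mean_delta(incidents, "started", "resolved")
print(mttd, mttr)
```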

O

operations data quality alert
An alert based on a metric that tracks data quality for the datasets that are logged in pipelines.
orchestration
The automated configuration, management, and coordination of computer systems, applications, and services.
origin
The pipeline, dataset, task, or run from which an alert is created.
outstanding alert
An alert that was triggered or acknowledged but not yet resolved.

P

payload
The body of a message that holds content.
personal access token
A value used by the consumer to gain access to protected resources on behalf of the user, instead of using the user's service provider credentials.
See also API credential token.
pipeline
A structured sequence of data processing tasks that is performed on a dataset.
See also asset, directed acyclic graph.
pipeline dbt test alert
An alert based on a metric that detects the failure of a dbt test.
pipeline duration alert
An alert based on a metric that detects a pipeline duration that falls outside of defined limits or is anomalous.
pipeline state alert
An alert based on a metric that detects when a pipeline failed, succeeded, or entered another specific state.
pipeline schema change alert
An alert based on a metric that compares dataset schemas between runs to detect new columns, removed columns, and changed data types in columns.
See also schema.
pipeline SLA alert
An alert based on a metric that detects a pipeline that has not started or completed according to predefined SLA parameters.
See also service level agreement.
POST
In HTTP, a parameter on the METHOD attribute of the FORM tag that specifies that a browser will send form data to a server in an HTTP transaction separate from that of the associated URL.
privilege
The capability of performing a specific function, sometimes on a specific object.
project
A group of pipelines from a source that are integrated for monitoring.

R

RBAC
See role-based access control.
receiver
A channel that captures and processes alerts or notifications that are triggered within a run.
record
The storage representation of a row or other data.
resource unit (RU)
A measure of the amount of data that a specific pipeline uses.
role
A set of permissions or access rights.
See also role-based access control.
role-based access control (RBAC)
The process of restricting access to integral components of a system based on user authentication, roles, and permissions.
See also group, role.
root cause analysis
The process of determining the first, or root, cause of a system failure, based upon the examination of the total set of problem-related artifacts within the system.
RU
See resource unit.
run
An execution instance of a pipeline, representing one specific occurrence.

S

schema
The structure of a dataset, which is described by its columns and their data types.
See also column.
SDK
See software development kit.
SDK configuration
The collection of parameters that are specified within the source system, and which instruct the Databand SDK on how to connect to the Databand web application and specify which metadata to collect. Definitions for these parameters can be built directly into the user’s code as part of a Databand tracking context or can be set by using environment variables.
self-hosted Databand
An on-premises Databand deployment that is managed by the customer in their data center. The customer owns and maintains the data, computing network, and application deployment.
sensitivity
The amount by which a threshold-based indicator must exceed its threshold before an alert is generated.
service level agreement (SLA)
A contract between a customer and a service provider that specifies the expectations for the level of service with respect to availability, performance, and other measurable objectives.
SLA
See service level agreement.
software development kit (SDK)
A set of tools, APIs, and documentation to assist with the development of software in a specific computer language or for a particular operating environment.
source
An outside platform or tool that is integrated with Databand to track assets.
See also asset.
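
The sensitivity entry above can be sketched as a simple check. This is an illustration only, not Databand's implementation: the function name should_alert and its parameters are hypothetical, and it assumes that an alert is generated only when the value exceeds the threshold by more than the sensitivity amount.

```python
def should_alert(value, threshold, sensitivity):
    """Return True only if value exceeds threshold by more than sensitivity.

    Assumption: the comparison is strict, so a value that lands exactly on
    threshold + sensitivity does not generate an alert.
    """
    return value > threshold + sensitivity


# With a threshold of 100 and a sensitivity of 10, a value of 105 is over
# the threshold but still inside the sensitivity margin.
print(should_alert(105, 100, 10))
print(should_alert(112, 100, 10))
```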

T

tables data quality alert
An alert based on a metric that tracks the data in a specific database and triggers an alert if the column data meets the chosen condition.
tag
An identifier that groups related artifacts.
task
A data processing operation such as data extraction, transformation, or loading within a run or pipeline.
task duration alert
An alert based on a metric that detects a task duration that falls outside of defined limits or is anomalous.
task state alert
An alert based on a metric that detects when a task failed, succeeded, or entered another specific state.
trend
A series of related measurements that indicates a defined direction or a predictable future result.

W

widget
A graphic element, such as a chart or grid, that displays a particular type of information in a dashboard or workspace.