Achieving this holistic, proactive view comes down to five key steps of data observability:
- Pipeline execution: Is data flowing? This is table stakes, and you need to be able to confirm the answer for all the hundreds or thousands of pipelines you’re observing.
- Pipeline latency: Is data arriving on time? If you expect a run to take two minutes but it takes five hours (or even five minutes), you may be in breach of SLAs.
- Data structure: Is the data shape valid and complete? Ending up with six columns when you only expected five is a serious issue.
- Data content: Are there significant changes in the data profile? If the data includes an old record or an incorrect value, then it’s not accurate and can lead to faulty decision-making.
- Data validation: Does the data conform to how it’s being used? If not, then there’s a mismatch in terms of what teams need and what they’re getting.
The first three of these steps should provide real-time alerts to data platform and engineering teams when anything goes wrong, so that they can monitor data SLAs. Meanwhile, the last two steps should deliver custom metrics that data analytics and science teams can manage to ensure reliability.
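To make the first two checks concrete, here is a minimal sketch in plain Python. It is an illustration only, not Databand's API: the `PipelineRun` record, the `check_run` helper, the pipeline names and the two-minute SLA are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass
class PipelineRun:
    """Hypothetical record of one pipeline run (not a Databand type)."""
    name: str
    succeeded: bool
    duration: Optional[timedelta]  # None if the run never finished


def check_run(run: PipelineRun, sla: timedelta) -> List[str]:
    """Return human-readable alerts for execution and latency problems."""
    alerts: List[str] = []
    # Pipeline execution: did data flow at all?
    if not run.succeeded or run.duration is None:
        alerts.append(f"{run.name}: run failed or did not complete")
        return alerts
    # Pipeline latency: did the run finish within the agreed SLA?
    if run.duration > sla:
        alerts.append(f"{run.name}: took {run.duration}, exceeding SLA of {sla}")
    return alerts


# Assumed example data: three runs checked against a two-minute SLA.
runs = [
    PipelineRun("ingest_orders", True, timedelta(minutes=2)),
    PipelineRun("ingest_events", True, timedelta(hours=5)),
    PipelineRun("ingest_users", False, None),
]
sla = timedelta(minutes=2)
all_alerts = [alert for run in runs for alert in check_run(run, sla)]
for alert in all_alerts:
    print(alert)
```

In practice the same function would run over every pipeline being observed, with one SLA per pipeline rather than a single shared value.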
How Databand can help
Databand empowers data platform teams to deliver reliable and trustworthy data. In other words, it allows you to catch bad data before it impacts your business.
Specifically, Databand collects metadata from all key solutions in the modern data stack, builds a historical baseline of common data pipeline behavior, alerts on anomalies and rule violations when behavior deviates from that baseline, and resolves incidents through triage by creating smart communication workflows. In doing so, the Databand platform supports process quality (pipeline states, pipeline job performance, pipeline latency), data quality (data structure, data content, data freshness), and impact analysis and lineage (relationships between data and pipelines, maps of causes and impacts). As a result, Databand empowers teams to:
- Detect earlier: Pinpoint unknown data incidents and reduce mean time to detection (MTTD) from days to minutes.
- Resolve faster: Reduce mean time to resolution (MTTR) from weeks to hours with incident alerts and routing.
- Deliver trusted data: Enhance reliability and data delivery SLAs by providing visibility into pipeline quality issues.
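The idea of building a historical baseline and alerting on deviations can be sketched with a simple z-score test. This is an assumption made for illustration, not a description of how Databand detects anomalies; the `is_anomalous` helper, the threshold of three standard deviations and the sample durations are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], latest: float,
                 threshold: float = 3.0) -> bool:
    """Flag `latest` if it lies more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # too little history to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # All past values were identical, so any change is a deviation.
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


# Assumed example: past run durations in seconds for one pipeline.
durations = [118.0, 121.0, 119.5, 122.0, 120.0, 117.5]
print(is_anomalous(durations, 121.0))  # a typical run
print(is_anomalous(durations, 300.0))  # a run far outside the baseline
```

The same test could be applied to other collected metadata, such as row counts or null rates, as long as there is enough history to form a baseline.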
Now, the power of Databand will be available via a new integration with DataStage CP4D, IBM data fabric’s powerful cloud-native ingestion service. This integration will support:
- Real-time detection and alerting on incidents in DataStage flows
- 360-degree impact analysis using Databand’s runtime incident lineage to view how DataStage incidents impact downstream data
- Historical trends of different DataStage processes to detect anomalies and incidents, removing bad data surprises
Databand in action
To bring it all together, let’s take a look at how Databand makes proactive data observability a reality with two real-life use cases.
Diagnosing a data quality issue with Databand and Apache Airflow
In this case, let’s say we pull in data from New York City, with a column for each borough. But when the data comes through, we see six columns. This is an issue since we know there are actually five boroughs. While this instance is obvious, in the day-to-day handling of large datasets, issues like these typically won’t be so glaring. Databand flags this as a data quality issue, and we can investigate by clicking on the alert.
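The kind of structural check behind such an alert can be sketched as a comparison between expected and actual column names. This is a simplified illustration rather than Databand's implementation; the `diff_columns` helper is hypothetical, and the name of the sixth column is invented because the example above does not say what it is.

```python
from typing import Dict, List, Sequence

# The five boroughs of New York City, used as the expected schema.
EXPECTED_COLUMNS = ["Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island"]


def diff_columns(expected: Sequence[str],
                 actual: Sequence[str]) -> Dict[str, List[str]]:
    """Report columns that appear unexpectedly or have gone missing."""
    expected_set, actual_set = set(expected), set(actual)
    return {
        "unexpected": sorted(actual_set - expected_set),
        "missing": sorted(expected_set - actual_set),
    }


# Assumed incoming schema with a hypothetical sixth column.
incoming = EXPECTED_COLUMNS + ["Long Island"]
result = diff_columns(EXPECTED_COLUMNS, incoming)
if result["unexpected"] or result["missing"]:
    print(f"Schema alert: {result}")
```

Comparing names rather than just counting columns also catches the case where one column is dropped and another is added, which leaves the count unchanged.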