Date: 2024-07-01

Data lineage is the discipline of understanding how data flows through your organization: where it comes from, where it goes, and what happens to it along the way. Often used in support of regulatory compliance, data governance and technical impact analysis, data lineage answers these questions and more.

Whenever anyone talks about data lineage and how to achieve it, the spotlight tends to shine on automation. That focus is understandable: automating how lineage is calculated and established is crucial to understanding and maintaining a trustworthy system of data pipelines. After all, the “utopia” of lineage is full automation, where lineage tracking becomes a hands-off operation that needs no human intervention.

Far less is said about descriptive, or manually derived, lineage (also called custom technical lineage or custom lineage), an equally important tool for delivering a comprehensive lineage framework. Unfortunately, descriptive lineage doesn’t get the attention or recognition it deserves. Say “manual stitching” among data professionals and everyone cringes and runs.

In her book Data Lineage from a Business Perspective, Dr. Irina Steenbeek introduces the concept of descriptive lineage as “a method to record metadata-based data lineage manually in a repository.”
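To make that definition concrete, here is a minimal sketch of what recording lineage manually in a repository could look like. It is only an illustration under stated assumptions: `LineageEdge`, `LineageRepository`, the asset names and the `recorded_by` field are all hypothetical, and the in-memory list stands in for whatever metadata repository or catalog an organization actually uses.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class LineageEdge:
    """One manually recorded hop: data moves from source to target."""
    source: str
    target: str
    transformation: str  # free-text description typed in by a person
    recorded_by: str     # who documented this hop (hypothetical field)


@dataclass
class LineageRepository:
    """Hypothetical in-memory stand-in for a metadata repository."""
    edges: List[LineageEdge] = field(default_factory=list)

    def record(self, source: str, target: str,
               transformation: str, recorded_by: str) -> LineageEdge:
        """Store one hand-written lineage hop."""
        edge = LineageEdge(source, target, transformation, recorded_by)
        self.edges.append(edge)
        return edge

    def upstream(self, asset: str) -> Set[str]:
        """Return every asset that feeds `asset`, directly or indirectly."""
        found: Set[str] = set()
        stack = [asset]
        while stack:
            current = stack.pop()
            for edge in self.edges:
                if edge.target == current and edge.source not in found:
                    found.add(edge.source)
                    stack.append(edge.source)
        return found


# Illustrative asset names only; none of these come from a real system.
repo = LineageRepository()
repo.record("crm.customers", "staging.customers",
            "nightly export, PII masked", "data steward")
repo.record("staging.customers", "warehouse.dim_customer",
            "deduplicated on email", "data steward")

print(sorted(repo.upstream("warehouse.dim_customer")))
# prints ['crm.customers', 'staging.customers']
```

Even in this toy form, the hand-entered hops answer the questions lineage is meant to answer: where the data came from and what happened to it along the way.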