Data lineage, end to end.
A practitioner-level hub for everything we've written about tracing data through your stack — what lineage is, why it matters, how to make it operational, and what to do when the tooling fights back.
What is data lineage?
Data lineage is the documented path a data element takes from its origin (an operational system, a third-party feed, a manual upload) through every transformation, aggregation, and consumption point until it lands somewhere a human or system reads it. Lineage answers the question "where did this number come from?" and, more importantly, "what breaks if I change this upstream?"
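Both questions reduce to walking a dependency graph in opposite directions. The sketch below is a minimal illustration, not a reference implementation: the asset names and the `EDGES` list are hypothetical, and it assumes lineage has already been captured as simple (upstream, downstream) pairs.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (upstream asset, downstream asset).
# Every name here is illustrative, not taken from any real system.
EDGES = [
    ("crm.accounts", "staging.accounts"),
    ("vendor_feed.fx_rates", "staging.fx_rates"),
    ("staging.accounts", "marts.revenue"),
    ("staging.fx_rates", "marts.revenue"),
    ("marts.revenue", "dashboard.quarterly_revenue"),
    ("marts.revenue", "ml.churn_features"),
]


def build_index(edges):
    """Index the edges in both directions for fast traversal."""
    children, parents = defaultdict(set), defaultdict(set)
    for up, down in edges:
        children[up].add(down)
        parents[down].add(up)
    return children, parents


def reachable(start, neighbors):
    """Breadth-first walk returning every asset reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


CHILDREN, PARENTS = build_index(EDGES)


def upstream(asset):
    """Where did this come from? All direct and indirect sources."""
    return reachable(asset, PARENTS)


def downstream(asset):
    """What breaks if I change this? All direct and indirect consumers."""
    return reachable(asset, CHILDREN)


if __name__ == "__main__":
    print(sorted(upstream("dashboard.quarterly_revenue")))
    print(sorted(downstream("staging.fx_rates")))
```

Real lineage tooling adds column-level detail, transformation logic, and ownership on top of this, but the two traversals stay the same: upstream for provenance, downstream for blast radius.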
Why it matters in 2026
Three forces have made lineage a default expectation rather than an advanced capability:
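- Regulatory mandates. BCBS 239 expects global systemically important banks to show how risk data is aggregated and reported, which in practice means maintaining risk-data lineage. The EU AI Act requires documented data governance, including the origin of training data, for high-risk systems. CCPA and GDPR practically require lineage to answer subject-access and deletion requests.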
- Data product economics. When your analytics, reporting, and ML models are themselves consumed as products by other teams, you can't ship a breaking change without knowing who depends on what. Lineage is the dependency graph.
- AI grounding. The major data catalogs (Atlan, Collibra, Alation, Select Star) now position lineage as the substrate for AI-driven discovery. Without it, AI assistants querying your data have nothing to ground their answers in, and they hallucinate confidently.
Where to start
If you're new to lineage as a practice, the canonical entry point is our practitioner-level walkthrough of implementing impact analysis without bogging the organization down in enterprise paralysis. From there, the articles below cover the specific operational decisions you'll face: how deep to capture, what to do about downstream consumers you don't control, and which tools earn their license cost vs. which can be replaced by dbt + a Snowflake account map.
All articles on data lineage
- Data Lineage for Compliance: Practical Impact Analysis
  Data lineage compliance impact analysis answers two questions regulators ask constantly: where does this data originate, and where does it travel within our sys
- How to Build a Data Stewardship Program That Lasts
  Data stewardship is a set of accountabilities for managing data quality, lineage, and governance in specific domains, but most programs fail because stewards lac
- Data Lineage Tools: Comparing Commercial, Custom, and Hybrid Approaches (2026)
  Compare data lineage tools and strategies for compliance and impact analysis. Learn when to buy, build, or hybrid, and what works at your scale.
- Data Lineage in Practice: Impact Analysis That Works
  Data lineage impact analysis implementation is a system for tracing data from source to consumer, mapping dependencies, and predicting the blast radius of chang
- Metadata Inheritance: Governance That Scales Down, Not Up
  Metadata inheritance in data governance means designing rules so that sensitivity classifications, access controls, and lineage visibility automatically flow do
- Collibra vs Alation: 2026 Data Governance Buyer's Guide
  Collibra vs Alation compared on features, pricing, and fit. Choose the right data governance platform for your catalog, lineage, and stewardship needs in 2026.
- Is Your MDM "Good Enough"? The Practitioner's Checklist
  Use this practitioner MDM checklist to assess master data management maturity across data quality, survivorship, governance integration, and operational health.
- What Is Collibra? A Practitioner's Guide to the Data Governance Platform
  Collibra combines data catalog, governance, lineage, and quality in one platform. This practitioner's guide covers what it does, how to implement it
- What Is a Data Catalog? The Complete Guide for 2026
  A data catalog is the searchable inventory of your enterprise data assets. This 2026 guide covers what it does, how to choose one, and how to drive adoption.
- Data Governance in Financial Services: 2026 Guide
  A practitioner's complete guide to data governance in financial services. Covers BCBS 239, SR 11-7, data lineage, CDEs, governance council design
- Data Ethics: Navigating the New Frontier in Data Governance
  In this article, we delve into the significance of data ethics, explore the key principles that guide ethical data usage
- Collibra vs Informatica: The Ultimate Comparison
  Collibra vs Informatica: a head-to-head comparison of features, pricing, and use cases to help data governance teams choose the right platform
- Data Lineage - Leveraging the Power of Insight to Revolutionize Data Governance
  Data lineage tracks where data comes from, how it moves, and what transforms it. Learn how lineage improves data governance, audit readiness