What is Data Lineage?


In this post I want to talk about something that sounds a bit daunting but is actually super helpful for Data Governance: Data Lineage.

What is Data Lineage? 

In its simplest form, Data Lineage can be thought of as a diagram that shows how data flows through an organisation, starting from the point where it first enters.

For example, imagine a customer placing an order on a website. That's where the data journey begins. Then it might travel through various systems like order processing and inventory management, before landing in an organisation’s data warehouse for reporting. 

Now, that is a very straightforward example and of course things can get more complex than that, but the purpose of Data Lineage remains the same - to show what systems and processes your data goes through no matter how simple or complex. 
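
If it helps to see the order example written down, here is a minimal sketch of that flow as a simple mapping in Python. The system names are made up for illustration, and I've assumed each system passes data straight on to the next one, which real organisations rarely manage.

```python
# Hypothetical system names based on the order example above.
# Each key is a system; its value lists the systems it sends data to.
lineage = {
    "website_orders": ["order_processing"],
    "order_processing": ["inventory_management"],
    "inventory_management": ["data_warehouse"],
    "data_warehouse": [],
}


def downstream_path(graph, start):
    """Follow the first outgoing hop from each system until the flow ends."""
    path = [start]
    seen = {start}
    while graph.get(path[-1]):
        next_system = graph[path[-1]][0]
        if next_system in seen:  # stop if the flow loops back on itself
            break
        path.append(next_system)
        seen.add(next_system)
    return path


print(" -> ".join(downstream_path(lineage, "website_orders")))
```

A high-level Data Lineage diagram is really just a picture of a mapping like this one.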


The benefits and challenges of Data Lineage

Sometimes data takes unexpected routes when it is being moved from system to system, which can lead to hiccups. That's where Data Lineage comes in handy. It can help you spot potential issues and understand how your data is flowing.

Nevertheless, creating Data Lineage diagrams can be challenging at times. There are automated tools that can scan your databases and generate the lineage for you. The problem is that they often churn out tons of very detailed diagrams, which can be overwhelming if you don't need that level of detail.

My advice? Keep it simple.

Start by focusing on the most important data for your organisation and work backwards. Ask those who use that data where they get it from, then follow the breadcrumbs all the way back. I say this because it's really hard to work forwards when the lineage has never been documented before.
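
To make the "work backwards" idea a bit more concrete, here is a rough sketch in Python. It assumes you have written down each answer to "where do you get this from?" as a simple mapping. The dataset and system names, and the helper functions `trace_upstream` and `origins`, are hypothetical and only there for illustration.

```python
from collections import deque

# Hypothetical answers gathered by asking each data user where their data comes from.
# Each key is a dataset or system; its value lists its direct sources.
sources = {
    "sales_report": ["data_warehouse"],
    "data_warehouse": ["inventory_management", "order_processing"],
    "inventory_management": ["order_processing"],
    "order_processing": ["website_orders"],
    "website_orders": [],
}


def trace_upstream(sources, target):
    """Walk backwards from target and return everything that feeds it."""
    found = []
    seen = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for source in sources.get(node, []):
            if source not in seen:  # record each upstream system only once
                seen.add(source)
                found.append(source)
                queue.append(source)
    return found


def origins(sources, target):
    """Upstream systems with no recorded source: where the data journey begins."""
    return [node for node in trace_upstream(sources, target) if not sources.get(node)]


print(trace_upstream(sources, "sales_report"))
print(origins(sources, "sales_report"))
```

Starting from the report people actually use, each conversation adds one more entry to the mapping, and the systems with no sources left to ask about are your starting points.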

Another thing I'd recommend, if you're not sure where your data starts, is to talk to some experienced, long-standing business analysts in your organisation. They probably have a good idea of where the data comes from and how it flows.

So, there you have it. Data Lineage isn't scary - it's actually fairly simple to create high-level Data Lineage diagrams when you break it all down first.

Prefer this content in video form? Click here to watch the video.

If you found this helpful and would like to know more about Data Governance, feel free to book a call with me.
