One of my client engagements involves re-architecting the data and BI platforms for an 80 year old company. As you can imagine, there’s a ton of data across countless legacy systems. Data and systems have piled on top of each other over the decades. The vast majority of the data is undocumented, derived (hence, possibly unreliable), and extremely fragmented. This scenario is a 10/10 on the data brutality scale.
Given this state, how is it possible to create a new data and BI platform? Using derived data is out of the question, as its usually built upon other derived data, all of questionable quality. The simplest thing to do is get back to basics with the data and systems – use the source data.
My time is now spent locating source data using data lineage tools, reviewing lots of SQL, interviewing users, and general good old data detective work. Despite the grunt work involved, I feel good about this approach and its ability to produce a valuable, long lasting solution for this company. Sometimes the hardest path is also the simplest path.