Like many successful businesses, our client has grown quickly and adopted many different technologies to meet their needs, sustain operations and facilitate future growth. This stage presents new challenges: with different systems in place and data isolated within each of them, it becomes all but impossible to get a holistic view of the business. Bigger businesses require answers to more difficult questions, and without the correct data strategy in place, opportunities are likely to be missed.
To bring our client into the modern data age, we analysed data samples from all of their systems and sources, documented them and designed a data lake to unify them. The design includes a master schema, archiving strategies, metadata and documentation stores, and data lineage auditing. It required an in-depth analysis of the existing technologies, data formats and unique data flows to arrive at a single design that can adapt to future needs while storing historic data economically.
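To make the idea of a master schema with lineage auditing more concrete, here is a minimal Python sketch. It is illustrative only and not the client's actual design: the names `MasterRecord`, `LineageRecord` and `to_master`, and the example field names, are all hypothetical. It shows one way a record from a source system could be renamed into unified field names while recording where each value came from.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class LineageRecord:
    """Hypothetical audit entry: where a unified record came from."""
    source_system: str
    source_field_map: Dict[str, str]  # source field name -> master field name
    ingested_at: datetime


@dataclass(frozen=True)
class MasterRecord:
    """Hypothetical unified record: master-schema data plus its lineage."""
    data: Dict[str, Any]
    lineage: LineageRecord


def to_master(
    source_system: str,
    raw: Dict[str, Any],
    field_map: Dict[str, str],
    ingested_at: Optional[datetime] = None,
) -> MasterRecord:
    """Rename the mapped source fields to master names and attach lineage.

    Source fields that are not in field_map are dropped; mapped fields that
    are missing from the raw record are skipped.
    """
    if ingested_at is None:
        ingested_at = datetime.now(timezone.utc)
    # Keep only the mappings that were actually applied to this record.
    used = {src: dst for src, dst in field_map.items() if src in raw}
    data = {dst: raw[src] for src, dst in used.items()}
    return MasterRecord(
        data=data,
        lineage=LineageRecord(source_system, used, ingested_at),
    )


# Example: a record from an assumed "crm" system with assumed field names.
record = to_master(
    "crm",
    {"cust_id": 7, "nm": "Ada", "internal_flag": True},
    {"cust_id": "customer_id", "nm": "name"},
)
```

Keeping the applied field mapping alongside each record is one simple way to answer "which system and which field did this value come from?" during an audit; a real data lake would more likely store lineage in a dedicated metadata store.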
Each aspect of the data lake’s architecture was considered in tandem with the others, from ETL processes that extract data from numerous places at different times to partition optimisation for end users’ queries. We also surveyed the solutions currently on the market to help our client leverage the cloud and remove the maintenance burden, and to identify off-the-shelf tools that would meet their needs without developing a completely custom and expensive alternative.
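As a rough illustration of what partition optimisation can look like, the Python sketch below builds date-based partition paths in the common `key=value` directory style. The layout, the `source` key and the function names are assumptions made for this example, not the scheme used in the project. The point is that a query covering a few days only needs to read the matching directories rather than the whole data set.

```python
from datetime import date, timedelta
from typing import Iterator


def partition_path(source: str, event_date: date) -> str:
    """Build a key=value partition path for one source system and one day."""
    return (
        f"source={source}"
        f"/year={event_date.year:04d}"
        f"/month={event_date.month:02d}"
        f"/day={event_date.day:02d}"
    )


def partitions_for_range(source: str, start: date, end: date) -> Iterator[str]:
    """Yield the partition paths a query over [start, end] would need to read."""
    current = start
    while current <= end:
        yield partition_path(source, current)
        current += timedelta(days=1)


# A three-day query touches three directories, regardless of total history.
paths = list(partitions_for_range("crm", date(2023, 1, 30), date(2023, 2, 1)))
```

Choosing the partition keys to match the filters end users apply most often (here, source system and date) is what keeps queries cheap as historic data accumulates.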