Data is the lifeblood of the business, but its value lies in being able to turn it into insights quickly. The Economist has described data as “to this century what oil was to the last one: a driver of growth and change”, but it can often prove just as difficult to extract. The focus for many businesses now is to improve their data maturity by looking at how they can optimise their data and use it to deliver real-time insights.
There’s a continual drive to reduce latency and time to insight through the practices, processes and technologies commonly referred to as DataOps, which seeks to improve analysis. It’s a driving force that has recently turned data management on its head. Previously, the typical approach was centralised data warehousing, in which Extract, Transform and Load (ETL) processes were used to copy data from domains into a giant data lake. The emphasis now is on decentralising data to create pools that remain in the ownership of the domain.
Deliberating Data Mesh
An example of this in action is Data Mesh. It is the brainchild of Zhamak Dehghani, who has founded a stealth tech startup of the same name dedicated to reimagining data platforms with Data-Mesh-native technologies. Data Mesh sees each domain handle its own data pipelines while the mesh provides consistency in terms of syntax and standards. (It differs from Data Fabric in this respect because it also embraces a federated governance operating model.)
Data Mesh effectively enables data to be consumed as a product and has spawned the concept of Data Infrastructure-as-a-Platform. It has been compared to the shift to microservices in software engineering because it marks a sea change in how big data is managed and stored, prompting large businesses to embark on data transformation projects in a bid to access their data more rapidly, sustainably and at scale. But moving to Data Mesh needs careful consideration.
Dehghani cautions that implementing Data Mesh is not just a technological undertaking but also requires cultural change within the enterprise. The organisation needs to embrace new behaviours as well as new technology to fully reap the benefits. The approach seeks to facilitate the sharing of data in each operational domain and, in so doing, has the power to close the gap that exists between operations and analysis. For that to happen, practitioners, architects, technical leaders and decision makers all need to be involved in its adoption, making it a sociotechnical undertaking.
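To make the idea of domain-owned data with mesh-wide standards concrete, here is a minimal sketch in Python. The `DataProduct` class, the `REQUIRED_METADATA` fields and the example domains are all hypothetical, invented for illustration; they are not part of any Data Mesh specification or product.

```python
from dataclasses import dataclass

# Hypothetical mesh-wide standard: every domain must declare these
# metadata fields before its data product can be published.
REQUIRED_METADATA = ("owner", "schema_version", "classification")


@dataclass
class DataProduct:
    """A dataset published and owned by a single domain (illustrative only)."""
    domain: str
    name: str
    metadata: dict


def missing_metadata(product: DataProduct) -> list:
    """Return the required metadata fields the product has not declared."""
    return [key for key in REQUIRED_METADATA if key not in product.metadata]


# Each domain builds and owns its product; the shared check is the same for all.
orders = DataProduct(
    domain="sales",
    name="orders",
    metadata={
        "owner": "sales-team",
        "schema_version": "1.0",
        "classification": "internal",
    },
)
shipments = DataProduct(
    domain="logistics",
    name="shipments",
    metadata={"owner": "logistics-team"},
)

for product in (orders, shipments):
    gaps = missing_metadata(product)
    status = "ok" if not gaps else "missing " + ", ".join(gaps)
    print(f"{product.domain}/{product.name}: {status}")
```

The point of the sketch is the division of responsibility: the domains decide what goes into their products, while a single shared rule set, applied identically to every domain, stands in for federated governance.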
Demand for data engineers
Any data transformation project has always involved some aspect of data engineering. Previously responsible for preparing data for analysis, data engineers now occupy centre stage in data transformation. From developing or selecting data products and services to integrating them into existing systems and business processes, they determine what the Modern Data Stack (MDS) will look like. Consequently, data engineers are now in high demand, but the shift in their remit means they also need to upskill to ensure they can collaborate effectively with development teams and meet the needs of data analysts.
Vendors are also continually pushing the boundaries of what can be achieved, with companies ranging from Google, SAP and Select Star to dbt and Snowflake reinventing the ways in which data can be stored, accessed and analysed more efficiently. The resulting cloud-based platforms can support numerous data types and feature tools that enable analysis during the ETL process, for example.
The desire for real-time access to data has also given rise to Fast Data which sees data analysed as it is created. Fast Data sees batch processing replaced by event-streaming systems, promising instant insights. But there are, of course, other issues that need to be considered as everything becomes bigger, better, faster, more.
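The difference between batch processing and event streaming can be illustrated with a small Python sketch. It is a toy under stated assumptions: the function names and the sample readings are hypothetical, and a plain list stands in for what would in practice be an event-streaming platform.

```python
from typing import Iterable, Iterator


def batch_mean(values: Iterable[float]) -> float:
    """Batch style: wait for the full dataset, then compute one answer."""
    collected = list(values)
    return sum(collected) / len(collected)


def streaming_means(events: Iterable[float]) -> Iterator[float]:
    """Streaming style: emit an updated mean as each event arrives."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count


# Hypothetical sensor readings, used for both styles.
readings = [10.0, 20.0, 30.0]

print(batch_mean(readings))             # one result once all data is in
print(list(streaming_means(readings)))  # one result per event
```

Both styles end on the same final figure, but the streaming version has an answer available after the first event rather than only once the whole batch has been collected.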
Security and GRC
Aside from the technological issues – such as micro-partitioning, syntax issues or discrepancies, and monitoring for any data errors during the conversion process – there’s also the need to consider security and data governance.
Managing data in compliance with data governance frameworks is a must, and consideration should also be given to how this will be done in the face of changing regulatory requirements. Businesses need to be able to document how their data is used. This used to be a laborious and time-consuming process, but there is now a wide range of solutions for automating every aspect of it. Encapsulating it all is the data strategy, which describes how the business manages its people, policies and culture around its data.
So how do you go about moving from a centralised data architecture to Data Mesh? How can you optimise your Data Mesh? How should you build the team in such a way that data engineers and analysts work together? How do you measure your data maturity and use this to steer future projects? Is Fast Data for you? And how can you ensure you continue to observe security and governance requirements, particularly in a decentralised architecture?
Hear it straight from the experts
Finding the answers to these questions requires access to the best minds in the business. At the Big Data LDN Conference and Exhibition, held on 21-22 September at Olympia in London, you can hear from over 200 expert speakers across 12 technical and business-oriented theatres, all of whom are focused on how to build the dynamic, data-driven enterprise.
Zhamak Dehghani will open the event as the keynote speaker, delivering her session “Rewire for Data Mesh: atomic steps to rewire the sociotechnical backbone of your organisation” at 10am on 21 September. Other experts will be sharing unique stories, peerless expertise and real-world use cases. Starburst, Snowflake, LNER, Deliveroo, Microsoft, Actian, Confluent, Dataiku and Deloitte will be taking a deep dive into topics ranging from Modern Analytics and DataOps to Data Governance and AI & MLOps.
Big Data LDN is free to attend but you’ll need to register. To secure your place, please do sign up via the website at bigdataldn.com.