
4 Data Mesh Principles To Get One Step Closer To Data Nirvana


Michael Segner

Michael writes about data engineering, data quality, and data teams.

It’s common to see organizations move toward a data mesh approach for the same flexibility and agility that microservices offer, but when you’re tasked with designing and building a data mesh, it can be difficult to know where to start. What exactly does a data mesh approach look like?

Just as with microservices, there is no single recipe for building a data mesh. Instead, there are core principles that guide you. Here are four principles, based on Zhamak Dehghani’s original definition, that will help you build a data mesh that’s suitable for your business.


1. Domain-Driven Data Ownership and Architecture

As one of the key tenets of a data mesh, domain-driven architecture is a vital part of changing how stakeholders view data. Historically, organizations have often stored the data they gather in a monolithic data lake that’s difficult to access or analyze without help from specialists.

With a data mesh, data domains instead host and serve their own datasets. This means that individual departments take ownership over how their data is stored and analyzed. The aim here is to make data more easily consumable and, ultimately, empower stakeholders.

On a wider scale, this also means that departments can view and use related data that other departments have collected and managed, without having to dive into a monolithic, difficult-to-navigate data lake.

2. Data as a Product

It’s easy to think of data as something that’s gathered to illuminate the results of certain actions: “we did X and Y happened as a result.” But, in addition to explaining outcomes, data can also inform future decisions and actions: “Y happened, so we should think about doing X.”

In recent years, the idea of treating data as a product has gained momentum (as opposed to treating it as something distinct from the internal or external products used within the company).

But why is the Data as a Product (DaaP) model powerful?

There are several different answers to that question:

  • It elevates data and the data team to first-class citizens, and resources them as such. It respects the ability of data not just to give internal teams a competitive advantage but also to drive revenue through external data products and/or monetization.
  • It raises the bar and standards for data governance, usability, and quality. SaaS products regularly talk about reliability and uptime in terms of the five 9s or 99.999% uptime. Most data dashboards and products experience considerably more data downtime.
  • Just like products have product managers to coordinate across teams, create a forward-looking vision, and prioritize features on a roadmap, so too should data products.
  • The data product paradigm creates a focus on the adoption and business value of distinct data assets. 
  • Data products should empower users with easy self-service (more on that below).
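To put the "five 9s" comparison above into numbers, here is a minimal Python sketch of the arithmetic. The function names and the dashboard figures (8 hours of stale data in a 30-day month) are hypothetical and only there for illustration; the one fixed input is the 99.999% target mentioned above, which works out to about 5.26 minutes of downtime per year.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability: float, period_days: float = 365.0) -> float:
    """Downtime budget, in minutes, for a given availability over a period."""
    return (1.0 - availability) * period_days * MINUTES_PER_DAY


def availability_from_downtime(downtime_hours: float, period_days: float = 30.0) -> float:
    """Availability implied by a number of hours of downtime in a period."""
    return 1.0 - downtime_hours / (period_days * 24)


# The "five 9s" target from the text: 99.999% uptime over a 365-day year.
five_nines_budget = allowed_downtime_minutes(0.99999)
print(f"Five 9s allows about {five_nines_budget:.2f} minutes of downtime per year")

# Hypothetical dashboard: 8 hours of stale or broken data in a 30-day month.
dashboard_availability = availability_from_downtime(downtime_hours=8, period_days=30)
print(f"Hypothetical dashboard availability: {dashboard_availability:.3%}")
```

Running the same calculation against your own incident history is one simple way to check how far a data product is from the standard a SaaS product would be held to.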

3. Self-Serve Data Platform

One key outcome of successful data mesh use is that data becomes “self-service.” As a result, employees who would usually need to wait on a central data team for days can take action themselves instead.

With smart data integrations and other tools, individuals can more easily measure and analyze their data in real time. And when different departments can dive into the data they need without having to wait around, they can work in a more agile way.

The group head of data engineering at the Sanne Group explained in a blog post that he is moving his team toward a data mesh partly to accelerate time to value via self-service.

“The other big driver is speed. When you have a centralized data team structure, the journey from an analyst’s question to an answer might travel through seven distinct steps across different members of the team each of whom have a SLA of 24 to 48 hours. That adds up!

Self-service is an incredibly daunting and fraught objective. Many of the modern data infrastructure providers describe a central platform where all data lands from all different sources. But then how do you build an interface, how do you use an analytical tool, to serve all use cases for all audiences across all domains? 

I think the reality, at least for us in the short-term, is that you have to look at it by use case and bring the data to the user rather than bring the user to the data. What I mean by this is that each domain has their own set of tools they are comfortable using to access and analyze data.”

A good data mesh implementation plan can increase velocity by condensing steps and avoiding stories like the one above. Image courtesy of Monte Carlo.
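The "that adds up" in the quote above is easy to check. Here is a rough Python sketch, assuming the seven steps happen strictly one after another and that each one takes its full SLA window (both are simplifying assumptions, and the helper name is made up for this example):

```python
STEPS = 7                 # "seven distinct steps" from the quote
SLA_HOURS_LOW = 24        # lower end of the quoted per-step SLA
SLA_HOURS_HIGH = 48       # upper end of the quoted per-step SLA


def turnaround_days(steps: int, sla_hours_per_step: float) -> float:
    """Total elapsed days if every step is sequential and uses its full SLA."""
    return steps * sla_hours_per_step / 24


low_days = turnaround_days(STEPS, SLA_HOURS_LOW)
high_days = turnaround_days(STEPS, SLA_HOURS_HIGH)
print(f"Question to answer: {low_days:.0f} to {high_days:.0f} days")
```

Under those assumptions, a single question takes one to two weeks to answer, which is the delay self-service aims to cut.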

4. Federated Governance

As powerful as the concepts above might be, they’re not without their hurdles: encouraging domain-driven ownership carries a risk of creating data silos that don’t work well together. To foster compatibility, a set of centrally governed standards is needed.

In the term federated governance, federation is closely related to the word federal as it might be used in “federal state” or “federal government.” Thinking about the relationship between federal and state law can be a helpful analogy here; although the latter can differ considerably, the former aims to bring different entities together under a single umbrella.

A successful federated data governance program lays out various criteria that must be met, based on the needs of the organization as a whole, but still gives individuals within the organization some flexibility as to how they meet those requirements.

This is useful because certain departments might, for example, need to implement security protocols in a very specific way depending on whether the data they use is accessed internally or externally (or both), by APIs, and so on. 
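One way to picture this split between central criteria and local flexibility is a small Python sketch. Everything in it is hypothetical: the required fields, the classification labels, the `DataProduct` class, and the two example domains are invented for illustration and are not a standard or any vendor’s API. The central team defines what every data product must declare, and each domain decides how it fills those requirements in.

```python
from dataclasses import dataclass, field

# Hypothetical organization-wide criteria set by the central governance group.
REQUIRED_FIELDS = {"owner", "classification", "access_method"}
ALLOWED_CLASSIFICATIONS = {"public", "internal", "restricted"}


@dataclass
class DataProduct:
    name: str
    domain: str
    metadata: dict = field(default_factory=dict)


def governance_violations(product: DataProduct) -> list[str]:
    """Return the central criteria this data product does not yet meet."""
    violations = []
    for missing in sorted(REQUIRED_FIELDS - set(product.metadata)):
        violations.append(f"missing required field: {missing}")
    classification = product.metadata.get("classification")
    if classification is not None and classification not in ALLOWED_CLASSIFICATIONS:
        violations.append(f"unknown classification: {classification}")
    return violations


# Two domains meet the same criteria in different ways (both are made-up examples).
finance = DataProduct(
    name="invoices",
    domain="finance",
    metadata={"owner": "finance-data", "classification": "restricted", "access_method": "api"},
)
marketing = DataProduct(
    name="campaign_results",
    domain="marketing",
    metadata={"classification": "internal", "access_method": "dashboard"},
)

for product in (finance, marketing):
    print(product.name, governance_violations(product))
```

The check only says what must be present. How a domain exposes its data (an API, a dashboard, or something else) stays the domain’s decision.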

Conclusion

The data mesh is less a rigid architecture imposed on data management practices than an approach to them. Unlike a service mesh, which requires an infrastructure layer for inter-service communication, a data mesh leaves you flexibility in how you implement its principles.

In fact, one of the biggest hurdles when it comes to implementing data mesh principles is obtaining organizational buy-in; going from a monolithic data lake that operates behind the scenes for most stakeholders to something that they’re actively involved in is a big change.

However, as we’ve seen above, these principles all offer their own distinct advantages:

  • Domain ownership → Greater control over relevant data
  • Data as a product → Opportunity to make better-informed business decisions
  • Self-serve data platform → Empowers employees to conduct their own analysis
  • Federated governance → Flexibility around how to meet compliance requirements

If you can explain these advantages in a meaningful way, you stand a much better chance of getting organizational buy-in and successfully implementing a data mesh approach!

Did we miss anything? Feel free to leave your thoughts in the comments below.

