Data Mesh Architecture: Roles and Strategies

Preeti Hemant
Mar 27, 2022

This post is a follow-up to my previous post on data architecture models. Here, I explore the roles and strategies required for a successful implementation of the Data Mesh paradigm.

Data Mesh is not just about technology and tools; people and culture matter equally, if not more. Because it is a socio-technical construct, its success depends in part on the roles within it. Alongside the traditional roles in a data setup (Data Analysts, Data Scientists), Data Mesh introduces some new roles and repurposes some existing ones.

Without further ado, let’s take a look at what these roles are.

Data Product Owner (DPO)

Data Mesh treats data as a product and rightly so — like any product, data is in the service of its customers. It follows that data should be held to the same exacting standards as any other product would be. Just as a product owner is responsible for this in the world of consumables, a DPO does the same in the world of data.

A DPO would be embedded in every domain to bring the rigor of product management and user experience to its data. Someone in this role understands the use cases their domain can serve, turns these into requirements and applies them to the domain datasets. A DPO also understands the unbounded nature of data use cases: data can be combined with other data in numerous ways, some of them unforeseen.

The Data Product Owner is responsible for providing measurable characteristics that demonstrate the value and impact of the datasets. Drawing on Data Mesh, these characteristics describe how discoverable, understandable, addressable, secure, interoperable, trustworthy, natively accessible and valuable on their own the domain datasets are.

Platform engineers

A platform that hosts data and the underlying infrastructure is a data platform. Platform Engineers design and build this infrastructure.

In a data mesh, one of the objectives of the platform is to enable domains to build and share data autonomously. The platform allows domains to manage their data end to end, usually through self-serve platform APIs. Unlike the decentralized aspects of Data Mesh, such as data ownership, the platform is centrally managed. It is domain agnostic: it abstracts away the provisioning and management of the underlying infrastructure. The platform also automates tasks wherever possible (e.g. automating data integrity tests).
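To make the idea of automated data integrity tests a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the names IntegrityCheck and run_checks, the sample orders rows and the two rules are all hypothetical and do not come from any particular platform.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class IntegrityCheck:
    """A named rule that every row of a dataset is expected to satisfy (hypothetical)."""
    name: str
    predicate: Callable[[Row], bool]


def run_checks(rows: list[Row], checks: list[IntegrityCheck]) -> dict[str, int]:
    """Return, for each check, the number of rows that fail it."""
    return {
        check.name: sum(1 for row in rows if not check.predicate(row))
        for check in checks
    }


# Hypothetical sample data for a domain dataset.
orders = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -3.0},
    {"order_id": None, "amount": 10.0},
]

checks = [
    IntegrityCheck("order_id_present", lambda r: r.get("order_id") is not None),
    IntegrityCheck("amount_non_negative", lambda r: r.get("amount", 0) >= 0),
]

failures = run_checks(orders, checks)
print(failures)  # {'order_id_present': 1, 'amount_non_negative': 1}
```

In this sketch a domain team declares the rules, while the platform would own the machinery that runs them on a schedule and reports the failures.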

Design and implementation of this platform sits at the intersection of software engineering and data engineering, which makes it best suited to the Data Engineer (Platform) archetype.

Data Product Developer

A data product developer creates, delivers, maintains and evolves domain-specific data that is modelled as a product.

This individual works closely with a DPO to understand user expectations and ensure interoperability with other data products on the mesh. There is close collaboration between data product developers and application developers.

Data product developers own the code, logic and schema of their domain data. They provision the resources through platform APIs.
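As a rough illustration of what owning the schema and provisioning through a platform API could look like, here is a small Python sketch. Everything in it is an assumption of mine: DataProductSpec, InMemoryPlatform and the provision method are hypothetical names, and the in-memory registry only stands in for whatever a real self-serve platform would do.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProductSpec:
    """Hypothetical descriptor owned by the data product developer."""
    domain: str
    name: str
    owner: str
    schema: dict[str, str]  # column name -> type name

    @property
    def address(self) -> str:
        # A stable, unique address for the data product on the mesh.
        return f"{self.domain}.{self.name}"


@dataclass
class InMemoryPlatform:
    """Hypothetical stand-in for a self-serve platform API."""
    registry: dict[str, DataProductSpec] = field(default_factory=dict)

    def provision(self, spec: DataProductSpec) -> str:
        """Register a data product and return its address."""
        if not spec.schema:
            raise ValueError("a data product must declare a schema")
        if spec.address in self.registry:
            raise ValueError(f"{spec.address} is already provisioned")
        self.registry[spec.address] = spec
        return spec.address


platform = InMemoryPlatform()
spec = DataProductSpec(
    domain="sales",
    name="orders",
    owner="sales-team@example.com",
    schema={"order_id": "int", "amount": "float"},
)
address = platform.provision(spec)
print(address)  # sales.orders
```

The point of the sketch is the division of labour: the developer declares what the data product is, and the platform decides how the underlying resources are created.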

In the landscape of software roles, this is an Application developer with knowledge of data and eventing systems.

Data Governance committee

After infrastructure, data governance is the other centralized function in a Data Mesh model. Although centralized, the data governance group is a cross-functional team consisting of platform engineers, product owners, and security, legal and compliance experts. Contrast this with traditional governance teams, which consist of governance experts but lack representation from the different domains.

The data governance team creates policies and standards for how data should be managed and served. It ensures that the data products are secure and trustworthy.

As part of this committee, data product owners define the policies that will govern data products from different domains. Platform engineers automate the implementation of these policies, e.g. access control for sensitive data, and encryption or abstraction of PII (Personally Identifiable Information). Governance (security, legal and compliance) experts keep track of privacy regulations and of the security and compliance of data.
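One way such a policy might be automated is sketched below in Python. This is an assumption-laden example, not a reference implementation: the PII_COLUMNS set, the privacy_officer role and the function names are hypothetical, and the salted hash is a simple form of pseudonymization that I have swapped in for illustration. It is not encryption, and a real platform would manage the salt or keys in a secret store.

```python
import hashlib
from typing import Any

# Hypothetical policy agreed by the governance committee.
PII_COLUMNS = {"email", "phone"}
PRIVILEGED_ROLES = {"privacy_officer"}


def mask_value(value: Any, salt: str) -> str:
    """Replace a value with a short, salted SHA-256 digest."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:12]


def apply_pii_policy(row: dict[str, Any], role: str, salt: str = "demo-salt") -> dict[str, Any]:
    """Return a copy of the row with PII columns masked for non-privileged roles."""
    if role in PRIVILEGED_ROLES:
        return dict(row)
    return {
        column: mask_value(value, salt) if column in PII_COLUMNS else value
        for column, value in row.items()
    }


customer = {"order_id": 7, "email": "jane@example.com", "phone": "555-0100"}
masked = apply_pii_policy(customer, role="analyst")
unmasked = apply_pii_policy(customer, role="privacy_officer")
print(masked["order_id"], masked["email"] != customer["email"])
```

Because the same input and salt always produce the same digest, masked columns can still be used for joins and counts across data products without exposing the raw values.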

Possible Strategies

Organizations in the early stages of data maturity

In these orgs, data is only beginning to be used for decision making. Early data setups have the distinct advantage of nearly zero tech debt. These orgs can benefit immensely from a data platform that is designed in keeping with the long-term vision.

If you are in this stage, platform engineers should be your first hire. In collaboration with the domain teams, platform engineers can start building the various blocks of the data platform.

A Data product owner for multiple domains would be the next ideal hire — to set up common product management practices across domains. While a data mesh model recommends a DPO for each domain, in a “budding” data setup, the first DPO can play the role of a product management architect and work across domains.

Organizations where data is more than a “nice-to-have”

To these orgs, data is an important consideration: it informs many business and product decisions and the roadmaps. Most times, these organizations have implemented the data fabric model. There is a delineation in services, so teams naturally fall into separate domains. The org likely has product managers who work with cross-functional teams, and data engineers who are building data infrastructure. Setting up a data mesh model is relatively easy in these organizations, both from a technical and a role creation perspective.

The Data Product Owner role can be created by repurposing Product Manager roles. Likewise, the platform engineer role can be a data engineer role redefined to suit the Data Mesh context.

Organizations with long established data practices

Here, Data is integral to product and business strategy. These organizations have long left behind the exploratory phase of experimenting with tools and processes. They have figured out what works and are in the phase of making incremental improvements (through data).

Implementing data mesh in these orgs may be a difficult endeavour and in some ways may even be unnecessary. Nevertheless, a role like data product owner can still add value. A DPO could help with incremental improvements in the utility of data for the end users. They could help unify practices and processes across domains. The data governance team could be diversified by the inclusion of domain and product representation.

In conclusion

As you can see, there is no one-size-fits-all Data Mesh strategy. The roles, the design and the implementation have to be tailored to the individual scenario, while keeping the principles of Data Mesh in mind.

Further reading

  1. https://learning.oreilly.com/library/view/data-mesh/9781492092384/
