How to Build a Culture of Data Trust: A Conversation with Hilary Mason
As more companies invest in more data tools, initiatives, and teams, the appetite to become a “data-driven organization” continues to grow. But if stakeholders, consumers, and leaders across the company don’t trust that the data flowing through your pipelines and populating your products is useful and reliable, all that investment is for naught.
So how can a team build a culture of data trust—especially within a complex environment?
To gain an expert’s perspective, we recently sat down with Hilary Mason—a data leader with an impressive and diverse range of experiences when it comes to building wildly impactful data organizations. Hilary has served as Chief Scientist at bitly and GM of Machine Learning at Cloudera, founded the ML research company Fast Forward Labs, and is the current Data Scientist in Residence at Accel. She’s had the opportunity to advise businesses across industries, from scrappy startups to Fortune 500 enterprises, on data strategy.
And Hilary doesn’t shy away from sharing controversial takes on how companies are using—or misusing—data within their organizations. We loved learning from her straightforward talk rooted in deep experience, so read on for her top takeaways about how forward-thinking teams are building more trustworthy data cultures across three pillars: people, process, and technology.
But first: defining data culture
Hilary defines culture as “what people do when they aren’t being told what to do.” Company culture is made up of expectations, habits, norms, and practices. For example, you can get a glimpse of a company’s meeting culture based on a) whether people tend to show up on time or five minutes late; b) if they do other work on their laptops or pay rapt attention; and c) how (or if) they make small talk or dive right into business.
Data culture works the same way: it’s the set of norms and expectations people apply to data. And in Hilary’s experience, building an organization-wide data culture is the foundation of data excellence.
“Data culture isn’t just about one team building a product that’s consumed by everyone else,” says Hilary. “It’s about changing and evolving how everyone, across teams, thinks about using data and making decisions. And a strong data culture is a significant accelerant for success.”
What does a good data culture look like? Hilary points to a few common themes: making consistently data-informed decisions, using data to find new opportunities for growth, and widespread knowledge sharing of data artifacts across the organization—so people are making decisions based on the same data sets.
And, according to Hilary, building this kind of positive, trusting data culture all starts with your people.
Pillar #1: People are essential to building data trust
Setting your organization up for success begins with assembling the right data team. And according to Hilary, you don’t need every member to have a Ph.D. in statistics. To hire the right mix of data engineers, analysts, data scientists, and data product managers, think about the blend of expertise across the entire team, not the credentials of any single hire.
How to hire a quality data team
When she thinks about finding a qualified data hire, Hilary considers their soft and hard skill sets across a few parameters:
- Math and statistics: They should have enough fluency to model things well
- Technical skills: They should be able to get to where the data is and figure things out with it, whether that’s SQL, Excel, or Python
- Communication: They should be able to sit down with a business decision-maker, understand the problem they’re trying to solve, go away and do the analysis, and come back to convey their findings in a way that helps the decision-maker actually make a decision
With these core competencies in place, Hilary believes you can build a stellar data team. And this approach also means that recruiters can open their searches to people coming out of academia, operations research, or other disciplines—not restricting applicants to direct data science or analytics experience alone.
Don’t forget, Hilary says, to expand your focus beyond highly specialized individual contributors alone. She points to data product managers, designers, and leaders, who serve as translators and bridges across data disciplines and business units, as the unsung heroes of data teams.
How to structure your data team
And what about the (relatively) age-old question: to centralize or not to centralize your data team?
It depends, says Hilary. While a centralized data team has its advantages—data professionals can work together, bond together, and share their skills with one another—it can introduce friction for other teams.
Alternatively, you can have data specialists scattered throughout the organization, but then you run the risk of replicating work, using inconsistent processes, and juggling different tooling or vendor preferences.
“Whether folks sit with the team they’re serving or sit with a centralized team, it all comes down to where you want to introduce friction and where you want to ease that friction,” Hilary says. “And it often changes over time. I used to think that if a company has to change the structure of its data team every two to three years, they were doing something wrong, but now I don’t think so. Evolving over time isn’t wrong.”
Expecting data fluency across the organization
Additionally, building a data culture also means improving data literacy for everyone in the organization, not just data teams. Hilary believes that data fluency will soon be table stakes for all employees in the modern workplace. “In the same way we now expect that every professional is fluent in email, we should expect that everyone becomes fluent in how to use data. They should know how to look at a graph and draw their own conclusions, and how to ask good questions about data.”
With the right people in the right places (at least for the current moment), organizations can turn to creating the right processes.
Pillar #2: Put the right data processes in place (spoiler alert: agile isn’t the answer)
Process, according to Hilary, is simply etiquette—the rules people should follow to get to a good outcome.
For example, treating data as a product is an approach that works in many ways. Applying standards of rigor like data SLAs, hiring data product managers, considering scalability, and investing in self-service tooling are all worthwhile ventures that can build trust and improve data-driven decision-making across an organization.
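Standards of rigor like data SLAs can be surprisingly lightweight to start with. As a rough illustration (the dataset names, thresholds, and timestamps below are hypothetical, not from the conversation), a minimal freshness-SLA check might look like:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness SLAs: the maximum allowed age for each dataset.
SLAS = {
    "orders": timedelta(hours=1),
    "daily_revenue": timedelta(hours=26),
}

def check_freshness(last_updated: dict, now: datetime) -> list:
    """Return the names of datasets whose last update breaches their SLA."""
    breaches = []
    for name, max_age in SLAS.items():
        updated = last_updated.get(name)
        # A dataset with no recorded update is treated as a breach too.
        if updated is None or now - updated > max_age:
            breaches.append(name)
    return breaches

now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_updated = {
    "orders": now - timedelta(minutes=30),       # within its 1-hour SLA
    "daily_revenue": now - timedelta(hours=30),  # stale: past the 26-hour SLA
}
print(check_freshness(last_updated, now))  # ['daily_revenue']
```

The value of a check like this is less the code than the contract it makes explicit: consumers know exactly how fresh each dataset is promised to be, and breaches become visible rather than discovered downstream.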
But, Hilary cautions, companies shouldn’t go too far in modeling their data processes after software development processes. “This may be my most controversial statement of the day: agile software development is where data and data products go to die.”
That’s because data work is an inherently iterative process: asking a question, learning something, asking another question, and trying to reach an understanding that can be used to make a decision or build a product. Software engineering, by contrast, Hilary says, typically starts with the knowledge of what you’re going to build, followed by iterating your way there through development.
“It’s unbalanced,” Hilary says. “And software development doesn’t leave space for the exploration that data teams need.”
Instead, teams should start with a lightweight process that guides data teams through answering high-level questions. Hilary’s recommended MVP data process includes the following:
- What problem are we trying to solve?
State your problem in plain language so that anyone who reads it can understand.
- How do we know when we’ve won?
Describe the quantitative metrics you’ll use to understand success and define what done looks like. This mitigates the risk of going down a rabbit hole by trying to address a question with data past the point where it’s worth doing so.
- What’s the first thing we’ll do with this?
Define the business justification for this work, and how it will be used. Ideally, you can build upon the first use case with additional ways to leverage the analysis or data product down the road.
- What are the downstream considerations for productionalizing this?
Understand what the final form will be, who will be responsible for taking it as a handoff, and what their concerns may be about maintaining it. This helps prevent instances of building and handing off a prototype, only to have it die.
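These four questions are lightweight enough to encode as a simple project brief. A hypothetical sketch (the field names and example answers are mine, not Hilary’s):

```python
from dataclasses import dataclass

@dataclass
class DataProjectBrief:
    """Lightweight brief answering the four MVP-process questions above."""
    problem: str           # What problem are we trying to solve? (plain language)
    success_metrics: list  # How do we know when we've won?
    first_use: str         # What's the first thing we'll do with this?
    production_owner: str  # Who takes the handoff and maintains it downstream?

    def is_complete(self) -> bool:
        # A brief is ready for kickoff only when every question has an answer.
        return all([self.problem, self.success_metrics,
                    self.first_use, self.production_owner])

brief = DataProjectBrief(
    problem="Support tickets are triaged by hand and queues back up.",
    success_metrics=["median time-to-first-response under 2 hours"],
    first_use="Auto-route the three most common ticket categories.",
    production_owner="support-platform team",
)
print(brief.is_complete())  # True
```

The point of the structure is the forcing function: a project that can’t fill in all four fields, especially the production owner, probably isn’t ready to start.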
Pillar #3: Invest in the right tech for the jobs at hand
There are always new tools emerging that can distract teams with shiny features and lofty promises. But there are also legitimately exciting new technologies that teams should pay attention to and take advantage of sooner rather than later.
The key to knowing what tech to invest in, says Hilary, is understanding what signals to look for that indicate meaningful impact. These include:
- Areas where active research is happening (like solutions that help interpret and explain black-box models)
- Areas seeing an active change in economics (like tools that make big data or ML cheaper, more accessible, or more reliable)
- Areas where new data has become publicly available, or synthetic data becomes a possibility
These guideposts will help teams steer away from the fanciest new tools and towards the technologies that have a good chance of making a meaningful impact on the business.
Takeaway: Learn how to prioritize data projects that will be trusted
If you want to build a culture of data trust, you need to provide trustworthy and useful data to your organization. And that’s easier said than done—because by nature, the iterative and exploratory process of building data products means not everything will be a runaway success.
Based on her experience consulting with companies large and small, Hilary has developed some frameworks that teams can use to evaluate and prioritize data projects that are most likely to be trusted and used within an organization.
Second guess the two-dimensional scatterplot
Hilary says every company she works with tends to have a scatterplot that plots the projected value of a data project on one axis against the cost to implement on the other. But that’s not usually the best way to go forward.
“It can make sense to invest in the projects that have a lot of value and don’t cost much to implement,” Hilary says. “But when you dig into them, the values aren’t consistently or accurately estimated. And you usually have to do exploratory work to understand if it’s even possible to assess the question you’re proposing to answer.”
Invest in a portfolio of data projects that build on one another
Rather, Hilary advises companies to think of the data products they could invest in as a portfolio. In this model, you’d expect some to work better than you predict, and others to completely fail. (Just make sure you’re prepared to assess and bail out of the failures as quickly as possible.)
If you do this well, you can stand on the shoulders of the products that do succeed in future projects. By building on your learnings, you’re gaining the most possible value from your data explorations.
“From a culture point of view, this is about having some way of understanding the portfolio of data projects happening across an organization so that progress on one can create progress on others,” Hilary said, “rather than having every little bit of the organization doing their own thing naively, without even knowing what other people are doing. And that’s the key to getting this right in a complex organization.”
Accurately estimate the cost of implementation
This portfolio-building approach also involves understanding the feasibility and cost to implement a new project. Hilary advises teams to consider the following:
- What data assets are available?
- What’s the product opportunity here—what might we do with it?
- What are our technical capabilities?
- What’s the possible impact?
- What are the downstream considerations—who’s going to own it, how does it get maintained, and how does it grow in the future?
- Could a human accomplish this work? And could a group of humans do it reliably, with inter-rater reliability?
- Is there a feedback loop here? How will we know if we were right, and how will we measure improvement over time?
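One way to operationalize a checklist like this is as a simple go/no-go gate. This is a sketch of my own framing (treating each consideration as a yes/no gate is a simplification, and the field names are hypothetical), not a prescription from the conversation:

```python
# Feasibility considerations from the checklist above, expressed as
# yes/no gates a team answers before committing to a project.
FEASIBILITY_QUESTIONS = [
    "data_assets_available",
    "clear_product_opportunity",
    "technical_capability",
    "meaningful_impact",
    "downstream_owner_identified",
    "human_baseline_exists",     # could a group of humans do this reliably?
    "feedback_loop_defined",     # how will we know if we were right?
]

def unanswered(answers: dict) -> list:
    """Return the checklist questions still missing a 'yes'."""
    return [q for q in FEASIBILITY_QUESTIONS if not answers.get(q, False)]

answers = {q: True for q in FEASIBILITY_QUESTIONS}
answers["feedback_loop_defined"] = False
print(unanswered(answers))  # ['feedback_loop_defined']
```

Even in this crude form, the gate makes gaps explicit before any build starts, which is where the real cost estimates tend to go wrong.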
If you’re new to an organization, Hilary says, one way to identify good opportunities for impactful data projects is to think about how you can use data to support an existing process. Find something expensive or resource-heavy that could easily be data-informed and start with that low-hanging fruit. If you start in an area where people are already doing work and you make it cheaper and easier, then you make them into champions and you win over skeptics. Then, invest in scaling that work to create new applications and products once you have buy-in from other teams.
Takeaway: Democratize data—even though it will go wrong
Finally, Hilary believes that democratizing data is essential to building a culture of data trust. Even though it won’t go as planned.
“People will take the data and do bad statistics and draw erroneous conclusions and use it to justify the thing they were going to do anyway,” says Hilary. “But NOT democratizing the data is worse, because then the alternative is that people will not use data at all.”
This is because too much friction prevents people from using data. “If they have to call someone else and get them to do work on their behalf, it creates so much friction that you might as well not have the data in the first place.” It’s better, Hilary says, to allow people to make mistakes and have support than it is to try and hoard the data in one centralized place.
And remember, Hilary says, that even if tech and tooling are evolving quickly, the way people think about working together won’t shift overnight. Changing your culture takes time—but it’s well worth the effort.
Thanks to Hilary for sharing her hard-won expertise with us! Don’t miss out on future talks with data industry leaders: follow us on LinkedIn to stay up-to-date on our webinars, events, and conversations.