Data Culture

Updated Aug 04 2020

How to Scale Your Data Team with Confidence

Barr Moses

CEO and Co-founder, Monte Carlo. Proponent of data reliability and action movies.

COVID-19 has forced many companies to tighten their belts and invest in only the most necessary functions — fortunately, data is often one of them. Still, the success of your data team relies on more than their ability to wrangle data and build predictive models.

Over the past decade, data teams have become increasingly important for maintaining a company’s competitive edge, leading to greater innovation and more intelligent decision making. Despite this rapid rise to prominence, the notion of data science as applied to industry wasn’t even a thing until the early 2000s. Data engineering, now an indispensable part of many data-driven technology companies, was incorporated into the lexicon even later.

COVID-19 has made data’s impact even more apparent. Not only has data been critical to curbing the spread of the deadly virus, but companies are increasingly relying on data to better understand changing customer trends and make smarter spending decisions.

Although 50 percent of data analytics organizations have not yet had to adjust their staffing and hiring plans in response to economic effects of the pandemic, startups have been forced to lay off tens to thousands of employees, many of them data analysts, scientists, and engineers. As the industry adjusts to this new normal, it’s important to set your data team up for success.

To help you scale your team with confidence, I put together four simple guidelines for turning your squad into a force multiplier for your entire organization:

Define your team’s core 1–2 responsibilities. Prioritize accordingly.

Many data leaders I talk to feel bogged down by the various responsibilities that fall on their shoulders. The Harvard Business Review recently published a report defining the seven distinct jobs of CDOs, from the “Chief Data and Analytics Officer” and “Data Governor” to “Data Entrepreneur” and “Data Ethicist.” It’s hard enough just doing one job well — imagine seven!

A 2018 McKinsey study likewise suggests that a disconnect between data analytics and ROI arises because teams “struggle to move from employing analytics in a few successful use cases to scaling it across the enterprise, embedding it in organizational culture and everyday decision-making.”

*By setting clear goals for her team rooted in her CEO’s OKRs, this VP of Data was able to unlock true business value with data science. Image courtesy of ThisisEngineeringRAEng on Unsplash.*

Amy Smith, a Senior Data Scientist at Rebel (an international consulting firm focused on sustainable and inclusive transportation) and a former Senior Data Scientist at Uber, suggests that data leaders dig deep when it comes to gauging how data science can benefit their companies.

“Think about your company’s data needs at a high level,” she said. “What do they care about? What do they need to understand better, and what data will give them those insights?”

Like Amy, I suggest using your company’s top-level priorities to determine how to best leverage your team’s skill set. Stick with 1 to 2 primary roles, as determined by your company’s core objectives. If your company is using Q3 2020 to decide which new products to deploy to your users, perhaps your KPI should be tied to generating more timely analytics on customer behavior; if your CEO wants you to launch your services in the EU, a key responsibility should be mapping compliance work to GDPR requirements.

Once these goals have been determined and signed off on by your stakeholders and CEO, it will be easier to justify your team’s growth and spend as long as you remain flexible to the needs of your business. As we witnessed these past several months, anything and everything can change at the drop of a hat (or should I say, a single data point).

Don’t get hung up on titles.

Given their relative novelty, the terms “data scientist” and “data analyst” can mean any number of things depending on your industry, company, or even team, and it’s important to acknowledge this ambiguity.

In fact, according to Annie Tran, Director of Data Science at Figma, the term “data scientist” as it relates to industry really wasn’t a thing until the late 2000s, when LinkedIn and other big tech companies first started hiring them to better understand user behavior on their platforms. Annie spent the first several years of her career as a data analyst at Willis Towers Watson and later Zynga, before joining Uber as the first data analyst embedded in their product organization.

“When I was hired at Uber, my role was as a data analyst, but so was everyone else’s,” she said. “Some of the data scientists working on our Marketplace team were also data analysts. A couple months in, they changed everyone’s title to data scientist.”

What your data team looks like (and what titles you use) will vary depending on the size of your company and the volume of data you’re using. If you’re a small startup, you may hire a few data generalists who, over time, can start specializing in a particular discipline or area. If you’re spinning up a data team at a 500-person advertising company, you may want to start by hiring marketing analytics experts who can hit the ground running with their fancy Marketo dashboards.

My advice: don’t hire for specialist roles until you know what problems you’re truly solving for and how data can be most impactful to your business. Consider bringing in a hungry partial-stack generalist with expertise in either data analytics or data engineering, and training them to specialize as demands evolve. You don’t want to spend money on something you don’t need, only to miss your targets.

Communicate and document data knowledge voraciously.

Several data leaders I’ve spoken to over the years say that their number one hurdle in terms of long-term sustainability of their data pipelines (and the success of their data teams) is the lack of documentation. Too often, teams rely on tribal knowledge and outdated wiki pages to keep tabs on their data, and that’s just not scalable or sustainable.

According to Amy Smith, the best way to ensure that your data team is all on the same page is through knowledge sharing, early and often.

“A lot of a data scientist’s early success is through joining a team that is willing to take the time necessary to write down their knowledge,” she said. “Putting the collective knowledge of a team into a form that someone new can read and get up to speed on is hugely important.”

More specifically, lack of robust information about data and metadata is a major pain point for teams, but it’s something that can be addressed. Some solutions that make these insights easier to access are:

Data catalogs: Smaller teams (2–5 people) may get by with an Excel spreadsheet, but as your data stack matures, consider investing in an in-house, third-party, or even open source solution.
Database management system (DBMS): A DBMS is a software application or package designed to manage data in a database, including the data’s format, field names, record structure, and file structure. While this won’t replace a data catalog in terms of providing context, it will help you keep your data organized for easy access.
Data modeling tools: Data modeling tools give teams the ability to discover and visualize data assets. These products can also help teams understand the relationship between various elements of your data stack.
Data observability solution: Your data knowledge only matters if your data can be trusted. Data observability solutions solve many of the same issues as data catalogs, DBMSs, and data modeling tools, but draw on a newer approach to data management based on software engineering best practices. Data observability refers to an organization’s ability to fully understand the health of the data in its systems, with the goal of eliminating data downtime by applying DevOps observability best practices to data pipelines.
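To make "the health of the data" a little more concrete, here is a minimal sketch of the kind of checks such tooling automates: freshness, volume, and completeness. It is an illustration under stated assumptions, not how any particular product works. The `TableSnapshot` class, the `check_table_health` function, the thresholds, and the `orders` table are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TableSnapshot:
    """Hypothetical summary of one table at a point in time."""
    name: str
    last_loaded_at: datetime
    row_count: int
    null_counts: dict  # column name -> number of NULL values


def check_table_health(
    snapshot,
    now,
    max_staleness=timedelta(hours=24),  # assumed threshold
    min_rows=1,                         # assumed threshold
    max_null_rate=0.05,                 # assumed threshold
):
    """Return a list of readable issues; an empty list means no issue found."""
    issues = []

    # Freshness: was the table loaded recently enough?
    if now - snapshot.last_loaded_at > max_staleness:
        issues.append(
            f"{snapshot.name}: stale, last loaded "
            f"{snapshot.last_loaded_at:%Y-%m-%d %H:%M}"
        )

    # Volume: did the load produce at least the expected number of rows?
    if snapshot.row_count < min_rows:
        issues.append(
            f"{snapshot.name}: only {snapshot.row_count} rows, "
            f"expected at least {min_rows}"
        )

    # Completeness: is any column missing more values than we tolerate?
    for column, nulls in snapshot.null_counts.items():
        rate = nulls / snapshot.row_count if snapshot.row_count else 1.0
        if rate > max_null_rate:
            issues.append(
                f"{snapshot.name}.{column}: null rate {rate:.0%} "
                f"exceeds {max_null_rate:.0%}"
            )

    return issues


# Made-up example data for illustration only.
snapshot = TableSnapshot(
    name="orders",
    last_loaded_at=datetime(2020, 8, 1, 6, 0),
    row_count=1000,
    null_counts={"customer_id": 0, "coupon_code": 120},
)
issues = check_table_health(snapshot, now=datetime(2020, 8, 3, 6, 0))
for issue in issues:
    print(issue)
```

Checks like these are easy to write for one table. The harder part, and the reason teams look for dedicated solutions, is running them consistently across every table and keeping the thresholds meaningful as the data changes.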

In addition, teams that want to take knowledge transfer and accessibility a step further can make a point of filling in missing information and other context as they build out their data operations. To this end, data leaders should encourage their analysts to add missing dimensions to data whenever they notice a gap, not only when required. Just because you’re not using it now doesn’t mean you or a colleague won’t use it later.
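One lightweight way to make missing context visible is to track which pieces of documentation a dataset still lacks. The sketch below assumes a very simple, homegrown catalog record; `CatalogEntry`, `ColumnDoc`, `missing_context`, and the `trips` table are hypothetical names used only for illustration, not part of any specific catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnDoc:
    """Hypothetical documentation record for one column."""
    name: str
    description: str = ""


@dataclass
class CatalogEntry:
    """Hypothetical documentation record for one table."""
    table: str
    owner: str = ""
    description: str = ""
    columns: list = field(default_factory=list)


def missing_context(entry):
    """List the pieces of documentation still missing from a catalog entry."""
    gaps = []
    if not entry.owner.strip():
        gaps.append("owner")
    if not entry.description.strip():
        gaps.append("table description")
    for column in entry.columns:
        if not column.description.strip():
            gaps.append(f"column '{column.name}'")
    return gaps


# Made-up example entry: the table and one column are undocumented.
entry = CatalogEntry(
    table="trips",
    owner="marketplace-analytics",
    columns=[
        ColumnDoc("trip_id", "Unique ID for a completed trip"),
        ColumnDoc("surge_multiplier"),
    ],
)
gaps = missing_context(entry)
print(f"{entry.table} is missing: {', '.join(gaps)}")
```

A report like this gives analysts a concrete list to chip away at whenever they touch a table, rather than waiting for documentation to become a blocker.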

Drive adoption of data for all data users at your company.

Annie Tran notes that many companies, like Figma, have taken a data-first approach to building out their culture, making their data solutions available to everyone at the company and encouraging employees across all functions to explore relevant dashboards.

“At our weekly All Hands meeting, there’s time dedicated to reviewing metrics,” she said. “Data is something that’s prevalent in our culture and that everyone has the ability to see and take in. This openness reinforces the importance of data to our operations.”

However, your company’s data solutions are only useful if you can trust your data. In addition to promoting a data-forward culture, making data accessible, trustworthy, and self-serve is critical to getting newer team members more comfortable with exploring your company’s data ecosystem. In fact, when data is reliable, automation of traditional data science methods (statistical analysis, A/B tests, and model training, to name a few) can unlock new levels of productivity, speed, accessibility, and accuracy that were otherwise unachievable.

According to Eli Brumbaugh, a data design leader at Airbnb and co-creator of the company’s Dataportal data catalog, one of the most important things when it comes to setting data users up for success is ensuring that their data is accurate and relevant.

“Facilitating user confidence that they have found the right information or data to act on is paramount,” he said. “We needed to ensure that the data driving our business decisions could be trusted, and that we knew whether or not that data can be applied to a given use case.”

There is a spectrum of approaches to data management (including data catalogs, data reliability, compliance, security, and more), so it’s important to choose what makes the most sense for your business. Large technology companies like Airbnb, Uber, and Netflix can afford to spend budget and engineering resources building in-house solutions to tackle these problems. On the other side of the spectrum, ad hoc quality checks and Excel spreadsheet trackers are easy to build, but lack the comprehensiveness necessary to truly make an impact.

Scaling a data team is simultaneously fun and challenging. I hope these tips help you not only build confidence in your journey, but also knock it out of the park!
