7 Steps for Building a Successful Data Team at Your Startup

When you’re the first data hire at a startup, the sky’s the limit—and that can be incredibly overwhelming.

Who do you hire first? What tools should you invest in? What KPIs should you measure? And much more.

No matter how you cut it, you don’t have an instruction manual, and given how fast the data landscape is evolving, it’s hard to find (let alone follow) best practices for building a data team from scratch. 

Over the past few decades, I’ve built or led data teams at enterprise organizations and startups across industries, from healthcare and cutting-edge SaaS, to geospatial analytics, and back again. The business needs, budget, and tech stack may vary from company to company, but one thing remains the same: if you don’t have the right tools, processes, and people in place, your team can’t succeed.

At the end of the day, our job is to make data as accessible and easy-to-understand as possible. It doesn’t matter how much we collect or how “modern” our tech stack is if we can’t operationalize it. 

Here are seven critical recommendations on how to start building a successful data function from day one: 

Step 1. Assess your company’s infrastructure and analytics intelligence

When you join a new company as the first data hire, start by assessing two things: the infrastructure and the analytics intelligence.

When looking at early data platform infrastructure, you’re trying to determine the logical separation between your operational database and your analytics database (if you have one). 

Ask questions like:

  • What can data support with the current setup? 
  • Do you just have an operational database?
  • Is data flowing somewhere that you can manage it from an analytics perspective?

If you find that your new organization doesn’t have that kind of infrastructure in place, you want to start by collecting data and making sure that it is instrumented correctly to support your use cases. 

Your first investment of time should go toward standing up a stack that lets you start collecting data for analytics and reporting. After all, you can always go back and report, but you cannot go back and collect.

When assessing analytics intelligence, I rank companies on a scale of 1 to 10. A company ranked at 1 has little knowledge of analytics, while a 10 knows everything: its people understand behavioral and financial analytics, and they think in three-dimensional analytics rather than flat analytics or counts. In flat analytics, one plus one equals two, but in behavioral analytics, one plus one might equal something else.

Understanding infrastructure and analytics intelligence determines how quickly you can move on the key initiatives you want to drive, and the process by which you can move. Getting this assessment started often takes a week to ten days, maybe two weeks at most. The goal should always be to drive business value quickly.

Step 2. Start shaping stakeholder expectations

When an organization isn’t used to working with a data team, it can be an uphill battle to get stakeholders to see you as more than just a provider of lists or a fielder of ad-hoc queries. Here are a few strategies for how to set those expectations and get business leaders to become better data partners.

Start by fielding ad-hoc queries

I’m the first to admit that sometimes you can’t get around fielding ad-hoc queries, especially when you’re just building out a data team. 

For the first six months, when I start hiring people, I try to be upfront with everyone interviewing for my team: as a data analyst or engineer, you have to be someone the rest of the business can turn to when it needs a problem solved. But in the short term, because every startup is so new to data, you’re going to have to spin up more lists and field more ad-hoc queries than you would like. And that’s OK: as the company matures and analytics becomes more self-serve, it won’t always be that way.

Always give them more

I also give my team a mandate: Don’t ever give stakeholders only what they ask for. Always give them more.

When I am building a data team, I tell my analysts and engineers that when they receive an ad-hoc request, they shouldn’t just hand over what was asked for. Find out why the stakeholder needs the data, give them what they asked for, and add one additional piece of information that you think could solve their problem better. Over time, stakeholders will see you as more than just a giver of lists.

Overwhelm your stakeholders with data

I also use what I call “the overwhelm method” to train stakeholders to only ask questions that matter. In short, I overwhelm teams with so much data that they have to think strategically about what they actually need. This lowers the volume and raises the quality of their questions, which gives my team more time to build core functionality.

It also means my team doesn’t have to boil the ocean or spend time spinning up a report that won’t move the needle.

Become more than a list-giver

As previously mentioned, stakeholders often first view analytics teams as list-givers. And that’s OK – as long as this definition of what “analytics” means evolves over time.

As you develop trust and a relationship with them, people will start to ask about the problem they’re trying to solve and trust that you’re going to give them what they need to solve it. Actions speak louder than words, and analytics is no exception. Pulling 10 great reports is an infinitely stronger way to build cross-functional trust than sharing a thoughtful strategy document that barely sees the light of day.

In my eyes, data teams have a clear purpose. We are not here to answer questions. We give people data to help them ask better questions, because data never stops, and the second you think you have the answer, you’re wrong. At the end of the day, the goal of a data team is to empower stakeholders to make faster, better decisions.

Step 3. Build out your early infrastructure

Next, I encourage team leaders to think about how they’re going to build out their existing tech stack to better support the needs of the analytics and data engineering teams. 

When it comes to acquiring tools and building up early infrastructure, I recommend four key steps to make it possible for your team to start doing valuable work. 

1. Instrument & collect data in an analytics database

The first step is to get all the data from your operational platform into a data warehouse, like Snowflake. It is also good to make sure that the systems are set up to capture consistent data in the right formats. 

2. Map your databases and create usable datasets

Once your data warehouse is built out, you can quickly start mapping those databases so you know how to turn the raw data into datasets that can actually be used. If you have to join six tables to get to a date, the data is still far too operational, and your ability to scale quickly is reduced.

3. Use automation to make data accessible

Next, implement an automation strategy through dbt or some other transformation tool that takes the scripts you wrote and puts them in a nightly run. All of a sudden, you have nice little datasets that you can query quickly. In other words, you’re generating data that’s accessible.
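To make the mapping and automation steps concrete, here is a minimal sketch of a rebuildable "flat" dataset. It is an illustration under stated assumptions, not a dbt project: Python's built-in sqlite3 stands in for the warehouse, and the tables (customers, orders, order_events, analytics_orders) and the function name are hypothetical. In practice the same SELECT could live in a dbt model or any transformation tool, and a scheduler would call the rebuild each night.

```python
import sqlite3

# Hypothetical flattening query: the joins are done once here, so analysts
# can read the order date from a single table afterwards.
FLATTEN_SQL = """
CREATE TABLE analytics_orders AS
SELECT o.id          AS order_id,
       c.name        AS customer_name,
       e.occurred_on AS placed_on
FROM orders o
JOIN customers c    ON c.id = o.customer_id
JOIN order_events e ON e.order_id = o.id AND e.kind = 'placed'
"""


def rebuild_analytics_orders(conn: sqlite3.Connection) -> None:
    """Drop and recreate the flat dataset; safe to run on every schedule tick."""
    with conn:
        conn.execute("DROP TABLE IF EXISTS analytics_orders")
        conn.execute(FLATTEN_SQL)


# Tiny in-memory stand-in for the operational tables (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE order_events (order_id INTEGER, kind TEXT, occurred_on TEXT);
INSERT INTO customers VALUES (1, 'Acme');
INSERT INTO orders VALUES (10, 1);
INSERT INTO order_events VALUES (10, 'placed', '2024-01-05');
INSERT INTO order_events VALUES (10, 'shipped', '2024-01-07');
""")

rebuild_analytics_orders(conn)
rows = conn.execute(
    "SELECT order_id, customer_name, placed_on FROM analytics_orders"
).fetchall()
```

After the rebuild, getting to a date is a single-table query instead of a multi-table join, which is the point of the mapping step.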

4. Work towards business intelligence while managing ad-hoc reports

Once your data is in those analytics databases, there are a number of things you can put into place, but eventually you want to get all the way from ingestion to business intelligence, a process that can take anywhere from several months to multiple years.

To expedite this process, I suggest using a workflow automation solution, like make.com, that ramps up your ability to deliver data quickly while you scale out larger pipelines. It basically lets analysts act as data engineers by taking a script, running it through Snowflake, and piping the result through a workflow manager that kicks data out to Slack, email, Google Drive, or wherever it needs to go. This means you don’t have to stop and write ad-hoc reports constantly, which can halve the amount of time it takes to achieve actual business intelligence.
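The pattern described above (run a saved script, then push the result to wherever it needs to go) can be sketched in a few lines. This is only a rough illustration: sqlite3 stands in for Snowflake, the signups table is made up, and the "sinks" are plain Python callables rather than real Slack, email, or Google Drive integrations, which a workflow tool would normally provide.

```python
import csv
import io
import sqlite3
from typing import Callable, Iterable


def query_to_csv(conn: sqlite3.Connection, sql: str) -> str:
    """Run a saved query and render the result as CSV text with a header row."""
    cursor = conn.execute(sql)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


def deliver(report_csv: str, sinks: Iterable[Callable[[str], object]]) -> None:
    """Hand the same report to every destination (hypothetical sink callables)."""
    for sink in sinks:
        sink(report_csv)


# Demo with a made-up table and an in-memory "destination".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (region TEXT, total INTEGER)")
conn.executemany("INSERT INTO signups VALUES (?, ?)", [("west", 3), ("east", 5)])

delivered = []
deliver(
    query_to_csv(conn, "SELECT region, total FROM signups ORDER BY region"),
    [delivered.append],
)
```

Keeping delivery behind a simple callable makes it easy to swap the demo sink for a real integration later without touching the query step.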

Step 4. Develop privacy and security protocols

As the first data hire, you’ll need to keep privacy and security in mind as you manage all these other priorities. Start by segmenting data access and setting up additional processes for sensitive data. 

In my current role, my team uses Google Drive and Slack channels to segment data by team. If you’re on the marketing team, you have access to the marketing Slack channel where reports are delivered, and you can access the data delivered into the marketing Google Drive. If you’re on the finance team, you have access to the finance channel and drive.
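As a rough sketch of this kind of segmentation, the routing can be expressed as a small lookup from team to delivery destination. The channel and folder names below are hypothetical, and real access control would be enforced by Slack and Google Drive permissions themselves, not by this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Destination:
    slack_channel: str
    drive_folder: str


# Hypothetical mapping: one channel and one shared folder per team.
DESTINATIONS = {
    "marketing": Destination("#marketing-reports", "Marketing Reports"),
    "finance": Destination("#finance-reports", "Finance Reports"),
}


def destination_for(report_team: str) -> Destination:
    """Look up where a team's reports go; unknown teams raise KeyError."""
    return DESTINATIONS[report_team]


def can_view(user_team: str, report_team: str) -> bool:
    """A user only sees reports delivered to their own team's channel and drive."""
    return user_team == report_team and report_team in DESTINATIONS


marketing_destination = destination_for("marketing")
```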

For personally identifiable information (PII) or HIPAA-sensitive information, additional precautions are needed. In fact, I try to put as little PII into reports as possible. If an employee needs HIPAA-related information, they have to submit a Jira request stating why they need the data, which creates accountability and a record to support future audits.

Another way to prioritize privacy is to have your data team also follow up manually for sensitive requests. At past companies, I actually had my analysts and engineers reach out to the stakeholders and leaders of the groups requesting this data and say, ‘Hey, somebody asked for bank account numbers. Do they need this?’ This way, you can take an extra precaution when it comes to sharing potentially sensitive or off-limits data with those who don’t need access.

Step 5. Hire the right people at the right time

After you have your building blocks in place – tech stack, stakeholder trust and alignment, and privacy and security protocols – it’s time to start building out your early team. Most data teams will start with an analyst or two, and a data engineer, who frequently reports to a CTO, and later, a Head of Data – perhaps you. 

After you have folks to get the gears in motion, it’s time to think a little more strategically. I use a somewhat unconventional approach to figure it out: a two to three-year data plan that maps out what the org chart will look like as the company grows. Of course, it’s important to keep in mind that all plans are subject to change, but without a plan to chart the path forward, it’s difficult to scale.

For instance, let’s say you were at a billion-dollar startup (lucky you!) with six functional areas, each requiring analytics support. My general rule of thumb is to plan for at least six analysts for every functional area, plus engineering to build out the infrastructure. So you can draw out that 36-person org chart (six areas times six analysts), laying out directors, the VP, and all the individual pieces.

You can work backward from there, mapping out what the org chart will look like at two years, 18 months, 12 months, and six months. Then, it’s easier to determine who your first few hires are to reach those milestones. As a reference point, at my current company, I chose to take on the analytics work myself so I could put engineers in place before hiring analysts. Still, it all depends on the individual organization, their current skill sets, and where budgets are being prioritized.
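The arithmetic behind that plan is simple enough to sketch. The target comes straight from the rule of thumb above (six areas times six analysts); the backward milestones assume a three-year horizon and a straight-line ramp, which is purely an illustrative assumption and not a staffing rule from this article.

```python
def target_analysts(functional_areas: int, analysts_per_area: int) -> int:
    """Rule-of-thumb target: analysts per area times number of areas."""
    return functional_areas * analysts_per_area


def headcount_at(months: int, horizon_months: int, target: int) -> int:
    """ASSUMPTION: linear ramp to the target, rounded up to whole people."""
    return (target * months + horizon_months - 1) // horizon_months


target = target_analysts(functional_areas=6, analysts_per_area=6)

# Work backward from an assumed 36-month plan to the earlier checkpoints.
milestones = {
    months: headcount_at(months, horizon_months=36, target=target)
    for months in (24, 18, 12, 6)
}
```

Real plans will rarely ramp linearly, so treat the milestone numbers as a starting point for the conversation about which hires come first.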

That said, be careful about hiring specific roles before your infrastructure is ready to support them.

For instance, if you hire a data scientist but your company isn’t heavily invested in A/B testing, predictive models, or experimentation yet, they won’t have the organizational buy-in or infrastructure in place to do their work and they would potentially be bored. 

Step 6. Set the right KPIs

As a first data hire, you may be expected (or even pressured) to set KPIs and start measuring performance while you’re still in the midst of building out your infrastructure, but this urgency to set OKRs can be more trouble than it’s worth. In fact, setting KPIs too early in the maturity of your data organization can have detrimental effects on the business. Here are a few tips for avoiding disaster when it comes to setting data KPIs:

Learn how to write KPIs properly

The biggest mistake you can make is getting into a key performance indicator (KPI) strategy without understanding what a KPI is and how it needs to be written. But when done right, KPIs are useful and beneficial.

For instance, a KPI should not be easy to manipulate. One that is invites teams to game it to inflate their performance. A KPI is also not a goal, but instead a way to measure performance and progress. The more you treat your KPI like a gauge and less like a line in the sand, the healthier and more productive your organization will be.

Make sure your KPIs are measurable

Most KPIs I’ve seen at companies are not measurable, which makes them irrelevant to the organization. Most of our KPIs at Shiftkey revolve around revenue, like bill rates, or movement metrics, like a person moving from one state to another (e.g., from newly registered to actively working). A KPI could be a movement metric or a growth metric expressed as a percentage, but it has to be something you can tangibly query in the system and get a result for.

Automate your progress checks

You have to be able to write a query that automatically checks on your progress, because if you’re putting that task in the hands of the people who are doing the work, they just won’t do it. It’s just human nature! They’re too busy for that. 95% of our KPIs tracked in Ally.io are fully automated.
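Here is a minimal sketch of what a queryable, automated KPI check might look like, using a movement metric like the one described earlier (the share of people who have moved from newly registered to actively working). Everything here is assumed for illustration: sqlite3 stands in for the warehouse, and the workers and kpi_snapshots tables, the status values, and the metric name are hypothetical. It is not how Ally.io tracks KPIs.

```python
import sqlite3
from datetime import date

# Hypothetical movement metric: fraction of workers who are actively working.
ACTIVATION_SQL = """
SELECT 1.0 * SUM(CASE WHEN status = 'actively_working' THEN 1 ELSE 0 END)
       / COUNT(*)
FROM workers
"""


def snapshot_kpi(conn: sqlite3.Connection, name: str, sql: str, as_of: date):
    """Run the KPI query and store the value, so nobody has to report it by hand."""
    value = conn.execute(sql).fetchone()[0]  # None if the table is empty
    with conn:
        conn.execute(
            "INSERT INTO kpi_snapshots (name, as_of, value) VALUES (?, ?, ?)",
            (name, as_of.isoformat(), value),
        )
    return value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workers (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE kpi_snapshots (name TEXT, as_of TEXT, value REAL)")
conn.executemany(
    "INSERT INTO workers (status) VALUES (?)",
    [("newly_registered",), ("actively_working",),
     ("actively_working",), ("newly_registered",)],
)

value = snapshot_kpi(conn, "activation_rate", ACTIVATION_SQL, date(2024, 1, 31))
snapshots = conn.execute("SELECT name, as_of, value FROM kpi_snapshots").fetchall()
```

Storing a dated snapshot on a schedule treats the KPI as a gauge you can watch over time, rather than a pass/fail line.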

Step 7. Remember: know what you don’t know

My final tip for first data hires? Remember that every company and every role will be different. You’ll always have to tweak your plan to fit the company first. 

No matter what company you go to, it’s always new. If you come into this thinking you’ve got it all down—you don’t. You just won’t be successful. 

The good news? Data is a rapidly evolving field with new blueprints for success and team strategy being written every day by pioneering voices. I’m encouraged – and excited – by the wealth of content out there produced by not just the FAANGs (or, er, MAANGs) of the world, but also startups and legacy industries, too! 


Did I forget something? Connect with me on LinkedIn.
