How The Farmer’s Dog Builds Data Reliability with Monte Carlo
Companies across all industries are striving to become data-driven: making decisions based on data and building a culture of data trust and transparency. But data downtime—periods of time where data is missing, broken or otherwise erroneous—undermines those efforts and can cost companies upwards of $15 million annually.
Learn how The Farmer’s Dog, a fresh dog food company, uses data to make better decisions and Monte Carlo to ensure trust in their data.
Being data-driven may initially seem at odds with the world of pet food and puppy love, but New York-based startup The Farmer’s Dog is showcasing how the intentional, intelligent use of data can enable the delivery of not only fresh, healthy food for dogs, but also a memorable, personalized customer experience.
The company began with its two founders making dog food out of their kitchens, but as the company grew, they began to incorporate data into their DNA. At every step, teams are researching, analyzing, and optimizing the food they make and the customer experience they provide. The Farmer’s Dog’s mission is to help more dogs enjoy longer, healthier lives by replacing conventional highly-processed pet “food” with fresh, human-grade options.
As Rick Saporta, Head of Data Strategy and Insights, said: “We use data in every corner of our business. Our team’s mandate is to ensure that every decision at The Farmer’s Dog is made with optimal insights and minimal friction.”
The Challenge: Fast growth into a complex ecosystem leads to unpredictable data issues
In 2019, The Farmer’s Dog Data Strategy & Insights team was quite lean. “With this lean team, we built a whole lot of business intelligence, crucial reporting, and derived many valuable insights,” said Rick. They worked to bring in new data sources, stand up new environments, and productionalize data science efforts. As they grew, Rick’s team began to add a new data source every sprint—sometimes several at a time.
At the same time, The Farmer’s Dog’s tech stack was growing more complex, with a multi-cloud system that encompasses AWS for the company’s principal ecosystem and website, several Postgres databases, Google Cloud Platform for the Data Strategy & Insights team, BigQuery, ETL tools, Looker, and data sources like Segment and Kustomer. In addition, the data engineering team at The Farmer’s Dog is downstream from several distinct engineering teams overseeing the website, the production database, APIs, and operations.
As the team’s data platform expanded at an accelerating pace, its data challenges grew in tandem.
“If you own data pipelines, you’re extremely familiar with this problem,” said Rick. “And it’s not a problem most people look forward to tackling.”
“When data that is used every day breaks, you know instantly because someone always has eyes on it,” Rick said. “But then you have that other data – the kind that is critically important when used, but that is not looked at every day. Because of its nature, it has the potential to silently break, unnoticed until right before the most critical of meetings.”
These issues would pull the data engineering team away from their planned work and into a reactive mode of searching for the source of an outage, troubleshooting the problems, and implementing a solution. “When data breaks, we have to stop making new things and address the issue,” Rick said. “And data pipelines break all the time. Not having visibility into the various pipelines’ health status can end up pulling the team’s focus away from their proactive, planned work and into a reactive cycle of chasing bugs and fixing outages.”
The Solution: data observability with Monte Carlo
Rick and his team were mapping out all the ways their data could break so that the DataOps team could set up alerts and monitoring around the pipelines. Then an old friend, who was applying for a job at a small startup named Monte Carlo, reached out to ask whether Rick would provide a reference. He agreed, and ended up chatting with our co-founder and CEO Barr Moses.
“Our original call was a business reference call. Being curious, I was asking Barr what exactly Monte Carlo’s offering was. I’m not exaggerating when I say the meeting right before my call with Barr was a 2-hour planning session to address the exact problem Monte Carlo solves. In that planning session we had identified that it would take us 3 – 6 months to get our ‘Priority-0 and Priority-1’ monitoring in place, and had chunked it down to our first two sprints-worth, which would get us something useful, but still very far from what we really needed.”
She told him about the Monte Carlo platform, which solved the exact problem Rick and his team were working on by addressing broken data and pipelines. Monte Carlo provided automated monitoring and alerting, as well as field-level lineage, delivering end-to-end data observability.
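To make "automated monitoring" concrete, here is a minimal sketch of one kind of check a team might otherwise write by hand: flagging a table whose daily row count drifts far from its recent history. This is an illustration only and not how Monte Carlo works internally; the function name `is_volume_anomaly`, the threshold of three standard deviations, and the sample row counts are all assumptions made up for this example.

```python
from statistics import mean, stdev


def is_volume_anomaly(history, today, threshold=3.0):
    """Return True when today's row count is more than `threshold`
    sample standard deviations away from the mean of `history`.

    Hypothetical helper for illustration; not a Monte Carlo API.
    """
    if len(history) < 2:
        # Not enough history to estimate spread, so do not alert.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any change at all is unusual.
        return today != mu
    return abs(today - mu) / sigma > threshold


# Made-up daily row counts for a single table over the past week.
daily_row_counts = [1020, 980, 1005, 995, 1010, 990, 1000]

normal_day = is_volume_anomaly(daily_row_counts, 1003)
broken_day = is_volume_anomaly(daily_row_counts, 120)
```

With this sample history, a load of 1,003 rows is treated as normal, while a load of 120 rows is flagged. Writing and tuning checks like this for every table by hand is the kind of work the team had estimated at 3 to 6 months.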
“I’m living and breathing this problem every day, and here I am on the phone with this new startup whom I had never heard of until a week ago. Truthfully, it seemed too serendipitous and too good to be true. But if the promise was real, it felt like it would be a game changer for our team. Back then, so much of our time was pulled away from the things we truly wanted to build. So I begged Barr to let us try Monte Carlo. Inside I was laughing at the irony – here I am begging a vendor to sell us their product. At that point she was saying no, because the platform wasn’t really ready for customers yet, but I think I won her over with the puppies.”
The Farmer’s Dog did join Monte Carlo as a customer, and Rick still marvels at how quickly his team was up and running with data observability: “It was just two meetings, one for an hour, one for half an hour.”
In just a few days, Rick and his team began receiving notifications about data issues.
“I wasn’t even expecting notifications yet,” Rick said. “I thought there would be another phase of work where we would have to set them up. I thought, ‘Okay, we got it configured, I’ll find some free time next week and I’ll actually start setting it up so we’ll get these notifications.’ And then boom, I just get one in my inbox. Since then, we’ve gotten notifications for all types of different anomalies that I would not have thought to check. I keep thinking back at our original ‘6 month plan’ and how so many of the alerts we have since gotten from Monte Carlo weren’t even in our original plan.”
Outcome: ability to detect unknown unknowns in their data
UTM parameters are just one example of how Monte Carlo has helped The Farmer’s Dog detect data issues. Since several different engineering groups sit upstream from the Data Strategy & Insights team, changes made by one group can have unforeseen consequences on downstream data health.
“Many ETL pipelines are, at their heart, a communication between two different teams, often at different companies,” Rick said. “When one team makes a change, it affects the other. In the best of instances, you might have strong communication between the different teams, but all pipelines break at some point, and it’s hard to anticipate all the different ways that data can break.”
He describes these “unknown unknowns” data issues as the Anna Karenina principle in data form: “All good data is the same, but each bad data is bad in its own way. Our goal is to keep our data pipelines as healthy as the dogs we feed.”
By monitoring for data downtime, alerting relevant teams when anomalies are detected, and providing lineage into the upstream and downstream dependencies, Monte Carlo helps Rick and his team be the first to know when something breaks—and how to fix it quickly.
“Monte Carlo has been exceptional at catching upstream errors,” Rick said. “And being able to tell you that something is amiss and guide your focus with such precision! Pointing not only to the specific table but the lineage that quickly gets you to the root of the error. It’s just absolutely incredible.”
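The idea of lineage leading "to the root of the error" can be sketched as a walk through a dependency graph: starting from the table that looks wrong, visit everything it is built from. The snippet below is a simplified, table-level illustration with hypothetical table names and a hand-written `UPSTREAM` mapping; it is not Monte Carlo's implementation or API.

```python
from collections import deque

# Hypothetical lineage: each table maps to the tables it reads from.
UPSTREAM = {
    "weekly_report": ["orders_clean"],
    "orders_clean": ["orders_raw", "customers_raw"],
    "orders_raw": [],
    "customers_raw": [],
}


def upstream_tables(table, lineage):
    """Return every table upstream of `table`, nearest dependencies first.

    Uses a breadth-first walk so the closest candidates for a root
    cause are listed before more distant ones.
    """
    seen = set()
    ordered = []
    queue = deque(lineage.get(table, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # Already visited via another path.
        seen.add(current)
        ordered.append(current)
        queue.extend(lineage.get(current, []))
    return ordered


candidates = upstream_tables("weekly_report", UPSTREAM)
```

For this made-up graph, `candidates` is `["orders_clean", "orders_raw", "customers_raw"]`, which is the order in which someone debugging a broken `weekly_report` would inspect its sources.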
Outcome: self-serve data troubleshooting
One of the unexpected benefits of Monte Carlo, according to Rick, is the communication it enables among other teams.
The Farmer’s Dog has Monte Carlo integrated with the company Slack, and Rick has noticed engineers and product managers outside of DataOps using the channel to monitor their upstream work.
“Our general philosophy is to empower people as much as possible with access to data and information,” said Rick. “So if there’s some notion that this monitoring tool is helpful for your work, here you go, let’s get you access to it. It’s in an open channel that anyone can just hop into and see the notifications, or log in to the Monte Carlo UI if they want to go deeper.”
We set up single sign-on access, empowering anyone with a company email to access the platform and monitor the data flowing through their work.
Outcome: building data trust
Rick repeats the maxim again and again: The Farmer’s Dog isn’t a tech company—it’s a pet health company that uses technology to create and deliver an entirely new category of pet food. But investing in data observability and democratizing access to data tooling has helped The Farmer’s Dog achieve something many tech companies continue to struggle with: building trust in data.
With monitoring, alerting, and lineage in place, Rick and his team can proactively communicate data downtime to their colleagues across the organization who may be impacted. “I can let my key stakeholders know ‘These reports are not going to be available until we fix this’,” Rick said. “But what I’ve found is that we haven’t even had to go down that path often because we’re able to fix problems so quickly, often spotting them when they are tiny quick-fixes before they grow into bigger outages.”
Ultimately, this increased transparency and focus on data quality helps The Farmer’s Dog accomplish their larger mission. “We spent a lot of time keeping the data healthy, but the data informs the analysis and the insights. And the insights inform the business and the feedings of the dogs—and really, it’s all about keeping the dogs healthy.”