How to Measure the Impact of Your Data Team
As companies increasingly invest in data and analytics, the need to build a robust and effective data team becomes a top-line priority. We spoke with Jacob Follis, Chief Innovation Officer at Collaborative Imaging, to learn how he sets strategy for his lean but growing data organization, what’s in his stack, and his favorite KPIs to measure data team success.
Hint: it boils down to more than a few dashboards.
As any data professional worth their salt will tell you, you can’t manage what you don’t measure. And the same applies to your data team.
As leaders, we can get laser-focused on our output—from new data products and fancy dashboards to how quickly we can fulfill ad-hoc queries. But that means we often forget to take a step back and reflect on how our teams are performing against the broader goals of the company.
So recently, I sat down with Jacob Follis, Chief Innovation Officer at Collaborative Imaging, a leading radiology software provider. We work with a lot of SaaS companies at Monte Carlo, but the healthcare industry brings a lot of unique challenges to solving for data quality—so we wanted to learn more from Jacob about how he approaches the task of assessing performance and what success looks like for his team.
Making data a priority at a non-SaaS company
Collaborative Imaging (CI) works with over 1,500 doctors to help consolidate and aggregate data around the patient journey through the healthcare system. By making data integration and data sharing available to its partners, the CI team can help reduce costs, foster more collaboration between healthcare professionals, and improve the quality of the patient experience.
But while there are ample opportunities within this legacy space to use data to drive innovation, the company must be extremely diligent when it comes to data privacy and security.
“Being in healthcare, rigor is of the utmost importance,” Jacob said. “If we tell a doctor he did one hundred X-rays when he only did fifty or if we provide even a few incorrect data elements to Medicare, it can be considered fraud. So there are a lot of constraints around regulation.”
Data quality is table stakes, but those stakes are particularly high for Jacob and his team. If patient data is lost or breached, the company could face a $10,000 fine per record. Clean data and proper data hygiene are non-negotiable, and meeting those standards requires careful investment in team-building, measuring success, and technology.
How CI built a data team structured for excellence
The data team at CI is structured around its two primary responsibilities: data engineering and data science.
The data engineering team is responsible for collecting and integrating data from around 1,200 sources into a massive production database. That workload requires data engineers to build and use some homegrown applications to meet the unique demands of integrating data from hospitals—a sector not exactly known for adopting cutting-edge technology.
“We have a lot of HL7, a lot of flat files, CSVs, and those kinds of legacy formats,” said Jacob. “You really don’t get APIs in healthcare.”
Meanwhile, the data science arm of the team focuses on creating value from all that hard-won data: automating data transformation, data visualizations, and creating operational reporting for CI’s internal and external customers.
Both teams report up to the company’s VP of Analytics, who works alongside a data product manager, but their success metrics look quite different. “Data science and visualization are really reliant on having business communication and business acumen so they can understand and deliver impactful reporting,” said Jacob. “And that’s different than, for example, a data warehouse architect who’s really focused on the underlying platform. So that’s what we’ve organized our teams around.”
When it comes to hiring, given that the data team is relatively small, supports a breadth of applications and data sources, and works under high regulatory scrutiny, Jacob has found that looking for a growth mindset and a broad skill set is key.
“We find the muscle memory of learning multiple disciplines, and having the desire for continuous learning, really sets our employees up for success,” said Jacob. “And since the data stack is evolving at a lightning pace, someone who’s adept at picking up new tooling can make a 10x difference in productivity in certain tasks.”
Jacob cites industry experience as an advantage that helps new employees get up and running more quickly, but it’s not the be-all and end-all of hiring. Rather, he looks for creativity, innovation, and new ideas that will help drive the team forward and raise the bar.
Tracking the right data team success metrics
When it comes to measuring and improving his team’s success, Jacob takes inspiration from the discipline of DevOps to set specific KPIs. The CI team sets and tracks data service-level agreements (SLAs), service-level objectives (SLOs), and service-level indicators (SLIs), which help data leaders and business teams align on what success in data reliability looks like.
“At a high level, we monitor data freshness, data availability of the delivery layer like Tableau and homegrown apps, and some metrics we derive from testing and daily data loads,” Jacob said. “We also use some basic trusted data framework tests during deployments and prior to important end-of-month financial reporting.”
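To make the SLI and SLO idea concrete, here is a minimal sketch of a data freshness check. It is not CI's implementation: the 24-hour threshold, the table names, and the functions `freshness_sli` and `stale_tables` are all hypothetical and chosen only for illustration. The SLI is the fraction of tables loaded within the SLO window.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical SLO: every table must have been loaded within the last 24 hours.
FRESHNESS_SLO = timedelta(hours=24)


def freshness_sli(last_loaded, now, slo=FRESHNESS_SLO):
    """Return the fraction of tables whose most recent load is within the SLO window."""
    if not last_loaded:
        return 1.0  # no tables to check, so nothing is stale
    fresh = sum(1 for ts in last_loaded.values() if now - ts <= slo)
    return fresh / len(last_loaded)


def stale_tables(last_loaded, now, slo=FRESHNESS_SLO):
    """Return the names of tables that violate the freshness SLO, sorted by name."""
    return sorted(name for name, ts in last_loaded.items() if now - ts > slo)


# Illustrative load timestamps (made up for this example).
now = datetime(2023, 1, 2, 12, 0, tzinfo=timezone.utc)
loads = {
    "claims": now - timedelta(hours=2),
    "patients": now - timedelta(hours=30),
    "exams": now - timedelta(hours=23),
    "billing": now - timedelta(hours=1),
}

print(freshness_sli(loads, now))  # 0.75
print(stale_tables(loads, now))   # ['patients']
```

A team could then agree on a target for the indicator itself, for example that it stays above an agreed percentage over a month, and treat that target as the SLO that business stakeholders sign off on.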
Additionally, individual team members track personal KPIs, such as how many points they completed in a sprint or how many issues carried over.
But Jacob also looks to nontraditional metrics to capture and reflect the status of his data team. “When it comes to people, it’s hard sometimes to have very objective measurements,” he said. “So even though we are an analytics team and we speak in terms of numbers and hard-and-fast metrics, some of our KPIs are more subjective and personal—like the concept of the keeper test from Netflix, which asks how hard you would fight to keep your employee if they want to leave.”
With remote work, Jacob and his data leaders talk frequently about trust as well. CI has always had a significantly distributed workforce, but the COVID-19 pandemic was a forcing factor to make remote interactions the norm. One of the more subjective metrics the CI team uses is the trust factor: Do I feel the need to review the work of a team member prior to that product or analysis being delivered to our customers? “In order for us to continue scaling, we must constantly focus on enhancing this trust factor,” said Jacob. “We spend significant energy working with our leadership and teams to optimize this new world of remote-first, and making sure our KPIs and metrics are still meaningful.”
Building a data stack centered on quality
Finally, part of Jacob’s mandate as CIO is leading the data engineering team in building and maintaining a data platform that supports machine learning, data science, and analytics while meeting the highest standards of data quality. For CI, that means a warehouse-first architecture built around Snowflake, and overseeing the migration to Snowflake has been a central focus for Jacob since he joined the company.
“Snowflake’s growing aggregation of support for horizontal workloads such as OLTP with Unistore and Python as a first-class citizen with Snowpark enables us to take a warehouse-first, dbt, SQL, and Python approach to data processing and transformation of the legacy data formats we receive and do it all in 1 single, managed platform and workflow,” said Jacob.
Additionally, CI leverages tools like Debezium and dbt for ETL and uses Tableau to support business intelligence needs. Together, this tooling saves Jacob’s team immense amounts of time through automation. “Our team would probably have to be twice as big to be able to do what we do if we didn’t have things like Snowflake,” Jacob said.
And to ensure their rigorous data quality standards are being met, the team relies on the Monte Carlo platform for monitoring, alerting, and observability. For a three-year-old company, the investment in data observability came early.
“I had some PTSD from a past job,” Jacob said. “I’ve worked in a situation where we had spun up software quickly, and it was way too hard to have the discipline to go back and add in logging, tracing, and data sanity checks.”
With Monte Carlo’s integration with Snowflake, it became easy for the CI team to implement the kind of end-to-end data quality controls that they needed.
“With such high-stakes data in Snowflake, it wasn’t an option to not include data observability—getting that out of the box with Monte Carlo allowed us to avoid spending the mental energy on designing something ourselves so that we could spend that energy on building products that help our customers,” Jacob said. “It brought us down the path of ensuring data quality faster and prevented us from taking on technical debt.”
Monte Carlo’s automated monitoring and alerting help the CI team prevent instances of data downtime, or moments when data is inaccurate, stale, or otherwise unreliable.
For example, prior to Monte Carlo, the team experienced an issue when one hospital’s technology upgrade led to an error that occurred across every patient message: the guarantor of the patient’s care, an important piece of financial information, was mistakenly listed in every message. That kind of error caused insurance companies to deny claims, prompted impacted patients to contact the CI call center, and created a messy scenario for both customers and the CI support team.
Unfortunately, this sort of issue is far too common in the healthcare world, and most companies don’t catch these kinds of instances for many months or even longer. This is a likely contributing factor to the exorbitant administrative cost in healthcare. “We eventually caught it, but it wasn’t because of our good infrastructure and good tests,” Jacob said. Now, with Monte Carlo’s automated monitoring in place, a similar duplication error would be detected immediately and could be intercepted before customers felt the impact.
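As a rough illustration of the kind of sanity check that can surface this class of problem, the sketch below flags a batch of messages when a field appears more than once far more often than usual. It makes several assumptions: the article does not describe how Monte Carlo detects such issues, the message layout (a dict mapping field names to lists of values), the field name `guarantor`, and the baseline and tolerance values are all hypothetical, and it reads "duplication" as the field being repeated within a message.

```python
def duplicate_rate(messages, field):
    """Return the fraction of messages in which `field` appears more than once."""
    if not messages:
        return 0.0
    dupes = sum(1 for msg in messages if len(msg.get(field, [])) > 1)
    return dupes / len(messages)


def check_batch(messages, field, baseline_rate, tolerance=0.05):
    """Return (is_anomalous, observed_rate) for one batch of messages.

    The batch is flagged when the observed duplicate rate exceeds the
    historical baseline by more than `tolerance` (both hypothetical values).
    """
    rate = duplicate_rate(messages, field)
    return rate > baseline_rate + tolerance, rate


# Made-up batch: after an upstream change, every message repeats the field.
batch = [
    {"patient_id": ["p1"], "guarantor": ["G-100", "G-100"]},
    {"patient_id": ["p2"], "guarantor": ["G-200", "G-200"]},
    {"patient_id": ["p3"], "guarantor": ["G-300", "G-300"]},
]

flagged, rate = check_batch(batch, "guarantor", baseline_rate=0.01)
print(flagged, rate)  # True 1.0
```

Comparing against a baseline rather than a fixed rule is a design choice: it lets the same check run across many sources whose normal behavior differs, which matters when data arrives from hundreds of hospitals.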
With the right talent, metrics, and technology in place, Jacob’s team is set up for success to meet the data needs of their company—and the strict compliance and security rigor of their industry.
Interested in learning how your team can set more comprehensive and actionable KPIs for your data? Reach out to the team at Monte Carlo to learn more.