
Eureciclo Improves Data Governance And Reliability With Unity Catalog and Monte Carlo

Michael Segner

Michael writes about data engineering, data quality, and data teams.

Eureciclo is a reverse logistics, waste management services, and recycling company based in Brazil. They rely on internal analytics to make data-driven decisions and, as a company in a regulated industry, regularly report to government agencies.

“We take data quality very seriously,” said André Gonzalez, Data Manager, Eureciclo. “If bad data was sent to a regulatory agency, that would create issues for us.”

Democratizing Data Access With Unity Catalog

As part of an initiative to improve the reliability, quality, and governance of their data systems, Eureciclo launched a migration to Databricks.

“The transactional systems we were using couldn’t handle modeling or large scale analytics very well,” said André. “Our internal data consumers were catching about two issues a week.”

The team explored solutions, ultimately settling on Databricks for its ability to dramatically reduce the time required to manage Apache Spark clusters and Apache Hive. As Eureciclo’s operations matured, they began to consider a move to Unity Catalog.


“We started looking at Unity Catalog as part of our project for opening up our data platform for the whole company to access directly,” he said. “We knew governance and managing permissions as people shift in and around the company was going to be challenging and decided to leverage Unity Catalog to help.” 

Overcoming Early Obstacles With Monte Carlo

Unity Catalog enabled André and his team to democratize data access while also managing permission controls. In a way, Unity Catalog was a bit too successful in that regard.

“Before Unity Catalog, we didn’t have granular permission controls. When we migrated, a lot of our service accounts for our data systems lost their permissions and started failing,” he said.

Fortunately, the Eureciclo data team was leveraging Monte Carlo’s data observability platform to detect and accelerate the resolution of data quality issues like this one.

“Monte Carlo started sending data freshness and volume alerts to us in Slack and we used the lineage feature to trace these issues back to the point of origin upstream,” said André. “We discovered this was a permissions issue from the migration and got everything working again.”
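To make the idea of tracing alerts back to their point of origin concrete, here is a minimal sketch. It is not Monte Carlo's implementation or API: the lineage map, the table names, and the `find_origin` helper are all hypothetical, and the sketch simply assumes that an alerting table with no alerting ancestor is the likely starting point.

```python
from collections import deque

# Hypothetical lineage map: each table lists the tables it reads from.
UPSTREAM = {
    "gold.regulatory_report": ["silver.collections"],
    "gold.dashboard": ["silver.collections"],
    "silver.collections": ["bronze.raw_collections"],
    "bronze.raw_collections": [],
}


def find_origin(alerting, upstream):
    """Return alerting tables that have no alerting ancestor."""
    alerting = set(alerting)
    origins = set()
    for table in alerting:
        # Breadth-first walk over everything upstream of this table.
        queue = deque(upstream.get(table, []))
        seen = set()
        has_alerting_ancestor = False
        while queue:
            parent = queue.popleft()
            if parent in seen:
                continue
            seen.add(parent)
            if parent in alerting:
                has_alerting_ancestor = True
                break
            queue.extend(upstream.get(parent, []))
        if not has_alerting_ancestor:
            origins.add(table)
    return sorted(origins)


alerts = [
    "gold.regulatory_report",
    "gold.dashboard",
    "silver.collections",
    "bronze.raw_collections",
]
print(find_origin(alerts, UPSTREAM))
```

In this made-up example every alerting table sits downstream of `bronze.raw_collections`, so that is the one table worth inspecting first, for instance for a lost service account permission.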

Monte Carlo’s alert grouping feature was particularly helpful in this situation, where a single upstream issue impacted a large number of tables.

Monte Carlo sends related alerts for all anomalies across a dataset to a single thread.

“Our organization and response was better with Monte Carlo,” said André. “Having multiple alerts within the same thread gave everything structure and allowed us to follow and understand what was really going on. Otherwise we were drowning in Airflow alerts.”
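One way to picture alert grouping is to bucket alerts by the source tables they ultimately depend on, so that each bucket becomes one thread. The sketch below is only an illustration under that assumption; the lineage map, table names, alert kinds, and the `group_alerts` helper are hypothetical, the lineage is assumed to be acyclic, and the actual grouping logic in Monte Carlo may differ.

```python
from collections import defaultdict

# Hypothetical, acyclic lineage map: each table lists the tables it reads from.
UPSTREAM = {
    "gold.regulatory_report": ["silver.collections"],
    "gold.dashboard": ["silver.collections"],
    "silver.collections": ["bronze.raw_collections"],
    "silver.partners": ["bronze.raw_partners"],
    "bronze.raw_collections": [],
    "bronze.raw_partners": [],
}


def root_sources(table, upstream):
    """Return the source tables (no parents) that a table ultimately depends on."""
    parents = upstream.get(table, [])
    if not parents:
        return {table}
    roots = set()
    for parent in parents:
        roots |= root_sources(parent, upstream)
    return roots


def group_alerts(alerts, upstream):
    """Group (table, kind) alerts into threads keyed by their shared root sources."""
    threads = defaultdict(list)
    for table, kind in alerts:
        key = tuple(sorted(root_sources(table, upstream)))
        threads[key].append(f"{kind}: {table}")
    return dict(threads)


alerts = [
    ("gold.regulatory_report", "freshness"),
    ("gold.dashboard", "volume"),
    ("silver.collections", "freshness"),
    ("silver.partners", "volume"),
]
for roots, messages in group_alerts(alerts, UPSTREAM).items():
    print(roots, messages)
```

Here the three alerts that trace back to `bronze.raw_collections` land in one thread, while the unrelated `silver.partners` alert gets its own.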

Monte Carlo and Databricks: Better Together

Monte Carlo has quickly become an essential component of Eureciclo’s data platform, integrating with Databricks, Airflow, Power BI, and other key systems. This has given the Eureciclo team a single place to understand the health of their data platform as a whole.

“When we had a pipeline failure our data engineer had to access every single system. He was going into Databricks, going into Airflow…and now suddenly all the information he needs is in one place for him,” said André. “It’s made his post-mortem reports better and made the process much more efficient overall.”

The data team also appreciates being able to deploy not just Databricks governance as code, but Monte Carlo monitors as code as well. Monte Carlo has also helped them detect anomalies across their medallion architecture.

“Like any data team, we experience issues with schema changes where a column or data type will change and create problems in our code. Monte Carlo catches those and helps us resolve them much quicker,” said André.
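The kind of schema change André describes can be illustrated by comparing two snapshots of a table's columns. This is a minimal sketch, not how Monte Carlo detects schema changes; the `diff_schema` helper, the column names, and the types are hypothetical.

```python
def diff_schema(old, new):
    """Compare two {column: type} snapshots and report what changed."""
    old_cols, new_cols = set(old), set(new)
    return {
        "added": sorted(new_cols - old_cols),
        "removed": sorted(old_cols - new_cols),
        # Columns present in both snapshots whose declared type differs.
        "retyped": sorted(c for c in old_cols & new_cols if old[c] != new[c]),
    }


# Hypothetical snapshots of the same table taken on consecutive days.
yesterday = {"id": "bigint", "weight_kg": "double", "collected_at": "timestamp"}
today = {"id": "string", "weight_kg": "double", "collected_on": "date"}

print(diff_schema(yesterday, today))
```

A renamed column shows up as one removal plus one addition, which is exactly the sort of change that tends to break downstream code.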

Monte Carlo’s ability to highlight where an issue originated has also helped incident response, since the data teams at Eureciclo are structured largely by layer.

“If the issue occurs in the first layer it’s usually the data engineering team that is responsible whereas if it’s been introduced in the analytics layer then the analytics team will be tasked to resolve it. Before Monte Carlo, we wouldn’t have visibility into exactly where the incident occurred so ownership over its resolution wasn’t as clear,” said André. “Each team will use Monte Carlo to create a JIRA ticket and document the incident so the same mistake isn’t made twice.”

A More Reliable System

André and his team have reduced Eureciclo’s data downtime by more than 80 percent after adopting Monte Carlo and Unity Catalog and creating a better separation between their dev and production environments.

“There has been a big change. We still have the occasional issue, but now we solve them so fast they don’t reach or impact our data consumers,” said André. “There is a real difference in how they trust our data and work with our team that has made our lives easier.”

The team has also seen their data reliability workflows become much more efficient. 

“Our team is saving time, which allows us to build new features. I’d ballpark it at about 20 percent,” said André.

Thanks to André and his team, Eureciclo is continuing its mission of environmental sustainability with world-class data reliability.
