Data Quality Fundamentals

Free O’Reilly Book

A Practitioner’s Guide to Building More Trustworthy Data Pipelines

Claim your early release copy (a $67 value)

Do your product dashboards look funky? Are your quarterly reports way off? Are you sick and tired of running a SQL query only to discover that the dataset you’re using is broken or just plain wrong? If you answered yes to any of these questions, then this book is for you. These errors are costly and affect almost every team, yet they are typically addressed only reactively and on an ad hoc basis.

Many data engineering teams today face the “good pipelines, bad data” problem. It doesn’t matter how advanced your data infrastructure is if the data you are piping is bad.

In this book, Monte Carlo’s Barr Moses, Lior Gavish, and Molly Vorwerck, creators of the Data Observability category, explain how data teams can:

  • Build more trustworthy and reliable data pipelines
  • Write scripts to run data checks and identify broken pipelines with data observability
  • Program your own data quality monitors from scratch
  • Develop and lead data quality initiatives at your company
  • Generate a dashboard to highlight your company’s key data assets
  • Automate data lineage graphs across your data ecosystem
  • Build anomaly detectors for your critical data assets

And much more! We’re thrilled to share the first few chapters of this book with you for FREE. Enjoy!

Download the First Chapters for Free

Meet The Authors

Barr Moses

CEO and Co-founder of Monte Carlo

Barr Moses is the CEO and co-founder of Monte Carlo, a data reliability company. Barr has worked with hundreds of data teams struggling with these problems. Inspired by her time in the analytics trenches, she is building a product dedicated to identifying, resolving, and preventing what she calls “data downtime”: periods of time when data is missing, erroneous, or otherwise inaccurate. In other words: bad data. In this book, she shares her experiences and learnings on how today’s data organizations can achieve high data quality at scale through technological, organizational, and cultural best practices.

Molly Vorwerck

Head of Content at Monte Carlo

Molly Vorwerck is the Head of Content at Monte Carlo, a data reliability company. Prior to joining Monte Carlo, Molly served as editor-in-chief of the Uber Engineering Blog and lead program manager for Uber’s Technical Brand team, where she spent countless hours helping engineers, data scientists, and analysts write and edit content about their technical work and experiences. She also led internal communications for Uber’s Chief Technology Officer and strategy for Uber AI’s Research Review Program. In her spare time, she freelances for USA Today, reads up on all the latest trends in data, and volunteers for the California Historical Society.

Lior Gavish

CTO and Co-founder of Monte Carlo

Lior Gavish is the CTO and co-founder of Monte Carlo, a data reliability company backed by Accel, Redpoint, GGV, and other top Silicon Valley investors. Prior to Monte Carlo, Lior co-founded cybersecurity startup Sookasa, which was acquired by Barracuda in 2016. At Barracuda, Lior was SVP of Engineering, launching award-winning ML products for fraud prevention. Lior holds an MBA from Stanford and an MSc in Computer Science from Tel Aviv University.

[Announcing O'Reilly's Early Release]