Data Reliability

Data Integrity vs. Data Validity: Key Differences with a Zoo Analogy

Imagine a zoo employee is entering data on the heights of different animals and accidentally swaps the data for giraffes…

5 Helpful Data Quality Frameworks You Need to Know

A data quality framework is the methodology an organization puts in place to assess and improve its level of data…

11 Ways To Stop Data Anomalies Dead In Their Tracks

11 proactive data quality practices for preventing data anomalies and stopping them before they occur.

The Chaos Data Engineering Manifesto: Spare The Rod, Spoil Prod

Chaos data engineering is another lesson we can learn from software engineers: break stuff to make it more reliable.

What is Data Validity?

Data validity simply means how well data meets certain criteria. Learn the most common rules to put in place…

Data Vault Architecture, Data Quality Challenges, And How To Solve Them

How Pie Insurance improves data quality across their data vault architecture.

Modern Data Quality Management: A Proven 6 Step Guide

This 6-step data quality management framework has helped hundreds of organizations achieve higher quality data across their modern data…

Data Contracts: Silver Bullet or False Panacea? 3 Open Questions

Three open questions data contracts still need to answer for engineering teams.

7 Data Quality Checks in ETL Every Data Engineer Should Know

With the right data quality checks in ETL pipelines, you can identify and fix issues in near real-time and build…

Meaningful Product Experimentation: 5 Impactful Data Projects for Building Better Products

How data teams and product leaders can do product experimentation right and other impactful data projects for building better products.

Top 5 Data Engineering Deep Dives in 2022

How do you engineer field-level lineage, data anomaly monitors, Spark lineage, or data pipeline circuit breakers? We’re glad you asked. …

Our Top 5 Data Mesh Articles In 2022

We focused on implementation best practices for one of our favorite data quality topics: the data mesh.

Data Quality Testing: 7 Essential Tests

Data reliability on your radar? Get started with these 7 must-have data quality tests, including null value, numeric distribution, and…

Data Contracts – Everything You Need to Know

Data contracts aren't in-depth legal documents; rather, they codify data needs and schemas upfront to ensure transparency and build trust…

9 Best Practices To Maintain Data Integrity Fit For The Cloud Era

Data integrity is an old-school term with a murky meaning. We dive into a more modern interpretation with data…

How ELT Schedules Can Improve Root Cause Analysis For Data Engineers

Why Bayesian networks hold more promise than segmentation analysis.

How Data and Finance Teams Can Be Friends (And Stop Being Frenemies)

Part one in a practical data leader series: how data leaders can better work with the finance team.

What’s Next for Data Engineering in 2023? 10 Predictions 

Data trend predictions from two industry veterans who have made big bets on the future of data engineering.

How to Quickly Connect Power BI to Snowflake

Power BI Desktop connects to Snowflake easily in much the same way that all other data sources are connected. Here’s…

The 7 Tenets Of Building A Data-Driven Culture

Check out the 7 rules that helped Cribl transform into a data-driven culture with critical assets used by 60% of…

The Data Engineer’s Guide to Backfilling Data

You may know how to move data from one place to another quickly and accurately, but backfilling data can get…

4 Data Mesh Principles To Get One Step Closer To Data Nirvana

There is no one single recipe for building a data mesh. Instead, there are core principles that guide you. Here…

Where the Data Silos Are

You’ve heard of shadow IT, but what about shadow data? Read on to see where the data silos are and…

The Fight for Controlled Freedom of the Data Warehouse

The data gatekeeper is dead, long live the…oh no what have we done?…

How to Load and Stream Data from AWS S3 to Snowflake Using Snowpipe

Using Snowflake’s Snowpipe, it’s possible to upload a CSV to an S3 bucket and within 60 seconds see the data…

DataOps vs. DevOps Explained

DataOps draws many parallels from DevOps, but from an implementation standpoint, the responsibilities and skill sets couldn’t be more different.

How to Make Data Anomaly Resolution Less Cartoonish

Fixing broken data doesn’t have to be a game of whack-a-mole. Here’s how to speed up your data incident resolution…

The Ultimate Guide To Data Lineage

Data lineage is a must-have feature of the modern data stack, yet we're struggling to derive value from it. Here's…

DataOps Explained: How To Not Screw It Up

DataOps merges data engineering and data science teams to support an organization’s data needs, in a similar way to how…

Don’t Make a Schema Change Before Answering These Five Questions

Not all schema changes are equal. Here is what to ask yourself before pushing your code off to production.

Is Modern Data Warehouse Architecture Broken? 

The modern data warehouse architecture creates problems across many layers. Consider instead an immutable data warehouse for scale and usability.

Batch Processing vs Stream Processing: The Data Quality Edition

Learn how to achieve high-quality data in both batch and stream processing by adding data observability to your pipelines.

Data Observability vs. Data Testing: Everything You Need to Know

You already test your data. Do you need observability, too?…

How to Treat Your Data As a Product

Your company wants to "treat data like a product." Great! What does that mean?…

What’s In Store for the Future of the Modern Data Stack?

Bob Muglia, the former CEO of Snowflake, discusses what’s next for the tooling and technologies powering data analytics and engineering.

How to Build Your Data Reliability Stack

Data reliability is a critical focus for modern data teams. Here's how to get started.

Data Observability: Five Quick Ways to Improve the Reliability of Your Data

Five common data observability use cases and how they can help your team improve data quality at scale and trust…
