The Weekly ETL: Will Data Engineering Ever Be Sexy like Data Science?

In Monte Carlo’s Weekly ETL (Explanations Through Lior) series, Lior Gavish, Monte Carlo’s co-founder and CTO, answers a trending question on Reddit about some of data engineering’s hottest topics. 

The Reddit thread can be found here.

Reddit user /SWE-Aaron asks if data engineering will ever get the same attention as data science and whether that would actually be a good thing. As someone who has been in the engineering field for two decades, I can honestly say that data engineers are the backbone of your data team. The data engineer is analytical and visionary, and writes code, just like the data scientist. In addition, data engineers develop and implement new tools and frameworks that benefit the entire organization. For a better explanation of this than I could ever hope to achieve, I highly recommend checking out this post from Apache Airflow creator Maxime Beauchemin about the rise of the data engineer and his follow-up post that discusses some of the data engineer’s biggest challenges. Data scientists would find it very difficult to do their work if it weren’t for the platform and tools that data engineers build…

However, the work done by data engineers is often underappreciated compared to analytics or data science work. Why? Most companies struggle to measure the impact of data engineering work, given that it is one step removed from the data products that scientists create. I hate to say it, but all too frequently, data engineers get attention only when things break and are overlooked when data systems work well. Because of this, I always recommend that teams track how much time, money, and resources data engineers save and directly tie their work to the impact driven by data scientists and analysts.

I do think this perception is changing, though – two trends come to mind:

  1. Companies are increasingly treating data as an internal product and investing in platforms that make highly reliable data available to all stakeholders. That work is largely driven by data engineers and data product managers.
  2. Companies are increasingly incorporating analytics and ML into their own products and making them a core part of their offering. As data becomes mission-critical, data engineers are charged with building the pipelines and operations that make data operationalization a reality.

In the next few years, I strongly believe data engineering will receive as much attention as data science, or even more! In fact, according to Mihail Eric in KDnuggets, “there are 70% more open roles at companies in data engineering as compared to data science,” so this is already happening.