dbt’s announcements at Coalesce 2024 mark a pivotal moment, setting the stage for a new chapter in its evolution and promising to leave its mark on the industry.
In this article, you’ll follow dbt Labs’s journey over the years, from its early days to its growing suite of features designed to democratize business data operations.
Whether you’re a long-time Core user or new to Cloud, you’ll find insights into how dbt’s expanded vision can elevate your data operations.
Let’s dive in.
Working With Data Before dbt
I still remember the first time I came across dbt, sometime in mid-2019.
Like most data consultancies then, Indicium focused on code-based ETL pipelines using lower-level cloud infrastructure—as you’d find in most startup tech engineering blogs.
Before dbt, code-based ETL required experienced data engineers. So as a relatively small consultancy in a very hot data engineering market, we found it difficult to scale our team with such an approach.
The Engineering Challenge
At the time, most of our team did not come from an engineering background.
But our CTO was adamant about proper software engineering (SWE) best practices with data, drawing from his experience with big-data pipelines.
After we read some of Tristan Handy's blog posts about analytics engineering (AE) and dbt, it became clear that this tool could be the key to enabling our analysts to work like engineers.
Or, to put it simply, to become analytics engineers.
The Complexity of Tools in the Market
In hindsight, the genius of the early versions of what we now know as dbt Core lay not in the complexity of its code or features, but in its simplicity.
Back then, most legacy ETL tools that catered to non-engineering professionals, such as Informatica or Pentaho, were clunky, full of distracting features, and covered almost no SWE best practices.
On the other hand, using platforms such as Snowflake and Databricks was restricted to data engineers, since it required much deeper technical knowledge than any typical analyst would have.
Despite companies building pipelines much faster than before, there was a constraint on how to scale the data organization, since only so many professionals could work on it.
And worse: many data engineers disliked talking to business users or even writing SQL queries at all, so the data engineering team inevitably stayed far from where the value of data lives.
And Then dbt Joined the Scene
Though dbt was still in its early stages, in a few months, we built an entirely new AE practice on top of it.
We’re talking about professionals without a software engineering background but with very good analytical skills.
To accelerate that movement, we developed our analytics engineering course, which is open to the public and has since trained 1,000+ professionals.
The result?
Indicium has established itself as one of the top certified dbt partners worldwide.
There is no doubt dbt is a big thing for any modern data team.
What About dbt Cloud?
dbt Core already met the needs of many of its early adopters, such as Indicium.
Our team and the open-source community were also constantly building features comparable to those in the first versions of dbt Cloud.
So, up until recently, there was little reason for us to move. But don't get me wrong: Cloud needs a lot of these features to be a good tool in itself.
The problem was that many businesses venturing into the Modern Data Stack (MDS) were discovering numerous ways to use dbt to improve their data platform practices.
As a result, most teams became advanced dbt users, which, in my opinion, was not the main persona for Cloud.
But then, who is?
Meet dbt Cloud’s Ideal Users
I believe that there is not one, but three main personas for dbt Cloud.
User 1
First, there are the businesses born into the MDS that either don't have or don't want to maintain a large data team.
User 2
Then we have enterprise companies that want to scale their dbt Core implementation into their lines of business (LOBs).
They also need a tool that facilitates implementing best practices in data management and data governance while keeping the complexity low for less technical analytics LOB teams.
User 3
Last but not least, there are organizations relatively late in adopting a cloud data warehouse and just now migrating away from legacy data tech, such as Talend and Informatica.
Until now, it wasn't always compelling enough for some of these personas to adopt and implement dbt Cloud.
Why would that change?
In my opinion, the new announcements from dbt Labs at this year's Coalesce all point in the right direction.
The Game-Changing Updates from Coalesce 2024
First, dbt Labs has acknowledged that, to become the single tool for smaller organizations and companies without data platform teams, dbt must do more than transformation.
Capabilities like orchestration, data cataloging, and even data ingestion are all necessary, but they currently require a set of separate tools that can be expensive and hard to combine.
The vision of dbt becoming a data control plane is a sound one, and it goes hand in hand with the consolidation trend Indicium has seen in the Modern Data Stack space these past few years.
Combining dbt Core and Cloud
Arguably, the biggest announcement at Coalesce was the One dbt strategy.
First, there is real value in a hybrid approach of dbt Core and Cloud, with the former used by platform or center-of-excellence–style teams and the latter focused on less technical LOB teams.
A first-class experience for this combination in dbt Cloud is essential for many of our enterprise customers.
Addressing New Data Architecture Needs
Second, while most of Cloud’s advanced features had already been developed internally by dbt power users, this is not the case for hybrid cloud and data mesh architectures.
No single tool or platform can handle this increasingly common enterprise practice, even when the platforms involved, e.g., Databricks and Snowflake, run on the same cloud provider.
With Apache Iceberg becoming the de facto standard for modern data storage, there is a real opportunity for dbt to become the missing piece between those platforms.
This would allow teams to develop their tools without losing governance and best practices in data operations.
Because the long-standing tension between code-based and no-code/low-code data transformation remains, a low-code option is a must-have for less technically minded engineers and a very common requirement for enterprises.
Having this capability within dbt Cloud, integrated with the dbt development lifecycle, is a good move by the company.
dbt’s Key Role in the Modern Data Stack
I'm confident that dbt is the most ubiquitous tool of the modern data platform.
More than just a tool, it has allowed companies to close the gap between business and data with the rise of analytics engineering.
Ironically, dbt Cloud suffered from the strengths of the original product, dbt Core.
After all, while there were always companies where Cloud was the best fit, a large part of the market found it hard to identify where dbt Core was lacking.
With the new strategy and release announcements, Cloud is solving real technical and business needs that dbt Core did not meet.
I can see more and more use cases where Cloud provides a compelling advantage over running dbt Core.
Indicium: dbt’s Emerging Partner of the Year in the Americas
Indicium is a global data services company and a leader in the Modern Data Stack across the continent. Our mission is to help our customers modernize and scale for a data-driven future.
We provide end-to-end solutions—from strategy to execution—including data products, data platforms, consulting, and training.
Ready to revolutionize your data operations and scale your company?
Contact us and discover how we can leverage dbt’s potential to accelerate your company’s growth.
Daniel Avancini
Chief Data Officer