Modern Data Stack: a modern approach to data

Created: Aug 4, 2020
Updated: Jul 15, 2024

The Modern Data Stack (MDS) is a strategy for building modern infrastructures that help companies overcome data integration, organization and management challenges.

Companies of all sizes already understand the power of data and recognize its importance to the business, but many do not know how to overcome the resulting challenges of organizing, integrating and managing information.

This is where the Modern Data Stack (MDS), or modern data approach, comes in: a concept designed to modernize companies’ data infrastructure.

Organizations that want to grow and remain competitive need to invest in a robust data infrastructure, capable of managing large volumes of information. This can be done with the Modern Data Stack.

In this post, you will find a clear, concise explanation of this approach, which we use here at Indicium.

What the Modern Data Stack is

The Modern Data Stack (MDS) is a strategy for building modern infrastructures that help companies overcome data integration, organization and management challenges.

The MDS is the new combination of best practices and tools for creating data infrastructures.

One of its most striking features is the combination of several open-source tools to respond to the demands of a complex data infrastructure in a highly efficient way.

What does this mean in practice?

With a Modern Data Stack, it is possible to combine tools that perform different functions, such as integrating, storing or visualizing data, to create a modern, adaptable and more independent data infrastructure.

For example, consider a company that has drastically increased its customer base and needs to expand its data storage solution.

If it uses the MDS, it will have two options:

  1. adapt its current solution to the new demands.
  2. replace it with another tool that meets its needs, without having to completely redesign its data infrastructure.

In other words, with the Modern Data Stack, organizations have more flexibility to make specific adjustments and reinvent their structure without having to completely transform it. The result?

Lower costs, more scalability and more autonomy.
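The storage example above can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of any specific product: the `Warehouse` interface, the two stand-in classes and the `ingest_customers` step are all hypothetical names, and SQLite is used purely so the sketch runs locally. The point is that the pipeline step depends on the interface, so the storage tool can be replaced without touching the rest of the code.

```python
import json
import sqlite3
from typing import Protocol


class Warehouse(Protocol):
    """Hypothetical interface that every storage component must satisfy."""

    def load(self, table: str, rows: list[dict]) -> None: ...
    def count(self, table: str) -> int: ...


class InMemoryWarehouse:
    """Stand-in for the company's current storage solution."""

    def __init__(self) -> None:
        self.tables: dict[str, list[dict]] = {}

    def load(self, table: str, rows: list[dict]) -> None:
        self.tables.setdefault(table, []).extend(rows)

    def count(self, table: str) -> int:
        return len(self.tables.get(table, []))


class SqliteWarehouse:
    """Stand-in for a replacement tool with the same interface."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")

    def load(self, table: str, rows: list[dict]) -> None:
        # Table names are trusted in this sketch; a real tool would validate them.
        self.conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (payload TEXT)')
        self.conn.executemany(
            f'INSERT INTO "{table}" (payload) VALUES (?)',
            [(json.dumps(row),) for row in rows],
        )

    def count(self, table: str) -> int:
        return self.conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


def ingest_customers(warehouse: Warehouse, customers: list[dict]) -> int:
    """Pipeline step that depends only on the interface, not on a specific tool."""
    warehouse.load("customers", customers)
    return warehouse.count("customers")
```

Calling `ingest_customers(InMemoryWarehouse(), rows)` and `ingest_customers(SqliteWarehouse(), rows)` runs the same step against either component, which is the kind of targeted replacement described above.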

Today, thanks to new technologies and accessible tools, it is much easier to adopt the MDS.

However, to be successful in implementing these practices, you need to understand how all the pieces fit together.

Building a Modern Data Stack

An efficient data infrastructure combines several services into a data stack.

Overall, a data stack has three fundamental functions:

  1. collect and integrate data into a data warehouse (a “home” for the data).
  2. clean the data and transform it into information.
  3. add value to decision-making through intuitive visualizations, such as BI dashboards.

All of these functions are processes in a data pipeline (a flow through which data enter, are processed and transformed).

The tools used for each of these processes form the data stack. And, although the architecture of a pipeline varies according to each company, all data pipelines have these processes incorporated.
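To make the three functions concrete, here is a minimal sketch of a pipeline in plain Python. It is a toy under stated assumptions: the function names (`collect`, `transform`, `summarize`), the field names (`customer`, `amount`) and the cleaning rules are invented for illustration, and in a real stack each step would be handled by a dedicated tool rather than a function.

```python
from collections import defaultdict


def collect(sources: dict[str, list[dict]]) -> list[dict]:
    """Step 1: gather rows from every source into one raw table, tagging their origin."""
    return [{**row, "source": name} for name, rows in sources.items() for row in rows]


def transform(raw: list[dict]) -> list[dict]:
    """Step 2: clean the raw rows (drop missing amounts, normalize names and types)."""
    cleaned = []
    for row in raw:
        if row.get("amount") is None:
            continue  # skip rows that cannot be used in the analysis
        cleaned.append(
            {
                "customer": row["customer"].strip().lower(),
                "amount": float(row["amount"]),
            }
        )
    return cleaned


def summarize(clean: list[dict]) -> dict[str, float]:
    """Step 3: aggregate into the kind of table a BI dashboard would chart."""
    totals: dict[str, float] = defaultdict(float)
    for row in clean:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


def run_pipeline(sources: dict[str, list[dict]]) -> dict[str, float]:
    """Data enters, is processed and is transformed, in that order."""
    return summarize(transform(collect(sources)))
```

Each function only consumes the output of the previous one, which is what allows a single stage to be swapped for a different tool without rewriting the others.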

Figure: A Modern Data Stack (MDS).

To make the MDS more concrete, the sections below present, process by process, the main tools on the market, which have been used successfully in thousands of data projects of all sizes in Brazil and abroad.

1) Data collection and integration

Making data from multiple isolated sources available for analysis is one of the main challenges of data projects. To overcome it, you need to invest in data collection and integration.

Tools like Fivetran and Stitch Data are among the leaders in cloud data integration. They allow you to move data from hundreds of sources, such as ERPs, CRMs, databases and REST APIs, directly to a data warehouse (cloud-based or on-premises). They can also be combined.

Therefore, there is no need for large investments in software licenses or implementation hours.

Additionally, companies looking to collect more accurate data online and offline can also use Segment or Snowplow to get a complete view of their customers.

2) Data warehouse

Another fundamental step of the Modern Data Stack is the transformation of raw data into modeled data, which occurs within a data warehouse (DW).

Centralizing data transformations in the DW brings large efficiency gains to the project, especially with the ELT (extract, load, transform) approach. ELT makes the pipeline more flexible and gives business analysts the autonomy to define business rules directly in the DW, which can shorten the project by months.

In a data warehouse, the two main Modern Data Stack tools used for data transformation are dbt and Dataform.
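The ELT idea can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a local stand-in for a warehouse, which is an assumption made only so the example runs anywhere. The table, column and view names are hypothetical, and the SQL is a generic illustration of a business rule, not the syntax of dbt or Dataform. The raw data is loaded exactly as it arrived, and the rule is then applied inside the database.

```python
import sqlite3


def load_raw(conn: sqlite3.Connection, orders: list[tuple]) -> None:
    """Extract + Load: store the data as it arrived, without cleaning it first."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", orders)


# Transform: a business rule ("revenue counts only paid orders") written as SQL
# and executed inside the warehouse, after the data has been loaded.
REVENUE_BY_CUSTOMER = """
CREATE VIEW revenue_by_customer AS
SELECT LOWER(TRIM(customer)) AS customer,
       SUM(CAST(amount AS REAL)) AS revenue
FROM raw_orders
WHERE status = 'paid'
GROUP BY LOWER(TRIM(customer))
"""


def build_models(conn: sqlite3.Connection) -> None:
    """Create the modeled view on top of the raw table."""
    conn.execute(REVENUE_BY_CUSTOMER)
```

Because the raw table is preserved, an analyst can change the rule (for example, which statuses count as revenue) by editing the SQL alone, without re-extracting anything from the source systems.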

Another recent and essential innovation in this approach is the cloud DW, such as Amazon Redshift and Google BigQuery, whose scalable architecture allows you to quickly store and query huge volumes of data.

3) Business intelligence (BI)

Analytical intelligence is a priority in the Modern Data Stack.

With a modern data infrastructure in place, you can use different business intelligence tools to visualize, analyze and generate insights from data.

There are several robust alternatives for this: open-source tools such as Metabase, and SaaS platforms such as Microsoft Power BI, Looker and Tableau, among others.

Important: in the modern approach, BI is not an end in itself; it must quickly generate value for the company.

4) Machine learning

Machine learning, artificial intelligence and modeling are advanced analytics techniques applied to more complex analyses within the data stack.

To this end, in addition to the various libraries in the R and Python languages, tools such as MLflow and Kedro help in running predictive and prescriptive models and streamline the development process. They reduce the time between data modeling and use, the Achilles heel of any advanced analytics project.

5) Deployment

Tools such as Docker and Kubernetes are widely used for deployment, in conjunction with orchestrators such as Airflow and Prefect.

What sets these technologies apart is that all the “Lego pieces” work well together, ensuring that data flows smoothly throughout the data infrastructure.
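As a rough illustration of what an orchestrator contributes, the sketch below runs pipeline steps in dependency order using `graphlib` from the Python standard library (Python 3.9+). It does not use the Airflow or Prefect APIs; the `run_in_order` helper and the task names are hypothetical, and real orchestrators add scheduling, retries and monitoring on top of this basic idea.

```python
from graphlib import TopologicalSorter
from typing import Callable


def run_in_order(
    tasks: dict[str, Callable[[], None]],
    dependencies: dict[str, set[str]],
) -> list[str]:
    """Run every task after the tasks it depends on and return the execution order.

    `dependencies` maps a task name to the set of task names that must finish first.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order
```

For a pipeline where `transform` depends on `extract` and `publish` depends on `transform`, the helper runs the three steps in exactly that sequence, regardless of the order in which they were declared.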

Modern Data Stack for everyone

The Modern Data Stack (MDS) is the link between raw data and business intelligence, that is, it is an integrated system of applications that collects, combines, analyzes and realizes the value of data for companies.

Adopting the MDS is essential for modern companies that want to succeed in the data era.

Fortunately, data stack components are now much cheaper and simpler to configure and operate.

Thus, companies of all sizes can use the MDS to gain a competitive advantage and develop analytical maturity.

Would you like to learn more about the Modern Data Stack?

We have an e-book that covers all you need to know about this subject.

Understand how to optimize data operations in your company. Access your e-book here; it's free.

Tags: Modern Data Stack, Business intelligence, Data warehouse

Daniel Avancini

Chief Data Officer

Isabela Blasi

CBDO and co-founder at Indicium
