Analytics engineering: the path to building a BI project
Evolving means going from something simpler to something more complex, developed, perfected. That's exactly what happened with tools and systems for business intelligence (BI) in recent years, right?
So much so that we are now seeing a growing need for a new role in data science: the analytics engineer.
These new professionals are responsible for new functions, such as the step-by-step process of building a successful BI project. And that's what we'll show you in this article.
And if you want to know even more, take advantage of the Indicium Academy content: everything we know and do in data science, from end to end, in one place.
Happy reading!
The step-by-step process of a BI project
The road to successful BI is built on analytics engineering techniques and good practices. To help you, Indicium Academy has organized this journey into 6 steps:
- mapping of the past
- integration of data sources
- data transformation
- testing
- documentation
- visualization
By following this path, you will be able to create BI projects that clients outside data science can read and use to make better data-driven decisions.
Ready? Then let's go! 🧐
1st step: mapping the past
The starting point is the past, that is, the entire history up to the moment of analysis. When mapping data for extraction, analytics engineers define the sources of information, as well as the tables and datasets needed for the BI project.
The information is then modeled in a database. Modeling helps preserve data integrity and reduce redundancy. It also helps analysts know where and how the information to be analyzed is stored.
Data models define how this information will be organized and which relationships are established between its elements. The main types of modeling are:
- conceptual modeling: concerned with creating a model of the real world and creating a common vocabulary for all data users.
- logical modeling: adds implementation details with consistent data and no redundancy.
- physical modeling: demonstrates how data is physically stored.
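As a minimal sketch of how a logical model becomes a physical one, the example below uses Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the helper names are hypothetical, chosen only to illustrate how keys and constraints protect integrity.

```python
import sqlite3

# Physical model for a hypothetical logical model: one customer has many orders.
DDL = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL CHECK (amount >= 0)
);
"""

def build_model() -> sqlite3.Connection:
    """Create an in-memory database holding the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    return conn

def try_insert_order(conn, order_id, customer_id, amount) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer_id, amount)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this model, an order that points to a customer who does not exist, or that has a negative amount, is rejected by the database itself rather than by application code.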
Then, with the data mapped, collected and modeled, it is time to integrate the sources of this information.
2nd step: integrating data sources
This step integrates, organizes and centralizes data from different sources in a single location, such as a data warehouse. To achieve this, Indicium Academy teaches you how to use ELT (extract, load, transform) applications, which work in the following steps:
- extract raw data from one or more sources and save it in a single data repository;
- load the data into a data warehouse for use in a BI tool;
- transform data through structuring, enriching, cleaning, and converting it for use.
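The three ELT steps above can be sketched as follows. This is an illustration only: an in-memory SQLite database stands in for the data warehouse, and `SOURCE_ROWS`, the table names and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical raw records, standing in for an API or a source database.
SOURCE_ROWS = [
    {"id": 1, "customer": " Ana ", "amount": "120.50"},
    {"id": 2, "customer": "Bruno", "amount": "80.00"},
    {"id": 3, "customer": "ana", "amount": None},  # incomplete record
]

def extract():
    """Extract: read the raw data from the source without changing it."""
    return list(SOURCE_ROWS)

def load(conn, rows):
    """Load: store the raw rows, still untransformed, in the warehouse."""
    conn.execute("CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount TEXT)")
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:id, :customer, :amount)", rows
    )

def transform(conn):
    """Transform: clean and convert the data inside the warehouse, using SQL."""
    conn.execute("""
        CREATE TABLE orders AS
        SELECT id,
               lower(trim(customer)) AS customer,
               CAST(amount AS REAL)  AS amount
        FROM raw_orders
        WHERE amount IS NOT NULL
    """)

def run_elt():
    """Run the three steps in order and return the transformed rows."""
    conn = sqlite3.connect(":memory:")
    load(conn, extract())
    transform(conn)
    return conn.execute(
        "SELECT id, customer, amount FROM orders ORDER BY id"
    ).fetchall()
```

Calling `run_elt()` returns only the two complete orders, with customer names trimmed and lowercased. The raw table is kept untouched, so the transformation can be rerun or changed later without extracting again.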
The extraction and loading (EL) steps are fundamental for integration. There are also important factors that must be taken into consideration during this process, such as:
- source types (SQL database, NoSQL, SaaS API, etc.)
- access types (database mirror, API, report, etc.)
- environment (cloud, on-premises, etc.)
- frequency
- data volume
- types of processing (full, incremental)
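To illustrate the difference between full and incremental processing, here is a minimal sketch. It assumes each source row carries an `updated_at` timestamp and that the pipeline stores a "watermark" with the last change it has already loaded; both are assumptions for this example, not requirements of any specific tool.

```python
from datetime import datetime

# Hypothetical source rows; "updated_at" is an assumed change-tracking column.
SOURCE = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

def extract_full(rows):
    """Full processing: return every row on every run."""
    return list(rows)

def extract_incremental(rows, watermark):
    """Incremental processing: return only rows changed after the watermark."""
    return [r for r in rows if r["updated_at"] > watermark]

def next_watermark(rows, current):
    """Advance the watermark to the newest change seen, or keep the current one."""
    return max((r["updated_at"] for r in rows), default=current)
```

Full extraction is simpler and suits small tables. Incremental extraction moves less data on each run, which matters as volume and frequency grow, but it depends on the source exposing a reliable change marker.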
For those who work in analytics engineering, it is important to understand that data is dynamic and constantly changing. Backup copies are therefore essential as a safeguard, as are testing and documentation, as we will see later.
3rd step: transforming historical data
With the raw data loaded into a data warehouse (DW), it is transformed to make it ready for use. The next step is to normalize the data into tables (relations) and columns (attributes), ensuring that integrity constraints are respected.
During the transformation process, the data that makes sense for the analysis is selected, keeping the attributes that generate value. For this process, at Indicium Academy, we suggest using data transformation tools such as dbt.
This type of tool helps you carry out the most common transformations needed for your BI project. In fact, it is in partnership with dbt that Indicium carries out its more complete projects.
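The normalization described in this step can be sketched in plain Python. This is not how a transformation tool does it internally; it only shows the idea of splitting a flat extract into two related tables. The column names and the `normalize` function are hypothetical.

```python
def normalize(flat_rows):
    """Split flat order rows into a customers table and an orders table."""
    customers = {}  # keyed by email so each customer is stored only once
    orders = []
    for row in flat_rows:
        email = row["customer_email"]
        if email not in customers:
            customers[email] = {
                "customer_id": len(customers) + 1,  # surrogate key
                "email": email,
                "name": row["customer_name"],
            }
        orders.append({
            "order_id": row["order_id"],
            "customer_id": customers[email]["customer_id"],  # foreign key
            "amount": row["amount"],
        })
    return list(customers.values()), orders

# Hypothetical flat extract: customer details repeat on every order row.
flat = [
    {"order_id": 10, "customer_email": "ana@example.com",
     "customer_name": "Ana", "amount": 120.5},
    {"order_id": 11, "customer_email": "ana@example.com",
     "customer_name": "Ana", "amount": 30.0},
    {"order_id": 12, "customer_email": "bruno@example.com",
     "customer_name": "Bruno", "amount": 80.0},
]
```

After `normalize(flat)`, each customer appears once in the customers table and every order refers to it through `customer_id`, which removes the repeated name and email from the order rows.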
4th step: 1, 2... testing!
After transforming the data, you need to test it to ensure that unwanted information was not selected. Testing is a fundamental part of the project and cannot be left out.
In the testing process, a staging model is configured first; it is responsible for applying all changes to the other models. Next, analytics engineers configure the dimension tables and fact tables, which store single-item data and historical data, respectively.
Only then are the actual data tests carried out.
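As a rough illustration of what such data tests check, the sketch below implements three common ones in plain Python: no missing values, no duplicates, and every key in a fact table matching a row in a dimension table. They loosely mirror the `not_null`, `unique` and `relationships` tests that dbt provides, but the functions and sample tables here are hypothetical and are not dbt's implementation.

```python
from collections import Counter

def check_not_null(rows, column):
    """Return the rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]

def check_unique(rows, column):
    """Return the non-null values that appear more than once in the column."""
    counts = Counter(r[column] for r in rows if r.get(column) is not None)
    return sorted(v for v, n in counts.items() if n > 1)

def check_relationships(fact_rows, fk, dim_rows, pk):
    """Return fact rows whose foreign key has no match in the dimension table."""
    known = {r[pk] for r in dim_rows}
    return [r for r in fact_rows if r[fk] not in known]

# Hypothetical dimension and fact tables with deliberate problems.
dim_customers = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 2}]
fact_orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},    # unknown customer
    {"order_id": None, "customer_id": 2},  # missing key
]
```

Each check returns the offending rows or values, so an empty result means the test passed. Returning the failures instead of a simple pass or fail makes it easier to trace a problem back to its source.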
5th step: documenting the data
To make the data easier to query in the future, analytics engineers need to document the project. An important detail is to add a complete description of each file, which makes searching simpler.
By creating a documentation page, these professionals help you search for essential information such as:
- data source;
- tables and their descriptions;
- size and number of rows;
- columns and their tests.
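One simple way to keep this information searchable is to store it as structured metadata and render the documentation page from it. The sketch below assumes a hypothetical metadata layout; real tools generate similar pages from their own configuration files.

```python
def render_docs(tables):
    """Render a plain-text documentation page from table metadata."""
    lines = []
    for t in tables:
        lines.append(
            f"Table: {t['name']} (source: {t['source']}, rows: {t['row_count']})"
        )
        lines.append(f"  {t['description']}")
        for col in t["columns"]:
            tests = ", ".join(col["tests"]) or "none"
            lines.append(f"  - {col['name']}: {col['description']} [tests: {tests}]")
    return "\n".join(lines)

# Hypothetical metadata for a single table.
TABLES = [
    {
        "name": "orders",
        "source": "sales database",
        "row_count": 1200,
        "description": "One row per order placed by a customer.",
        "columns": [
            {"name": "order_id", "description": "Order identifier",
             "tests": ["unique", "not_null"]},
            {"name": "amount", "description": "Order total", "tests": []},
        ],
    }
]
```

Calling `print(render_docs(TABLES))` produces one header line per table, followed by its description and one line per column with the tests applied to it.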
Once the data is tested and documented, the core development of the BI project begins.
6th step: visualizing the data
Now that you have the data mapped, integrated, transformed, tested and documented, it's finally time to build your BI reports.
People who work in analytics engineering must know how to tell a story through their business intelligence reports, typically by answering questions like:
- What is the current status of the company?
- How did we get to this situation?
- How can we solve these problems?
- What will be the future of the business?
To do this, the DataViz steps are followed, starting with the context, which is when the target audience, the type of analysis and the indicators are defined. Once this is established, the presentation format is chosen, taking into account the flexibility and complexity of the project.
There are good DataViz practices that help in building and presenting the report. They bring clarity and make the charts as readable as possible for end users, who may not work in the data area.
The cohesion of the project is another fundamental element in making the charts easier to read. One way to guarantee it is to establish the function and order of each BI item and maintain them until the end of the presentation.
Visual elements must be selected according to the type of data being analyzed. There are several types and categories of charts available, such as column, bar, pie, line and network charts, each with a specific purpose.
Finally, pay attention to the hierarchy and positioning of elements, as these are the practices that guide the reading of the dashboard.
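As a small illustration of choosing a visual element from the type of analysis, the sketch below maps a few analysis types to the chart types mentioned above. The mapping is a common rule of thumb assumed for this example, and the labels and function name are hypothetical; it is not a fixed rule.

```python
# Assumed rule-of-thumb mapping; the analysis-type labels are hypothetical.
CHART_FOR_ANALYSIS = {
    "comparison": "column or bar",
    "trend over time": "line",
    "part of a whole": "pie",
    "relationships": "network",
}

def suggest_chart(analysis_type):
    """Suggest a chart type for an analysis type, ignoring case and spacing."""
    key = analysis_type.strip().lower()
    return CHART_FOR_ANALYSIS.get(key, "no suggestion: define the analysis type first")
```

For example, `suggest_chart("Trend over time")` suggests a line chart, while an unknown type returns a reminder to define the context before picking a format.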
But of course that's not all. You can see more data visualization tips in this hands-on tutorial from Indicium Academy, as well as in other tutorials available on our YouTube channel.
Keep learning with Indicium Academy
The world of data science is constantly evolving, so it is important to always keep up to date. Indicium Academy offers an excellent opportunity for those looking to learn in the data area.
Keep following and learning from our content, as we will soon open new classes.
Take the opportunity to subscribe to our newsletter and stay up to date with the main news and everything that happens in the world of data science.
If you liked the content, leave your feedback in the comments and share!
Bianca Santos
Writer