All Posts By

Lorenz Kindling

Lorenz Kindling is a BI Consultant at Scalefree specializing in Big Data and Data Science. A Certified Data Vault 2.0 Practitioner (CDVP2), he excels in DWH development, cloud solutions like Azure, and advanced data analytics. Lorenz combines technical expertise in SQL and Python with an agile, structured approach to modern data architecture.

Record Source for Links in Data Vault

Watch the Video

As part of our continuous Data Vault Friday series, our adept BI Consultant, Lorenz Kindling, delves into a thought-provoking question posed by a keen member of our audience.

“A Link refers to multiple Hubs, but we only have one Record_Source in the Link. Or the Link is loaded from more than one Source System. What do we use as a Record Source?”

In response to this intriguing query, Lorenz delves into the critical aspect of determining record sources for links within the Data Vault methodology. He shares insights into the best practices and considerations when dealing with scenarios where a Link is associated with multiple Hubs or loaded from various Source Systems.

Lorenz’s comprehensive analysis provides clarity on the nuanced decisions involved in selecting an appropriate Record Source, ensuring that the Data Vault model maintains accuracy and coherence. This discussion underscores Lorenz’s commitment to offering practical guidance to data professionals navigating the intricacies of link modeling.
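
The question above can be made concrete with a small sketch. It assumes one common Data Vault 2.0 convention: a Link is insert-only, so its Record Source is the source that first delivered the relationship, and later deliveries of the same relationship from other systems do not change it. This is an illustration under that assumption and not necessarily the answer given in the video. The names `hash_key`, `load_link`, `customer_bk`, and `order_bk` are hypothetical, and SHA-256 is an arbitrary choice of hash function.

```python
import hashlib


def hash_key(*business_keys: str) -> str:
    """Hash one or more normalized business keys into a single key."""
    normalized = "||".join(key.strip().upper() for key in business_keys)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def load_link(link: dict, staged_rows: list) -> dict:
    """Insert-only Link load: known relationships are left untouched."""
    for row in staged_rows:
        link_hk = hash_key(row["customer_bk"], row["order_bk"])
        if link_hk in link:
            # Relationship already known: keep the first Record Source.
            continue
        link[link_hk] = {
            "customer_hk": hash_key(row["customer_bk"]),
            "order_hk": hash_key(row["order_bk"]),
            "load_date": row["load_date"],
            "record_source": row["record_source"],
        }
    return link


link = {}
# First delivery of the relationship comes from a hypothetical CRM system.
load_link(link, [{"customer_bk": "C-1", "order_bk": "O-9",
                  "load_date": "2024-01-01", "record_source": "CRM"}])
# The same relationship arrives later from a hypothetical ERP system.
load_link(link, [{"customer_bk": "c-1 ", "order_bk": "O-9",
                  "load_date": "2024-01-02", "record_source": "ERP"}])
```

After both loads, the Link holds a single row for the relationship, and its Record Source still names the system that delivered it first. Which systems confirmed the relationship afterwards would have to be tracked elsewhere in the model.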

Agile Development with Data Vault 2.0

Watch the Video

In our continuous Data Vault Friday series, our seasoned BI Consultant, Lorenz Kindling, takes the spotlight to address a pertinent query posed by an engaged member of our audience.

“I have a problem with the business value not delivering. Is there a perfect solution?”

Lorenz, drawing from his wealth of experience and expertise, delves into the nuances of overcoming challenges related to the delivery of business value in the context of agile development. He shares insights and practical solutions to ensure that the delivery process aligns seamlessly with the intended business outcomes.

Lorenz’s thoughtful analysis provides valuable guidance for individuals navigating the complexities of agile development within the framework of Data Vault methodologies. This engaging discussion underscores his commitment to empowering data professionals with actionable insights and best practices.

Choosing the Right Tech Stack for an Open-Source Powered EDW

Open-Source Powered EDW

Choosing the right technology stack is a critical decision when building an open source powered Enterprise Data Warehouse (EDW). The technology stack consists of several components, including databases, automation tools, DevOps, infrastructure, and visualization, which work together to enable efficient data management, processing, and analysis.

In this blog article, we will dive deeper into the topic of selecting the right tech stack for an open source powered EDW. We will explore different aspects to consider, such as evaluating vendors, leveraging open source products, and understanding the key components of a robust tech stack. By the end of this article, you will have a better understanding of the factors to consider when selecting the right tech stack for your EDW.

Choosing the right Tech Stack for an Open Source powered EDW

Join our webinar as our expert dives into the process of selecting the tech stack for your open-source Enterprise Data Warehouse (EDW) project. Learn more about essential considerations such as evaluating vendors, leveraging open-source products, and understanding key components like databases, automation tools, DevOps, infrastructure, and visualization. Furthermore, discover the power of combining Data Vault 2.0 with an open-source tech stack and learn how it can empower your EDW project.

Watch Webinar Part 1
Watch Webinar Part 2

Evaluating Vendors and Leveraging Open-Source Products:

When embarking on the journey of building an open-source powered EDW, it is crucial to evaluate vendors and leverage open source products effectively. By choosing reputable vendors and open source solutions, you can ensure reliability, community support, and continuous development. Evaluating vendors involves assessing their expertise, reputation, and compatibility with your project requirements. Additionally, leveraging open source products provides flexibility, cost-effectiveness, and access to a vast community of contributors and developers.

Understanding the Key Components of a Robust Tech Stack:

A robust tech stack for an open source powered EDW comprises various components that work together to enable efficient data management and analysis. Here are some key components to consider:

Databases:

Choosing the appropriate database technology is vital for efficient data storage and retrieval. Consider options such as MongoDB, PostgreSQL, MySQL, or other databases that align with your project requirements.

Automation Tools:

Automation tools play a crucial role in the development process of an EDW. These tools greatly accelerate the development process, particularly in a Data Vault project. One example of an open source automation tool is dbt (data build tool), which can be combined with Scalefree’s self-developed package DataVault4dbt. These tools help streamline the development process and make the development team more efficient.

DevOps and Infrastructure:

Having a stable scheduler or a similar tool to load the data regularly from the sources into the Data Warehouse is important. Options such as Airflow can be considered for this purpose. Additionally, having a DevOps tool for project management is essential. These tools help structure the work and make the development team more efficient, especially when using agile methodologies like Scrum.

Visualization:

Effective data visualization is crucial for analyzing and understanding the data in an EDW. There are various open source visualization tools available, such as Grafana, Superset, or Metabase, which provide powerful capabilities for creating insightful visualizations and dashboards.

Why Data Vault 2.0 is a Powerful Choice in Combination with an Open Source Tech Stack:

Combining Data Vault 2.0 with an open source tech stack offers a powerful solution for building an efficient, scalable EDW. The agile concepts used in Data Vault make it easier to gradually build an open source tech stack over time, starting with basic needs and expanding as necessary.
Before committing to an open source automation tool, check that it is ready for Data Vault and that Data Vault templates are in place. Both enhance efficiency, streamline development, and ensure smooth integration in an open source powered EDW environment.

Benefits of an Open Source Powered EDW:

Building an open source powered EDW offers several advantages. Firstly, open source solutions often provide a vast community of developers, ensuring continuous support, updates, and improvements. Secondly, open source products can be customized and tailored to meet specific project requirements. This flexibility allows you to adapt the tech stack to your organization’s needs and scale as your data processing requirements grow. Lastly, open source solutions typically offer cost-effectiveness by eliminating or reducing licensing fees, making them an attractive option for organizations of all sizes.

Considerations for Scalability and Performance:

Scalability and performance are crucial factors to consider when selecting the right tech stack for an open source powered EDW. As your data processing needs grow, it’s important to choose a tech stack that can scale horizontally or vertically to handle increasing workloads. Technologies like Kubernetes can be considered for container orchestration and load balancing to ensure efficient utilization of resources and smooth scalability. Additionally, performance optimization techniques, such as caching mechanisms, data indexing, and query optimization, should be considered to ensure fast and efficient data retrieval and processing.

Security and Data Privacy:

When dealing with enterprise data, security and data privacy are of utmost importance. Ensure that the chosen tech stack incorporates robust security measures and follows best practices for data encryption, access control, and secure communication protocols. Regular security audits and updates are essential to address any vulnerabilities and ensure compliance with data privacy regulations.

Summary

Picking the right tech stack for an open source powered EDW is a crucial step in building an efficient and scalable BI system. By evaluating vendors, leveraging open source products, and understanding the key components of a robust tech stack, you can ensure a solid foundation for your EDW. Databases, automation tools, DevOps and infrastructure, and visualization choices play vital roles in creating an effective and customizable solution. Embracing open source solutions provides flexibility, community support, and cost-effectiveness, making it an ideal choice for organizations seeking efficient data processing and analysis capabilities. Considerations for scalability, performance, security, and data privacy are important to ensure the success of your EDW implementation.

In conclusion, the selection of a tech stack for an open source powered EDW requires careful consideration of various factors. It is essential to evaluate vendors, leverage open source products, and understand the key components that contribute to a robust tech stack. By making informed choices and aligning the tech stack with your project objectives, you can build a scalable and efficient EDW that empowers your organization to process and analyze data effectively.

If you are interested in learning more about the topic, watch the webinar recording for free.

Masking Business Keys from Hubs for Privacy in Data Vault

Watch the Video

In our continuous Data Vault Friday series, our experienced BI Consultant, Lorenz Kindling, delves into a pertinent question posed by an engaged member of our audience.

“How to mask business keys from Hubs in a GDPR-compliant way?”

Lorenz, with his wealth of expertise, provides insightful guidance on the crucial matter of masking business keys while ensuring compliance with the rigorous regulations outlined by GDPR. With data privacy and security at the forefront, he explores effective techniques and strategies to safeguard business keys within the Hub entities, striking the delicate balance between usability and GDPR adherence.

This informative discussion is a valuable resource for data professionals navigating the complexities of GDPR compliance within the Data Vault framework, offering practical solutions and best practices.
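
As an illustration of what masking a business key can look like, here is a minimal sketch of one generic technique: replacing the key with a keyed hash (HMAC-SHA-256). This is an assumption chosen for the example and not necessarily the approach recommended in the video. A keyed hash is a form of pseudonymization, so whether it meets GDPR requirements in a given case remains a legal question. The function name `mask_business_key` and the sample values are hypothetical.

```python
import hashlib
import hmac


def mask_business_key(business_key: str, secret: bytes) -> str:
    """Return a keyed hash of the normalized business key.

    The same key and secret always give the same masked value, so
    joins on the masked key still work. Without the secret, the
    original key cannot be recomputed from the masked value.
    """
    normalized = business_key.strip().upper().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.sha256).hexdigest()


# Hypothetical secret; in practice it would be kept outside the warehouse.
secret = b"example-secret-kept-outside-the-dwh"
masked = mask_business_key("customer-4711", secret)
```

The masked value is a 64-character hexadecimal string that can stand in for the original business key in the Hub, while the secret is stored and access-controlled separately.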

Utilizing Potentials of Data Vault 2.0 – Overcoming Bad Practices – Part 1

Watch the Webinar

What are common mistakes when applying Data Vault 2.0 in enterprise data warehouse projects? Do you have questions about Data Vault modeling? Is implementing GDPR requirements causing you great difficulty, or is your project stuck because it is delivering no business value?

This webinar describes common anti-patterns of Data Vault 2.0, their consequences, and the solutions for eliminating them from your current and future projects.

Tune in to learn how to avoid Data Vault 2.0 bad practices and apply simple solutions.

Watch Webinar Recording

Webinar Agenda

1. How to use Data Vault 2.0 for modeling business information
2. How to avoid the pitfalls of being unable to deliver business value
3. How to mask Business Keys from Hubs for privacy
