All Posts By

Michael Olschimke

Michael Olschimke is the Co-Founder and CEO of Scalefree and a "Data Vault 2.0 Pioneer" with over 20 years of IT experience. A Fulbright scholar and co-author of Building a Scalable Data Warehouse with Data Vault 2.0, Michael is a global authority on AI, Big Data, and scalable Lakehouse design across sectors like banking, automotive, and state security.

How to Get Rid of Data Vault Load End Date

Watch the Video

In our ongoing series, CEO Michael Olschimke addresses a question from the audience regarding the Load End Date in Data Vault:

“What are the options to virtualize the load end date in Data Vault? We work on an embedded solution and the Window function (LEAD/LAG) is not fast enough.”

The viewer asks how to virtualize the load end date in a Data Vault built on an embedded solution, where window functions such as LEAD/LAG are not fast enough, which prompts a look at alternative options.

Michael discusses strategies for managing load end dates in Data Vault and approaches to virtualizing them without the performance cost the viewer describes.

By sharing insights into the intricacies of load end date virtualization, Michael provides valuable guidance for organizations grappling with embedded solutions. The exploration of alternatives offers a nuanced perspective on optimizing this crucial aspect within the Data Vault framework.
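For readers unfamiliar with the technique in the question, the following is a minimal sketch of the window-function baseline the viewer describes, not the alternative Michael proposes in the video. It assumes a hypothetical satellite table `sat_customer` with columns `hub_customer_hk`, `load_date`, and `name`, uses SQLite (version 3.25 or later) through Python purely for illustration, and assumes the convention that a record's end date equals the next record's load date, with `9999-12-31` marking the current record.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sat_customer (
    hub_customer_hk TEXT NOT NULL,
    load_date       TEXT NOT NULL,
    name            TEXT,
    PRIMARY KEY (hub_customer_hk, load_date)
);
INSERT INTO sat_customer VALUES
    ('hk1', '2024-01-01', 'Alice'),
    ('hk1', '2024-02-01', 'Alice B.'),
    ('hk2', '2024-01-15', 'Bob');
""")

# The load end date is never stored (the satellite stays insert-only).
# It is derived at query time: each row ends when the next row for the
# same hash key was loaded; the latest row gets an assumed
# end-of-time value.
rows = conn.execute("""
SELECT hub_customer_hk,
       load_date,
       LEAD(load_date, 1, '9999-12-31') OVER (
           PARTITION BY hub_customer_hk
           ORDER BY load_date
       ) AS load_end_date
FROM sat_customer
ORDER BY hub_customer_hk, load_date
""").fetchall()

for row in rows:
    print(row)
```

Because the end date is computed on every read, the cost of the window function is paid at query time, which is the performance concern raised in the question.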

Column Store Compression vs Page Compression

Watch the Video

In our ongoing series, Michael Olschimke, our CEO, delves into a viewer’s question:

“During boot camp, Michael advises switching on page compression on the MS SQL server platform. Now when the platform supports column compression (via column store indexes) would you advise it instead? Do you have any project experience with it?”

The question came up during a boot camp session: given the recommendation to enable page compression on MS SQL Server, would Michael now advise column compression via column store indexes instead? The viewer is particularly interested in Michael’s project experience with this choice.

In response, Michael shares his expertise on the nuances of compression techniques in the context of MS SQL Server. He discusses the merits of both page compression and column store indexes, offering practical insights into their application based on project experiences. This session provides valuable guidance for optimizing storage and performance in the MS SQL Server platform, exploring the trade-offs and benefits of each compression approach.

For those navigating the complexities of compression choices within their SQL Server environment, Michael’s knowledge in this episode offers clarity and actionable recommendations, making it a valuable resource for Data Vault practitioners.

Capturing CDC Flags in Data Vault

Watch the Video

In our ongoing series, Michael Olschimke, our CEO, engages with a viewer’s question:

“If you get CDC data and insert/update/delete flag, then would you split it to separate satellite table or leave it on regular sat?”

The specific query revolves around handling CDC (Change Data Capture) data, particularly when faced with insert/update/delete flags. The viewer seeks guidance on whether to split this data into separate satellite tables or retain it within a regular satellite.

In response, Michael addresses the intricacies of managing CDC flags within the Data Vault framework. He explores the pros and cons of both approaches, providing valuable insights into the considerations that influence the decision-making process. This session offers practical advice on structuring satellite tables to effectively capture and utilize CDC flags in a Data Vault environment.

For those grappling with the nuances of integrating CDC data seamlessly into their Data Vault setup, Michael’s expertise provides clarity and actionable recommendations, making this episode of Data Vault Insights a valuable resource.

Loading CDC Data Into the Raw Data Vault

Watch the Video

In our recurring Data Vault Friday series, Michael Olschimke, our CEO, addresses a thought-provoking question about integrating CDC data into the Raw Vault.

“A record from change data capture has a column with the type of operation (CREATE / DELETE / UPDATE) and multiple timestamp columns and it is inserted in the staging area. The record is going to be loaded in append-only (insert-only) in the Raw Data Vault. Are there attention points between CDC and Raw Vault to pay attention compared to a traditional loading of files in the staging area?”

Michael explains how loading CDC data into the Raw Vault differs from the conventional loading of files in the staging area, and highlights the points to watch during the append-only (insert-only) loading process.

For those grappling with the challenges of integrating CDC data seamlessly into the Raw Vault, Michael’s insights in this Data Vault Friday session provide valuable guidance and actionable tips.

Modelling Timesheet Data in Data Vault 2.0

Watch the Video

In the continued exploration of Data Vault concepts in our Data Vault Friday series, our CEO, Michael Olschimke, addresses a pertinent question from the audience concerning the integration of timesheet information into an internal Data Warehouse that already encompasses a Raw Vault for Resources, Projects, and Allocations.

“We are working on our internal Data Warehouse. We already have a Raw Vault for Resources (people in the organization), Projects, and Allocations (plans), but now we need to add the “timesheet” information. How should we model this, including the need to track changes?”

He provides actionable insights into structuring the Data Vault to accommodate timesheet data seamlessly, ensuring that historical changes are captured and enabling a robust system for tracking and managing timesheet-related information.

If you’re grappling with similar considerations in enhancing your internal Data Warehouse, Michael’s expertise in this Data Vault Friday session sheds light on best practices for modeling timesheet data within the broader Data Vault 2.0 architecture.

Data Mesh & Data Vault: Raw Data vs. Information

Watch the Video

In the ongoing Data Vault Friday series, our CEO, Michael Olschimke, addresses a pertinent question raised by the audience, focusing on the intersection of Data Mesh and Data Vault methodologies.

“In a data mesh/federation approach, can you share the raw data vault or the business vault across the enterprise or does the share only happen in the information mart? I believe it may happen some raw vault tables of a data domain to be joined with a raw vault of another data domain.”

Michael delves into the intricacies of Data Mesh and Data Vault methodologies, shedding light on the dynamics of sharing data across different domains within an enterprise. He provides insights into the scenarios where sharing may occur, discussing the potential of joining Raw Vault tables from disparate data domains. This elucidation aims to guide practitioners in navigating the nuances of data sharing in a federated environment.

Reference Hubs and Effectivity Satellites in Data Vault

Watch the Video

In the ongoing Data Vault Friday series, our CEO, Michael Olschimke, delves into two insightful questions raised by our audience about Reference Hubs and Effectivity Satellites.

“Should the Satellite hang from the Master Hub or Link? Our preference was to hang from the Hub as it kept the model simple and also to keep the CDC straightforward. Another reason we are leaning toward this is, that the descriptive columns belong to the Master Hub so it would be ideal to keep it as a satellite under the hub.”

“Now with the link, if there is any change in the relationship between the Master Hub Column to the Reference Hub Column in the link we would like to capture it. And it is via effectivity satellites. If we have One Master Hub column and a lot of the Reference Hubs columns do we end up in a lot of Effectivity Satellites? Or just one effectivity satellite as the driving key is the Master Hub column?”

Michael provides insights into these considerations regarding Reference Hubs and Effectivity Satellites, offering guidance on the optimal structuring of Satellites and managing Effectivity in Link relationships.

Natural Key vs. Technical ID in Data Vault

Watch the Video

In our ongoing exploration of Data Vault concepts during the Data Vault Friday series, our CEO, Michael Olschimke, takes on a thought-provoking question from a member of our audience regarding natural keys.

“Can a work order ID or recommendation ID be considered a natural business key even though there is no business ‘meaning’ to these ids? or should I use the text description of the recommendation as the business key? The work order would have to be a composite key made up of details and a date for the work order which at that point seems like just using the ID makes more sense.”

Michael addresses the distinction between a natural key and a technical ID in the context of work order and recommendation IDs. The discussion helps in choosing business keys when the identifiers lack inherent business meaning.

HL7 FHIR resources in Data Vault

Watch the Video

In our ongoing exploration of Data Vault concepts during the Data Vault Friday series, our CEO, Michael Olschimke, takes on a compelling question posed by a member of our audience.

“How would you model data that is transmitted as HL7 FHIR resources in Data Vault?”

Michael delves into the intricacies of handling HL7 FHIR resources within the Data Vault framework. This session is a valuable resource for those seeking insights into the effective modeling of healthcare data encoded in HL7 FHIR resources.

Business Satellites in Data Vault

Watch the Video

In the latest segment of our Data Vault Friday series, our esteemed CEO, Michael Olschimke, delves into a question brought forward by a member of our audience.

“What are business satellites?”

Michael passionately explores the concept of business satellites in a dedicated session, offering in-depth insights and valuable perspectives on their significance within the Data Vault methodology. This engaging discussion is geared towards enhancing your understanding of business satellites and their role in the broader context of Data Vault architecture.

Data Vault Modeling Styles

Watch the Video

As part of our engaging Data Vault Friday series, our CEO, Michael Olschimke, addresses a pertinent question raised by an audience member.

“What’s your view on other Data Vault philosophies? Some of my colleagues received training in such modeling styles, but the philosophy contains some substantial differences related to the CDVP2 certification.”

In this insightful video, Michael shares his perspectives on various Data Vault philosophies beyond the CDVP2 (Certified Data Vault Practitioner Level 2) certification. Drawing from his extensive experience and expertise, he navigates through the nuances of different modeling styles within the Data Vault framework.

Point in Time vs. Record Source Tracking in Data Vault

Watch the Video

In the latest installment of our informative Data Vault Friday series, our CEO, Michael Olschimke, takes on a thought-provoking question posed by a member of our engaged audience.

“How are record source tracking satellites used in a Data Vault, and if it is not used in a project, then how can PIT tables come into play in tracking the customers or business keys? And if we are taking the PITs for tracking the keys, does this mean we should take the daily snapshots of the data?”

Michael explains how record source tracking satellites are used in Data Vault and how PIT (Point-in-Time) tables can help track customers or business keys in projects where record source tracking satellites are not used.

He also addresses whether using PIT tables requires taking daily snapshots of the data.
