Sample Page – Scalefree

Business Satellites in Data Vault

Watch the Video

In the latest segment of our Data Vault Friday series, our esteemed CEO, Michael Olschimke, delves into a question brought forward by a member of our audience.

“What are business satellites?”

Michael passionately explores the concept of business satellites in a dedicated session, offering in-depth insights and valuable perspectives on their significance within the Data Vault methodology. This engaging discussion is geared towards enhancing your understanding of business satellites and their role in the broader context of Data Vault architecture.

Marc Winkelmann In Data Vault Friday

Data Vault Referential Integrity

Watch the Video

In the latest installment of our Data Vault Friday series, our accomplished Managing Consultant, Marc Finger, addresses a pertinent question posed by a member of our audience.

“How is referential integrity handled in DV 2.0?”

Marc provides comprehensive insights into the intricacies of maintaining referential integrity within the Data Vault 2.0 framework. He explores the nuances and best practices associated with ensuring a robust and reliable structure that upholds the integrity of relationships within the Data Vault architecture.

Dmytro Polishchuk In Data Warehouse, Intermediate

Technical Tests of a Data Vault Powered EDW

Data Vault Powered EDW

In this newsletter, we give an overview of different methods and approaches for performing technical tests of a Data Vault-powered EDW.

The testing approaches described below aim to ensure the integrity, reliability, accuracy, consistency, and auditability of the data loaded into your Data Vault entities as well as the information marts built on top of them, so that it is safe for your business to make decisions based on that data.

Technical Tests and Monitoring of a Data Vault powered EDW

In this webinar, our experts give you an overview of different methods and approaches for technical testing and monitoring of a Data Vault powered EDW. The testing approaches discussed are suitable for different layers of your EDW solution, starting from extracting data from sources to the landing zone/staging area (Extract and Load) and ending with information marts used by end users in their BI reports. The main focus of the webinar, though, is testing the Data Vault 2.0 entities in the Raw Vault and Business Vault layers. The monitoring part focuses on providing insights into the performance of your EDW. Starting with the modeling approach of the metrics vault and metrics marts, it covers the source data for these entities. The captured data provides information about the execution of your ELT processes, as well as error information. By inspecting the error marts, you can track errors and find their root cause, and you can boost performance by taking performance metrics into account.

Watch Webinar Part 1 Watch Webinar Part 2

In this article:

What to expect
Testing Data Extraction Process
Testing Data Vault
Test Automation and Continuous Integration
Conclusion

What to expect

You will receive an overview of testing approaches suitable for different layers of your EDW solution, starting from extracting data from sources to the landing zone/staging area (Extract and Load) and ending with information marts used by end users in their BI reports. Additionally, we will discuss test automation and its importance for continuous integration of your Data Vault-based EDW. That said, the main focus of this newsletter is testing the Data Vault entities in the Raw Vault and Business Vault layers.

Testing Data Extraction Process

Regardless of where the data extraction takes place – data source, persistent staging, transient staging – the main goal of testing at this phase is to prove that there is no leakage while transporting or staging the data. Comparing the input data and the target data ensures that the data has not been accidentally or maliciously erased, added, or modified due to any issues in the extraction process.
Checksums, hash totals, and record counts shall be used to ensure the data has not been modified:

Ensure that checksums over the source dataset and the target staging table are equal
Ensure that the numerical sum of one or more fields in a source dataset (aka hash total) matches the sum of the respective columns in the target table. Such sum may include data not normally used in calculations (e.g., numeric ID values, account numbers, etc.)
Make sure the row count between the source and the target staging table matches
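
The three checks above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the rows are held in memory as dictionaries, the column names (account_id, amount) are made up, and the checksum is built so that it does not depend on row order, which is an assumption you may not want if row order matters in your extract.

```python
import hashlib

def hash_total(rows, column):
    """Numerical sum of one column (the 'hash total')."""
    return sum(row[column] for row in rows)

def dataset_checksum(rows, columns):
    """Order-independent checksum: hash every row, then hash the sorted row digests."""
    digests = sorted(
        hashlib.sha256("|".join(str(row[c]) for c in columns).encode("utf-8")).hexdigest()
        for row in rows
    )
    return hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()

def compare_extract(source_rows, target_rows, columns, total_column):
    """Run row count, hash total, and checksum comparisons; True means the check passed."""
    return {
        "row_count": len(source_rows) == len(target_rows),
        "hash_total": hash_total(source_rows, total_column) == hash_total(target_rows, total_column),
        "checksum": dataset_checksum(source_rows, columns) == dataset_checksum(target_rows, columns),
    }

# Hypothetical sample data: the staged copy has the same rows in a different order.
source = [{"account_id": 1001, "amount": 250}, {"account_id": 1002, "amount": 75}]
staged = [{"account_id": 1002, "amount": 75}, {"account_id": 1001, "amount": 250}]

result = compare_extract(source, staged, ["account_id", "amount"], "account_id")
print(result)
```

Note that the hash total here is taken over account_id, a column not normally used in calculations. A row count or hash total alone can miss a modified value, which is why the checksum is compared as well.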

Testing Data Vault

The core part of your Data Vault-powered EDW solution is a Raw Data Vault which contains raw and unfiltered data from your source systems that has been broken apart and loaded into hubs, links, satellites, and other Data Vault-specific entities based on business keys. This is the first point in the data pipeline in which the data lands in the Data Vault-modeled entities. Thus, specific tests are required to ensure consistency and auditability of the data after the Raw Data Vault is populated. The test approaches below are valid for Business Vault entities as well.

Testing Hubs

Hubs store business keys by separating them from the rest of the model. A hub is created for every business object. It contains a unique list of keys representing a business object that have the same semantic meaning and granularity. The business objects residing in a hub are then referenced from other Data Vault entities through hash keys calculated during the staging phase.

As such, the following tests should be performed on hubs to ensure their consistency:
For a hub with a single business key, tests should ensure that:

A hub contains a unique list of business keys (primary key (PK) test)
A business key column contains no NULL or empty values (except when business key is composite)

If a hub has a composite business key, ensure that:

The combination of the values in the business key columns is unique (PK test)
Business key columns don’t contain NULL or empty values all at once

The validity of the latter point also depends on the nature of the business object itself. It can also be that NULLs or empty values are not allowed in any of the business key columns.

For both kinds of hubs, ensure that:

Hash key column contains:
- Unique list of values (PK test)
- No NULLs or empty values
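
As a rough sketch, the hub tests for a single business key could be expressed as SQL checks run from Python. The example uses an in-memory SQLite database purely for illustration, and the table and column names (hub_customer, hk_customer, customer_no) are hypothetical. Each check returns a count of violations, so a hub passes when every count is zero.

```python
import sqlite3

def hub_test_failures(conn, hub, hash_key, business_key):
    """Count violations per hub test; all zeros means the hub passes."""
    def duplicates(column):
        # Number of key values that occur more than once (PK test).
        return (f"SELECT COUNT(*) FROM (SELECT {column} FROM {hub} "
                f"GROUP BY {column} HAVING COUNT(*) > 1)")

    def null_or_empty(column):
        # Number of rows where the column is NULL or an empty string.
        return (f"SELECT COUNT(*) FROM {hub} "
                f"WHERE {column} IS NULL OR TRIM({column}) = ''")

    checks = {
        "duplicate_hash_keys": duplicates(hash_key),
        "duplicate_business_keys": duplicates(business_key),
        "null_or_empty_hash_keys": null_or_empty(hash_key),
        "null_or_empty_business_keys": null_or_empty(business_key),
    }
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}

# Hypothetical hub with one duplicated and one empty business key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hub_customer (hk_customer TEXT, customer_no TEXT)")
conn.executemany(
    "INSERT INTO hub_customer VALUES (?, ?)",
    [("a1", "C-1"), ("b2", "C-2"), ("c3", "C-2"), ("d4", "")],
)

failures = hub_test_failures(conn, "hub_customer", "hk_customer", "customer_no")
print(failures)
```

For a composite business key, the same duplicate check would be applied to the combination of business key columns instead of a single column.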

Testing Links

A typical link defines relationships between business objects by storing unique combinations of hash keys of the connected hubs. The primary key of the link or link hash key uniquely identifies such a combination. Thus, link tests should check that:

The combination of connected hub references (hub hash keys) is unique (PK test)
Every hub hash key value exists in the referenced hub
Hub references do not contain NULLs or empty values

Regarding the last bullet point, note that the NULLs and empty values in hub references, as well as in hash key columns of other Data Vault entities, are replaced with zero keys.
For transactional (non-historized) data, transactional key columns should be included in the uniqueness tests in addition to columns with hub hash keys. Make sure that transactional keys are populated as well. Such transactional keys are usually not hashed since, as a rule, no hubs for transactions are created.
And, as for hubs, make sure that the link hash key column contains unique values and there are no NULLs and empty values.
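
The link tests can be sketched in the same style. The following example again uses an in-memory SQLite database with hypothetical names (hub_customer, hub_order, link_customer_order). It checks that every hub reference in the link exists in the referenced hub and that the combination of hub references is unique.

```python
import sqlite3

def orphaned_references(conn, child, child_column, parent, parent_column):
    """Count child rows whose reference has no matching row in the parent."""
    sql = (f"SELECT COUNT(*) FROM {child} c "
           f"LEFT JOIN {parent} p ON c.{child_column} = p.{parent_column} "
           f"WHERE p.{parent_column} IS NULL")
    return conn.execute(sql).fetchone()[0]

def duplicate_combinations(conn, table, columns):
    """Count column-value combinations that occur more than once."""
    cols = ", ".join(columns)
    sql = (f"SELECT COUNT(*) FROM (SELECT {cols} FROM {table} "
           f"GROUP BY {cols} HAVING COUNT(*) > 1)")
    return conn.execute(sql).fetchone()[0]

# Hypothetical model: the link holds one orphaned customer reference ('c9')
# and one duplicated combination ('c1', 'o1').
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (hk_customer TEXT, customer_no TEXT);
CREATE TABLE hub_order (hk_order TEXT, order_no TEXT);
CREATE TABLE link_customer_order (hk_customer_order TEXT, hk_customer TEXT, hk_order TEXT);
INSERT INTO hub_customer VALUES ('c1', 'C-1'), ('c2', 'C-2');
INSERT INTO hub_order VALUES ('o1', 'O-1'), ('o2', 'O-2');
INSERT INTO link_customer_order VALUES
    ('l1', 'c1', 'o1'), ('l2', 'c1', 'o2'), ('l3', 'c9', 'o2'), ('l4', 'c1', 'o1');
""")

orphan_customers = orphaned_references(conn, "link_customer_order", "hk_customer", "hub_customer", "hk_customer")
orphan_orders = orphaned_references(conn, "link_customer_order", "hk_order", "hub_order", "hk_order")
duplicate_links = duplicate_combinations(conn, "link_customer_order", ["hk_customer", "hk_order"])
print(orphan_customers, orphan_orders, duplicate_links)
```

In this sketch a NULL reference is also counted as orphaned, because it cannot match any hub row. For a non-historized link, the transactional key columns would be appended to the column list passed to duplicate_combinations.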

Testing Satellites

Satellites store descriptive information (attributes) for business objects (residing in hubs) or relationships between business objects (residing in links). One satellite references either one hub or one link. Since descriptive information for business objects and relationships between them may change over time, a load date timestamp of a satellite record is added to the primary key of a satellite.

With the above said, tests for a satellite should make sure that:

The combination of a hub/link reference (the Hash Key) and the load date timestamp of a record is unique (PK test)
Every hub or link hash key value exists in the referenced hub or link
Hub or link references do not contain NULL or empty values

Multi-active satellites contain multiple active records at the same time. Thus, additional key columns (for example, Type Code, Sequence, etc.) are needed to uniquely identify a record. These additional key columns have to be part of the unique test of a multi-active satellite. Additionally, they should be tested for the absence of NULL and empty values.
The approach for testing a non-historized satellite also differs a bit from testing its standard sibling. A non-historized satellite is a special entity type that contains descriptive attributes for every corresponding record in a non-historized link. The primary key of a non-historized satellite is the link hash key. Thus, there is no need to include a load date timestamp in the primary key check. For a non-historized satellite, additionally make sure that it has a 1:1 relationship with the corresponding non-historized link. Record counts in both entities should match exactly.
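
To illustrate how the primary key test changes with the satellite type, here is a small sketch with a hypothetical multi-active satellite (sat_customer_phone) in an in-memory SQLite database. The same table fails the standard satellite test (hash key plus load date timestamp) but passes once the multi-active key column (phone_type, an assumed name) is included.

```python
import sqlite3

def duplicate_keys(conn, table, key_columns):
    """Count key combinations that occur more than once in the table."""
    cols = ", ".join(key_columns)
    sql = (f"SELECT COUNT(*) FROM (SELECT {cols} FROM {table} "
           f"GROUP BY {cols} HAVING COUNT(*) > 1)")
    return conn.execute(sql).fetchone()[0]

# Hypothetical multi-active satellite: customer 'c1' has two phone numbers
# that are active at the same load date timestamp.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sat_customer_phone (hk_customer TEXT, ldts TEXT, phone_type TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO sat_customer_phone VALUES (?, ?, ?, ?)",
    [
        ("c1", "2024-01-01", "HOME", "111"),
        ("c1", "2024-01-01", "WORK", "222"),
        ("c2", "2024-01-01", "HOME", "333"),
    ],
)

# Standard satellite PK: hash key + load date timestamp.
standard_pk = duplicate_keys(conn, "sat_customer_phone", ["hk_customer", "ldts"])
# Multi-active satellite PK: the additional key column is part of the test.
multi_active_pk = duplicate_keys(conn, "sat_customer_phone", ["hk_customer", "ldts", "phone_type"])
print(standard_pk, multi_active_pk)
```

For a non-historized satellite, the key list would contain only the link hash key, complemented by a record count comparison against the non-historized link.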

Testing Other Data Vault Entities

There are other special entity types in Data Vault worth mentioning with regard to testing:

Reference hubs and reference satellites: Testing approaches are similar to standard hubs and satellites. The only difference is there are no hash keys and business keys are used directly.
Record source tracking satellites: A column representing a static source name is added to the primary key test.
PIT Table (Business Vault):
- PK test – combination of the hub/link hash key and the snapshot date timestamp columns is unique
- For every satellite reference, check that the pair of hub/link hash keys and the load date timestamp exists in the referenced satellite
- Hub/link reference does not contain NULL or empty values
Bridge Table (Business Vault):
- PK test – combination of a base link hash key and snapshot date timestamp columns is unique
- For every hub and link reference, check that the hub/link hash key exists in the referenced hub or link

General Tests for all Data Vault Entities

There are some tests applicable for all Data Vault entities.
Make sure that all Data Vault entities:

Contain zero keys instead of NULL keys
Have record source columns that are populated and correspond to the defined pattern (e.g., regex). For example, check if it contains the file path where the name of the top level folder represents the name of the source system and the file name includes the timestamp of the data extraction
Don’t have NULL values in their load (snapshot) date timestamp columns
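
The record source check can be sketched with a regular expression. The pattern below is only an assumption for illustration: it expects a path whose top-level folder is the source system name and whose file name ends with a 14-digit extraction timestamp. Your own naming convention will most likely differ.

```python
import re

# Hypothetical convention: <source_system>/[<subfolders>/]<name>_<YYYYMMDDhhmmss>.<ext>
RECORD_SOURCE_PATTERN = re.compile(
    r"^[A-Za-z0-9_]+/(?:[^/]+/)*[^/]+_\d{14}\.[A-Za-z0-9]+$"
)

def invalid_record_sources(values):
    """Return all record source values that are missing or do not match the pattern."""
    return [
        value for value in values
        if value is None or not RECORD_SOURCE_PATTERN.fullmatch(value)
    ]

record_sources = [
    "crm/customers/customers_20240131235959.csv",  # matches the assumed convention
    "crm/customers_20240131.csv",                  # timestamp too short
    "",                                            # empty
    None,                                          # not populated
]

invalid = invalid_record_sources(record_sources)
print(invalid)
```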

Testing Source Marts

The Source Mart is one of the facets of the Information Mart concept in the Data Vault. It is a virtualized model on top of the Raw Data Vault with the aim of replicating the original source structures. It is great for ad-hoc reporting and offers high value for many data scientists and power users. It can also be used to test the consistency and auditability of the loading process into a Data Vault-powered EDW.

Source mart objects are intended to look the same as the respective source tables (including column names). If you have source marts implemented in your EDW, make sure to compare them against the respective source tables in the staging area after the data loading process. Values and row counts of source structures should match exactly against the respective source mart objects. In the Data Vault community, this kind of test is also known as a “Jedi-Test”.

It is relatively easy to automate such a comparison and make it a part of the loading process.
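
One possible way to automate this comparison is a set difference in both directions plus a row count comparison. The sketch below runs against an in-memory SQLite database with hypothetical objects (stg_orders as the staged source table, sm_orders as the source mart object, here a plain table standing in for a view). It assumes both objects have the same columns in the same order.

```python
import sqlite3

def source_mart_differences(conn, staging_table, source_mart):
    """Compare a staged source table with its source mart object; all zeros means they match."""
    def count(sql):
        return conn.execute(sql).fetchone()[0]

    return {
        # Rows present in staging but missing (or different) in the source mart.
        "only_in_staging": count(
            f"SELECT COUNT(*) FROM (SELECT * FROM {staging_table} "
            f"EXCEPT SELECT * FROM {source_mart})"
        ),
        # Rows present in the source mart but missing (or different) in staging.
        "only_in_source_mart": count(
            f"SELECT COUNT(*) FROM (SELECT * FROM {source_mart} "
            f"EXCEPT SELECT * FROM {staging_table})"
        ),
        # EXCEPT ignores duplicate rows, so row counts are compared separately.
        "row_count_delta": count(f"SELECT COUNT(*) FROM {staging_table}")
        - count(f"SELECT COUNT(*) FROM {source_mart}"),
    }

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stg_orders (order_no INTEGER, amount REAL);
CREATE TABLE sm_orders (order_no INTEGER, amount REAL);
INSERT INTO stg_orders VALUES (1, 10.0), (2, 20.0);
INSERT INTO sm_orders VALUES (1, 10.0), (2, 20.0);
""")

matching = source_mart_differences(conn, "stg_orders", "sm_orders")

# Simulate a loading defect that changes one value in the source mart.
conn.execute("UPDATE sm_orders SET amount = 25.0 WHERE order_no = 2")
drifted = source_mart_differences(conn, "stg_orders", "sm_orders")
print(matching, drifted)
```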

Testing Hash Key and Hash Diff Calculations

Hash keys in Data Vault allow business keys to be integrated in a deterministic way from multiple sources in parallel. They are the glue that binds different Data Vault entities together.

Hash diffs, on the other hand, apply to the satellites and help identify differences in descriptive attributes during the data loading process.

It is important to introduce unit tests for hash key and hash diff calculations used in your EDW, to make sure the hashed values are calculated in accordance with the hashing standards defined. Read more about requirements and templates for hashing here. Test cases for such unit tests should cover as many combinations of different data types and values (e.g. NULL and empty values) as possible, to ensure that they are calculated consistently.

If your EDW exists on different DBMS platforms (e.g., during a migration or because of data security regulations), the above test cases can be used to make sure that your hash calculations are platform agnostic, meaning that they produce the same result on different platforms. A common use case is a link in an on-premises DBMS platform that references a hub already migrated to a cloud platform. Such unit tests can be run on both platforms to ensure consistency of hashing during a migration.
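
A unit test for a hash key calculation might look like the sketch below. The hashing standard used here (trim, upper case, "||" as delimiter, empty string for NULL, MD5, upper-case hex output) is an assumption chosen for illustration, not necessarily the standard defined for your EDW. The test cases are the interesting part: they cover normalization, delimiter handling, NULL positions, and a fixed reference value that can be compared across platforms.

```python
import hashlib

def hash_key(*business_keys, delimiter="||", null_placeholder=""):
    """Hash one or more business keys under the assumed standard described above."""
    normalized = [
        null_placeholder if key is None else str(key).strip().upper()
        for key in business_keys
    ]
    return hashlib.md5(delimiter.join(normalized).encode("utf-8")).hexdigest().upper()

# Case and surrounding whitespace must not change the hash key.
assert hash_key(" abc ") == hash_key("ABC")

# The delimiter keeps ('A', 'B') apart from ('AB').
assert hash_key("A", "B") != hash_key("AB")

# The position of a NULL value matters in a composite key.
assert hash_key(None, "A") != hash_key("A", None)

# Fixed reference value: the MD5 digest of an empty input string.
# Running the same case on every platform shows whether results agree.
assert hash_key("") == "D41D8CD98F00B204E9800998ECF8427E"

print("all hash key test cases passed")
```

The same set of test cases can be applied to hash diff calculations by passing the descriptive attributes instead of the business keys.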

Testing Business Rules

Unlike hard rules, which do not alter or interpret the contents of the data and thereby maintain auditability, soft or business rules enforce the business requirements that are stated by the business users. Examples of business rules can include:

Concatenation (last name and first name)
Standardizing phone numbers
Computing total sales (aggregation)
Coalescing, etc.

Apart from the relatively simple examples listed above, there might also be some more complex business rules involving sophisticated calculations, data transformations, and complex joins. Depending on the use case, the results of applying such rules usually end up in the Business Vault (i.e. a Business Satellite) and later in the Information Mart layer where they are consumed by the business users. Thus, testing business rules is an important part of the information delivery process.

Business rules are usually also subject to unit testing, which must be done continuously during the development and CI process. To perform such a unit test, we need some expected values, in the best case provided by the business, e.g., an expected net sales value for a defined product or a set of products in a specific shop on a named day, based on real data. The net sales calculation from our Business Vault is then tested against the given expected result.
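
Such a unit test could be sketched as follows. The net sales rule (gross amount minus discounts and returns), the line items, and the expected value are all made up for illustration. In practice, the rule lives in your Business Vault and the expected value comes from the business.

```python
from decimal import Decimal

def net_sales(line_items):
    """Hypothetical business rule: gross amount minus discounts and returns."""
    return sum(
        (item["gross"] - item["discount"] - item["returned"] for item in line_items),
        Decimal("0"),
    )

# Static test data for one product in one shop on one day (assumed values).
line_items = [
    {"gross": Decimal("200.00"), "discount": Decimal("10.00"), "returned": Decimal("0.00")},
    {"gross": Decimal("99.50"), "discount": Decimal("0.00"), "returned": Decimal("35.00")},
]

# Expected result as it would be provided by the business.
expected_net_sales = Decimal("254.50")

actual_net_sales = net_sales(line_items)
assert actual_net_sales == expected_net_sales, (actual_net_sales, expected_net_sales)
print("net sales rule matches the expected value")
```

Decimal is used instead of float so that the comparison against the expected monetary value is exact.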

Test Automation and Continuous Integration

All of the tests described above should be automated as much as possible and run by EDW developers during the development process. Successful tests should be an obligatory condition for introducing any new change to your EDW code base. That is achievable by using DevOps tools and enabling continuous integration (CI) in your DWH development lifecycle. Running automated tests each time code is checked in or merged ensures that any data consistency issues or bugs are detected early and fixed before they reach production. As a rule, a separate test (or CI) environment is created for the purpose of running automated tests.

Here are some general recommendations for creating and running a test environment:

Create the CI environment as similar as possible to the production environment
Create test source databases and source files derived from real data
The test source files and source databases should be small so tests can run quickly
The test source files and source databases should also be static so that the expected results are known in advance
Test full load and incremental load patterns since the logic of both patterns is different in most of the cases
Run tests not only against the changes to be merged but also against all the downstream dependencies, or even the whole loading process in general to prevent regressions.
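
To make the recommendation about load patterns more concrete, here is a deliberately simplified sketch. The hub is modeled as a Python dictionary and load_hub is a hypothetical stand-in for a real loading process. With small, static test data, the test checks that a full load and a sequence of incremental loads end up with the same business keys, and that rerunning an incremental load changes nothing.

```python
def load_hub(hub, staged_keys, load_date):
    """Return a copy of the hub with new business keys added.

    Keys that already exist keep their original load date.
    """
    result = dict(hub)
    for key in staged_keys:
        result.setdefault(key, load_date)
    return result

# Full load: everything arrives in one batch.
full = load_hub({}, ["C-1", "C-2", "C-3"], "2024-01-02")

# Incremental load: two overlapping batches on consecutive days.
incremental = load_hub({}, ["C-1", "C-2"], "2024-01-01")
incremental = load_hub(incremental, ["C-2", "C-3"], "2024-01-02")

# Both patterns must produce the same set of business keys.
assert set(full) == set(incremental)

# Rerunning the last batch must not change the hub (idempotent load).
rerun = load_hub(incremental, ["C-2", "C-3"], "2024-01-03")
assert rerun == incremental

print("full and incremental load patterns are consistent")
```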

Conclusion

In this newsletter, we provided an overview of different methods and approaches for technically testing a Data Vault powered EDW.

We covered testing of different stages of the EDW load, including extraction of the data from data sources, loading Data Vault entities, and the information delivery process, though the primary focus was placed on loading Data Vault entities.

We also covered unit testing hash key & hash diff calculations.

It is important to make sure that your hashing solution is platform/tool agnostic, especially during the migration process.

We also learned that testing business rules is a key part of the information delivery process since they interpret the data and define what business users see in their reports. We highlighted the importance of unit testing the business rules and cooperation with the business in respect to defining test cases and expected results.

Furthermore, we also stressed the significance of test automation during the development phase as well as for enabling continuous integration and provided recommendations for creating and running a test environment.

We go even deeper into this topic in our webinar. Make sure to watch the recording for free!

Markus Lewandowski In Beginner, Salesforce

How Do I Integrate My Office World into Salesforce

Watch the Webinar

Discover in our webinar how you can integrate your office world, and all the associated departments of a company, into the Salesforce environment.

Learn which department can be mapped to which Salesforce cloud, and how this can optimize collaboration, communication, and business processes.

Don’t miss the opportunity to increase your efficiency and optimize your sales activities.

Register now for valuable insights and field-tested advice.

Watch Webinar Recording

Webinar Agenda

1. Salesforce Clouds
2. DocuSign
3. Collaboration
4. Email
5. MOWI

Lorenz Kindling In Data Vault Friday

Soft Deletes in Data Vault 2.0

Watch the Video

In the latest edition of our Data Vault Friday series, our knowledgeable BI Consultant, Lorenz Kindling, delves into a question posed by an audience member.

“Can you use soft deletes for GDPR or Security in Data Vault?”

Lorenz provides valuable insights into the application of soft deletes within the Data Vault framework, specifically addressing their potential role in achieving GDPR compliance and enhancing data security measures.

Markus Lewandowski In Agile Data, Salesforce

Agile Kanban Made Easy for Salesforce: Introducing the MOWI App

MOWI – The Agile Kanban Board in Salesforce

Agile methods like Kanban help teams manage projects with transparency, flexibility, and efficiency. However, many organizations struggle with integrating Kanban into their existing systems. This is where MOWI comes in.

MOWI is a lightweight Salesforce application that brings Kanban boards directly into the Business Application Platform. Built around the three objects Mission, Objective, and Work Item, MOWI makes project and task management in Salesforce simple, structured, and intuitive.

In this article:

Why MOWI?
How MOWI Works
Conclusion

Why MOWI?

Fully integrated into Salesforce – no additional tools required.
Improved transparency – all team members can see progress and responsibilities at a glance.
Flexible collaboration – goals, tasks, and subtasks can be managed in a Kanban board with simple drag and drop.

How MOWI Works

Mission – the bigger picture

A Mission groups together multiple Objectives and serves as the overarching project goal. This makes it easy to organize even complex initiatives.

Example: A “Product Launch” mission that includes several objectives such as marketing, development, and sales preparation.

Objective – the core element

Objectives are the main tasks or user stories, comparable to cards on a Kanban board. Each Objective contains detailed information and can also be linked to Follow-up Objectives for more complex scenarios.

Example: “Create marketing campaign” as an objective, linked to multiple work items such as landing page design or content creation.

Work Item – the detailed steps

Work Items are the subtasks of an objective. They provide a more granular view of progress and move through phases such as Open, In Progress, and Closed.

Example: “Design landing page” as a work item within the “Create marketing campaign” objective.

The Kanban View in Salesforce

In the Kanban View, all objectives are displayed in columns such as Active, Blocked, Testing, or Done. Users can drag and drop objectives across columns as work progresses.

The result: clear visibility, better communication within the team, and improved productivity.

Conclusion

With MOWI, you bring agile project management directly into Salesforce – lightweight, integrated, and effective. Download now for free on AppExchange.

Michael Olschimke In Data Vault Friday

Data Vault Modeling Styles

Watch the Video

As part of our engaging Data Vault Friday series, our CEO, Michael Olschimke, addresses a pertinent question raised by an audience member.

“What’s your view on other Data Vault philosophies? Some of my colleagues received training in such modeling styles, but the philosophy contains some substantial differences related to the CDVP2 certification.”

In this insightful video, Michael shares his perspectives on various Data Vault philosophies beyond the CDVP2 (Certified Data Vault Practitioner Level 2) certification. Drawing from his extensive experience and expertise, he navigates through the nuances of different modeling styles within the Data Vault framework.

Michael Olschimke In Data Vault Friday

Point in Time vs. Record Source Tracking in Data Vault

Watch the Video

In the latest installment of our informative Data Vault Friday series, our CEO, Michael Olschimke, takes on a thought-provoking question posed by a member of our engaged audience.

“How are record source tracking satellites used in a Data Vault, and if it is not used in a project, then how can PIT tables come into play in tracking the customers or business keys? And if we are taking the PITs for tracking the keys, does this mean we should take the daily snapshots of the data?”

Michael intricately unpacks the nuances of utilizing record source tracking satellites within the context of Data Vault methodology. He provides valuable insights into the role of PIT (Point-in-Time) tables in effectively tracking customers or business keys, shedding light on their significance in scenarios where record source tracking satellites might not be employed.

Delving deeper into the intricacies, Michael elucidates whether the use of PIT tables necessitates the capture of daily snapshots of the data, offering a comprehensive perspective for data professionals seeking clarity on these vital aspects of data modeling.

Hernan Revale In Data Tools

Exploring Datavault4DBT: A Practical Series on the DBT Package for Data Vault 2.0 – Vol. 1: The Staging Layer

Exploring DataVault4dbt

Last year Scalefree released DataVault4dbt, a Data Vault 2.0 open-source package for dbt, which includes loading templates for creating and modeling Data Vault 2.0 entities following up-to-date standards and best practices. If you want to read more about the general content of the package and its motivation, you can do so here.

We’re excited to launch a series of insightful posts and webinars, showcasing practical implementations of DataVault4dbt. This will empower you to leverage its full potential in your data warehousing endeavors. Today, we’ll spotlight its application in the staging layer.

In this article:

Before we start with DataVault4dbt
Installing DataVault4dbt package on dbt
Using the macro for staging our source data
Conclusion

Before we start with DataVault4dbt

We will assume some previous knowledge related to Data Vault 2.0 and dbt. Besides, for the following examples, we will be using dbt Cloud IDE connected to Snowflake. For an updated list of supported platforms, check the package’s GitHub repository.

Also, bear in mind that for optimal use of the macros, you must meet a couple of prerequisites:

Flat & Wide source data, accessible in your target database
A Load Date column signifying the time of arrival in the source data storage
A Record Source column detailing the origin of the source data, such as the file location within a Data Lake

In our case, we used and adapted the data from the jaffle_shop example project available on dbt.

Installing DataVault4dbt package on dbt

Installing DataVault4dbt is like installing any other package on your project. You will need to follow two simple steps:

1. Add it to your packages.yml file

2. Run dbt deps

Using the macro for staging our source data

According to the documentation for the staging layer of DataVault4dbt, this layer primarily focuses on hashing. It also offers functionalities like creating derived columns, conducting prejoins, and adding NULL values for missing columns. Rather than diving deep into the technical aspects of each macro component, which are comprehensively covered in the documentation, let’s dive straight into its application!

A. Basic source information

Identifying the Source Model (source_model):

When referencing a source, adopt the dictionary format: ‘source_name’: ‘source_table’.
For models within our dbt project, just use the model name: ‘source_table’.

Setting Load Date Timestamp (ldts) & Record Source (rsrc):

Both can reference a column from the source table or a more detailed SQL expression.
Additionally, for the Record Source, you can use a static string beginning with ‘!’, like ‘!my_source’.

Example

source_model: Calls an already created table on dbt named ‘orders_example’.
ldts: Calls a timestamp column from our source model.
rsrc: Calls a column which contains a string referring to our record source name.

B. Hashing

In DataVault4dbt, the hashed_columns parameter outlines how to generate hashkeys and hashdiffs. For each hash column:

The key represents the hash column’s name.
For Hashkeys, the value is a list of business keys.
For Hashdiffs, the value will usually be a list of descriptive attributes.

Example

hk_order_h: hashkey generated using two column inputs (O_ORDERKEY and O_CUSTKEY)
hd_order_s: hashdiff generated using multiple descriptive attributes

C. Derived columns

Derived Columns in DataVault4dbt stage models allow users to directly apply specific transformations to data. They act as on-the-fly customizations, enabling immediate adjustments to data within the column itself. Essentially, if data isn’t in the desired format, with DataVault4dbt you can derive a new version right within the column using a specified rule.

When setting the derived_columns parameter, each derived column includes:

value: The transformation expression.
datatype: The datatype of the column.
src_cols_required: Source columns needed for the transformation.

Depending on how you name the derived column and the source columns, you can achieve two outcomes:

If the derived column’s name matches its source column’s name, the original column’s data will be replaced by the transformed data. This effectively means you’re overwriting the original data.
On the other hand, if the derived column’s name is different from its source column’s name, the transformation will result in a brand new column, preserving the original column’s data.

Example

price_euro: creation of a new column with the same values as the O_TOTALPRICE column.
country_isocode: creation of a new column with a static string ‘GER’.

D. Prejoining

Why Prejoin?

In certain scenarios, your source data might not have the ‘Business Key’ which is often a human-readable identifier, such as an email address or username. Instead, it might have a ‘Technical Key’, which could be an internally generated identifier or code. If you need to use the human-readable Business Key in your processing but only have the Technical Key, you would use prejoining to combine your data with another table that maps Technical Keys to Business Keys.

How to Define Prejoins in DataVault4dbt?

The DataVault4dbt package provides a structured way to define these prejoins (prejoined_columns) using dictionaries.

For every column you’re adding through prejoining, you need to specify a few things:

src_name: This is the source of the prejoined data, as defined in a .yml file.
src_table: This specifies which table you’re prejoining with, as named in the .yml file.
bk: This is the name of the Business Key column in the prejoined table or the column values you are bringing to your table.
this_column_name: In your original data, this is the column that matches up with the prejoined table. This is often a Technical Key.
ref_column_name: In the prejoined table, this is the column that this_column_name points to. It should match up with the values in this_column_name.

Note that both ‘this_column_name’ and ‘ref_column_name’ can represent either a single column or a list of columns, serving as the basis for constructing the JOIN conditions.

Example

c_name: we brought the column “C_NAME” from the customer source table, joining on orders.o_custkey = customer.c_custkey.
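
Conceptually, the prejoin in this example resolves to a join like the one sketched below. This is not the SQL generated by DataVault4dbt, only an illustration of the idea, run from Python against an in-memory SQLite database with made-up rows. The use of a LEFT JOIN, so that orders without a matching customer are kept, is our assumption for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER);
CREATE TABLE customer (c_custkey INTEGER, c_name TEXT);
INSERT INTO orders VALUES (1, 100), (2, 200), (3, 999);
INSERT INTO customer VALUES (100, 'Alice'), (200, 'Bob');
""")

# this_column_name = o_custkey, ref_column_name = c_custkey, bk = c_name
staged = conn.execute("""
    SELECT o.o_orderkey, o.o_custkey, c.c_name
    FROM orders o
    LEFT JOIN customer c ON o.o_custkey = c.c_custkey
    ORDER BY o.o_orderkey
""").fetchall()

for row in staged:
    print(row)
```

The order with o_custkey 999 has no matching customer, so its c_name stays NULL (None in Python). Rows like this are worth covering in your tests.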

E. Multi active config

The multi_active_config parameter is used when dealing with source data that contains multiple active records for the same Business Key. Essentially, you need to specify which columns are the multi-active keys and the primary hashkey column.

If your source data doesn’t have a natural multi-active key column, you should create one using functions like row_number in a preceding layer. Then, add the name of this newly created column to the multi-active-key parameter. It’s crucial that the combination of multi-active keys, the main hashkey, and the ldts column be unique in the final satellite output. If you don’t use this setting, the stage is considered to have only single active records.

Example

By setting this parameter, we’ll observe consistent hashdiffs for identical Business Keys, proving beneficial in subsequent layers. If you want to know why, you can check this post.
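
If the source has no natural multi-active key, the row_number approach mentioned above can be sketched in plain Python. The column names (customer_no, phone, ma_seq) are hypothetical, and in a real project this numbering would happen in SQL in a preceding layer, typically partitioned by business key and load date.

```python
from collections import defaultdict

def add_sequence(rows, key_column, order_column, seq_column="ma_seq"):
    """Number the rows within each business key, similar to ROW_NUMBER() in SQL."""
    counters = defaultdict(int)
    result = []
    # Sorting makes the numbering deterministic across runs.
    for row in sorted(rows, key=lambda r: (r[key_column], r[order_column])):
        counters[row[key_column]] += 1
        result.append({**row, seq_column: counters[row[key_column]]})
    return result

# Hypothetical source rows: customer C-1 has two active phone numbers.
rows = [
    {"customer_no": "C-1", "phone": "222"},
    {"customer_no": "C-2", "phone": "333"},
    {"customer_no": "C-1", "phone": "111"},
]

numbered = add_sequence(rows, "customer_no", "phone")
for row in numbered:
    print(row)
```

A deterministic ordering column matters here: if the numbering changed between loads, the same source rows would receive different multi-active keys.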

F. Missing columns

With DataVault4dbt, the missing_columns parameter helps handle scenarios where the source schema changes and some columns no longer exist. Using this parameter, you can create placeholder columns filled with NULL values to replace the missing ones. This ensures that hashdiff calculations and satellite payloads continue to work. Essentially, you provide a dictionary where the column names are the keys and their respective SQL datatypes are the values.

Example

discount_code: creation of a new discount_code column with NULL values.

Conclusion

Scalefree’s DataVault4dbt package introduces an easy-to-use yet powerful solution for database modeling. In our case, we went through the staging layer macro, which combines best practices with the flexibility to address diverse source data needs. From hashing to on-the-fly column modifications, this Data Vault 2.0 open-source package for dbt streamlines complex processes.

As we continue to explore its potential, we invite you to join our monthly expert session for a deeper dive. Reserve your spot here and stay tuned to the package’s GitHub repository for the latest updates and support.

Michael Olschimke In Data Vault Friday

Timezone to Be Used for Timestamps in Data Vault

Watch the Video

As part of our engaging Data Vault Friday series, our distinguished CEO, Michael Olschimke, delves into a pertinent question posed by an inquisitive member of our audience.

“In DV2.0 it is advised to use the UTC zone. How to store income timestamps from incoming sources that are in other time zones (e.g. GMT)? E.g. in Azure SQL server.”

In addressing this query, Michael provides valuable insights into the best practices for handling timestamps, especially when dealing with diverse time zones within the Data Vault 2.0 methodology. Emphasizing the recommendation to utilize the UTC zone, he navigates through the considerations and strategies for storing incoming timestamps that originate from sources operating in different time zones, such as GMT.

This illuminating discussion serves as a testament to our commitment to fostering knowledge and expertise in the realm of data architecture, making our Data Vault Friday series a valuable resource for data professionals.

Michael Olschimke In Data Vault Friday

Referencing Reference Tables in Data Vault

Watch the Video

In the ongoing journey of our Data Vault Friday series, our esteemed CEO, Michael Olschimke, delves into a thought-provoking question raised by a keen member of our audience.

“Is it possible to have an m:n link between two reference tables (country to currency)?”

In addressing this query, Michael navigates through the intricacies of data modeling, shedding light on the feasibility and implications of establishing an m:n (many-to-many) link between two reference tables, specifically in the context of countries and currencies.

By exploring the nuances of this scenario, Michael provides valuable insights into the challenges and considerations associated with creating such relationships in the Data Vault framework. This engagement exemplifies the essence of our Data Vault Friday series, where practical queries are met with informative discussions to enhance the understanding of data professionals.

Michael Olschimke In Data Vault Friday

Hierarchical Link by Using an Example in Data Vault

Watch the Video

With this week’s episode of Data Vault Friday, our CEO, Michael Olschimke, turns his attention to an insightful question about the use of a Hierarchical Link:

“Can you please explain in detail the hierarchical link using an example (different from the Bill of Material one, please)?”

In response to this discerning inquiry, Michael embarks on a comprehensive exploration of the concept of hierarchical links within the Data Vault framework. Drawing upon his extensive expertise, he elucidates the intricacies of modeling hierarchical links by presenting a distinctive example, distinct from the conventional Bill of Material scenario.

Through this elucidation, Michael aims to demystify the complexities surrounding hierarchical links, providing the audience with a practical and nuanced understanding of their application in diverse contexts. His commitment to delivering insightful explanations reflects the ethos of our Data Vault Friday series, which strives to empower data professionals with valuable knowledge.
