Skip to main content
search
0
Scalefree Knowledge Webinars Data Vault Friday Data Vault Link Effectivity

One of the most common — and most misunderstood — challenges in Data Vault modeling is how to correctly handle changing relationships. Especially when source systems insert, delete, and even reinsert the same records over time, many teams struggle to answer a simple but critical question:

“How do I reliably determine the latest valid state of a relationship?”

In this article, we will walk through exactly how to model this scenario using Links and Effectivity Satellites in Data Vault, why this pattern is essential, and how it allows you to extract both the current and historical state of relationships in a clean, auditable way.



The Scenario: A Source Table with Changing Relationships

Let’s start by restating the problem in simple terms.

You have a source table with the following structure:

  • COL_1 + COL_2 → Business Key for Hub A
  • COL_3 + COL_4 → Business Key for Hub B
  • COL_5, COL_6 → Descriptive attributes
  • ROW_OPERATION → Insert / Delete indicator

This tells us something important right away:

  • Two business keys appear in the same row
  • Therefore, there is a relationship between those business keys
  • That relationship belongs in a Link

The complexity arises because the operational system:

  • Inserts records
  • Deletes records
  • Sometimes reinserts the exact same records later

All while the actual business keys and attributes remain unchanged.

So how do we model this in a way that preserves history, supports auditing, and still lets us easily answer:

“Which relationships are valid right now?”

In Data Vault, a Link represents the existence of a relationship — not its state.

Once a relationship between Hub A and Hub B exists, the Link record itself is immutable.
It simply tells us:

“At some point in time, this relationship existed.”

But the Link alone cannot tell us:

  • When the relationship became active
  • When it was removed
  • When it was reintroduced

This is where many implementations go wrong. Teams try to encode relationship state directly into the Link or manage it downstream in reporting logic. Neither approach scales, and both break Data Vault principles.

The correct solution is to model the effectivity of the relationship separately.

Enter the Effectivity Satellite

An Effectivity Satellite attached to a Link tracks the physical existence of that relationship over time.

Think of it as a timeline that answers one simple question:

“Is this relationship active or inactive at a given point in time?”

  • Link Hash Key
  • Load Date Timestamp (LDTS)
  • Record Source
  • Deletion Timestamp

The deletion timestamp is the key attribute. It defines until when a relationship is considered active.

A common and proven pattern is:

  • Active relationship → Deletion date = 8888-12-31
  • Inactive relationship → Deletion date = actual load timestamp when deletion was detected

(Using 8888-12-31 instead of 9999-12-31 avoids date overflow issues and is widely adopted in practice.)

Walking Through the Lifecycle of a Relationship

Let’s walk through the example step by step.

Day 1: Relationship Is Inserted

The source system delivers a row linking business keys A and B.

  • A Link record (A–B) is created if it does not yet exist
  • An Effectivity Satellite record is inserted
Deletion_Timestamp = 8888-12-31

This means: the relationship is active.

Day 2: Relationship Is Deleted

The source system no longer contains the A–B row.

We do not delete the Link. Instead:

  • A new Effectivity Satellite record is inserted
  • The deletion timestamp is set to the current load timestamp

Now the timeline clearly shows when the relationship ended.

Day 3: Relationship Is Reinserted

The same A–B relationship appears again in the source.

Important point:

  • The Link already exists → no new Link row
  • The Effectivity Satellite needs a new delta
Deletion_Timestamp = 8888-12-31

This marks the relationship as active again — without losing the history of the previous deletion.

Handling Updates: Why Updates Are Two Events

Another area where teams often struggle is updates.

In Data Vault terms:

An update is a deletion of the old version and an insertion of a new version.

For example, if a relationship changes from A–B to A–C:

  • A–C is inserted as a new Link relationship
  • The Effectivity Satellite marks A–C as active
  • The Effectivity Satellite marks A–B as deleted

This ensures that:

  • No ambiguity exists about which relationship is active
  • Historical reporting remains accurate
  • Auditors can trace every change

How to Query the Latest Valid State

Once modeled correctly, querying the current state becomes straightforward.

Current Active Relationships

  1. Join the Link to its Effectivity Satellite
  2. Select the latest satellite record per Link hash key
  3. Filter where Deletion_Timestamp = 8888-12-31

That’s it.

This pattern works consistently across:

  • Current-state reporting
  • Type 1 dimensions
  • Snapshot-based reporting using PIT tables

What About ROW_OPERATION and Valid-From / Valid-To?

It’s important to clearly separate concepts:

ROW_OPERATION

Indicators like Insert/Delete from the source system are:

  • Descriptive metadata
  • Useful for loading logic
  • Not the source of truth for effectivity timelines

They can be stored in a descriptive Satellite, but effectivity is driven by load detection.

Valid-From / Valid-To

Business validity dates are not deletion timelines.

  • They describe business meaning (contracts, subscriptions, agreements)
  • They belong in separate descriptive Satellites
  • They are typically applied in the Business Vault or downstream

Mixing business timelines with technical effectivity is a common modeling mistake.

Why Effectivity Satellites Are Not Optional

A final, often overlooked argument: dirty data.

What happens if:

  • A relationship is accidentally loaded
  • Used in reports
  • Then later corrected or removed

Without an Effectivity Satellite:

  • You cannot explain historical results
  • You cannot prove when data changed
  • You cannot support audit requirements

That’s why, in real-world Data Vault implementations:

  • Most standard Links have Effectivity Satellites
  • Most Hubs have them as well
  • Only a few special cases are exceptions

Key Takeaways

  • Links capture existence, not state
  • Effectivity Satellites capture when relationships are active
  • Deletes and reinserts are modeled as satellite deltas
  • Updates are modeled as delete + insert
  • Current state is derived by filtering on open deletion timelines

Once you adopt this pattern consistently, handling complex relationship lifecycles becomes simple, scalable, and auditable.

Watch the Video

Meet the Speaker

Profile picture of Michael Olschimke

Michael Olschimke

Michael has more than 15 years of experience in Information Technology. During the last eight years he has specialized in Business Intelligence topics such as OLAP, Dimensional Modelling, and Data Mining. Challenge him with your questions!

The Data Vault Handbook

Build your path to a scalable and resilient Data Platform

The Data Vault Handbook is an accessible introduction to Data Vault. Designed for data practitioners, this guide provides a clear and cohesive overview of Data Vault principles.

Read it for Free

Leave a Reply

Close Menu