Skip to main content
search
0
Scalefree Knowledge Webinars Data Vault Friday Data Vault Link Temporality

In modern data warehousing, ensuring accurate historical records is paramount. The Data Vault methodology excels at capturing raw, unfiltered data changes over time. But what happens when your source system makes errors—linking an entity to the wrong counterpart—and then corrects them? Without the right approach, your Link tables can become confusing, making it hard to identify the true “current” relationship. This article explores an elegant solution: using an Effectivity Satellite to manage link temporality and error correction in your Data Vault.



The Problem: One-to-One Relationship with Source Data Fluctuations

Imagine two hubs in your Data Vault: Hub A and Hub B. A business rule dictates that each A-entity can be linked to exactly one B-entity at a time. Your Link table models these connections. The typical workflow is:

  1. Day 1: Source links A1 → B1 → you load this into your Link.
  2. Day 2: Source mistakenly links A1 → B2 → you load the new link.
  3. Day 3: Source corrects back to A1 → B1 → how do you capture this as the current, up-to-date link?

Since the Link table only records distinct relationships and ignores duplicates, reloading A1 → B1 on Day 3 won’t insert a new row or update any timestamp. You lose clarity on which relationship is active today.

You might be tempted to add LOAD_END_DATE or an “active” flag directly to the Link table to mark when a relationship becomes obsolete. However, this violates Data Vault best practices. The Link should remain a pure, append-only record of every relationship ever observed, without status flags or end dates. Instead, you delegate temporality to a dedicated satellite.

Introducing the Effectivity Satellite

An Effectivity Satellite sits alongside your Link and records the lifespan of each relationship. Its core columns include:

  • Link Hash Key: foreign key back to your Link record
  • LOAD_DATE_TIMESTAMP: when you first detected or ended this link
  • DELETE_DATE_TIMESTAMP: when the link was deactivated (or a far-future “end of time” for active rows)

This design cleanly separates the static relationship definition (Link) from its dynamic, time-dependent status (Effectivity Satellite).

Day 1: Initial Relationship

Link: A1–B1  
Effectivity Satellite:  
LOAD_DATE = D1  
DELETE_DATE = 8888-12-31

We load A1→B1 and mark it active by setting its DELETE_DATE to the end of all time.

Day 2: Erroneous Change

Link: add A1–B2  
Effectivity Satellite updates:  
– For A1–B1: DELETE_DATE = D2 (deactivated)  
– For A1–B2: LOAD_DATE = D2, DELETE_DATE = 8888-12-31

The old relationship is soft-deleted, and the new one is inserted and marked active.

Day 3: Correction Back to Original

Effectivity Satellite updates:  
– For A1–B2: DELETE_DATE = D3  
– For A1–B1: new row (reactivation): LOAD_DATE = D3, DELETE_DATE = 8888-12-31

Instead of touching the Link, we simply record two new deltas: ending B2 and re-activating B1. Querying the satellite for the active row (where DELETE_DATE = 8888-12-31) reveals the current link.

Loading Patterns: Full Loads vs. CDC vs. Incrementals

Your data delivery method influences how you detect deletions:

  • Full Loads: Compare all active links in the satellite against staging; any missing link implies a deletion. Insert a delta to end-date it.
  • Change Data Capture (CDC): Leverage the source’s delete events and timestamps as your DELETE_DATE_TIMESTAMP.
  • Incremental without Deletes: Combine staging deltas (inserts/updates) with a lightweight full load of just business keys. Missing keys signal deletions.

In all cases, the satellite becomes the single source of truth for link effectivity.

Handling Unreliable Source Deliveries with “Last Seen”

Sometimes, your source export may inadvertently drop rows (e.g., locked mainframe records). To avoid false deletions, maintain a Last Seen Date on your effectivity satellite. If a link hasn’t been seen for a configurable “grace period” (e.g., three weeks), a business rule in your Business Vault marks it deleted. This approach balances accuracy against source system quirks.

Querying Current Relationships

To retrieve only active links at any point:

SELECT L.*, S.LOAD_DATE_TIMESTAMP  
FROM Link L  
JOIN Effectivity_Sat S  
ON L.Link_HashKey = S.Link_HashKey  
WHERE S.DELETE_DATE_TIMESTAMP = '8888-12-31';

This simple filter returns the true, live relationships, abstracting away all historical noise and source-system corrections.

Benefits of the Satellite Approach

  • Auditability: Full history of when links were activated and deactivated.
  • Purity: Link tables remain simple, append-only, and free of flags/end dates.
  • Flexibility: Supports full loads, CDC, and incremental patterns seamlessly.
  • Business Rules: “Last seen” logic can live in Business Vault without polluting raw Data Vault layers.

Conclusion

Handling source data errors—especially when relationships ping-pong between states—requires a robust temporal strategy. By delegating link effectivity to a dedicated satellite, you maintain a clean Link table, capture every change, and easily identify the current relationship. Whether you’re dealing with full loads, CDC, or flaky source exports, this pattern scales, remains auditable, and adheres to Data Vault best practices. Implement effectivity satellites in your Data Vault to master link temporality and build a more resilient, transparent data platform.

Watch the Video

Meet the Speaker

Profile picture of Michael Olschimke

Michael Olschimke

Michael has more than 15 years of experience in Information Technology. During the last eight years he has specialized in Business Intelligence topics such as OLAP, Dimensional Modelling, and Data Mining. Challenge him with your questions!

The Data Vault Handbook

Build your path to a scalable and resilient Data Platform

The Data Vault Handbook is an accessible introduction to Data Vault. Designed for data practitioners, this guide provides a clear and cohesive overview of Data Vault principles.

Read it for Free

Leave a Reply

Close Menu