
Multi-Tenant Environment

Designing and maintaining a Data Vault in a multi-tenant environment presents unique challenges. When a data warehouse must handle not just internal data, but also data from dozens of external clients with slightly different processes and systems, the complexity increases dramatically.

A recent question we received highlighted this exact situation:

“I’m struggling with link management and the evolution process in a multi-tenant warehouse, especially putting all data together in the Information Mart. Our Data Warehouse contains internal data as well as shared data from our clients, for which we perform job requisition processes using their internal systems. We plan to onboard 50–60 clients in the next 2–3 years. Right now, we’re still in the MVP phase, supporting just a few clients. How should I manage links with so many different systems, such a large number of source tables, and processes that are similar but not identical? The goal is to have one common Information Mart design for all clients to enable standardized reporting.”

This is a classic question in modern data architecture. Let’s explore how to approach Raw Vault, Business Vault, and Information Mart design in a multi-tenant context.



Multi-Tenancy in the Raw Data Vault

A cornerstone principle in Data Vault modeling is that each Satellite is sourced from a single source system. However, in a multi-tenant setup, this guideline needs some adaptation. Many tenants use the same source systems (e.g., Salesforce, SAP) with similar core structures. In such cases, you can load multiple tenants into the same Satellite as long as you introduce a Tenant ID as part of the key.

Why Add a Tenant ID?

  • Ensures uniqueness of business keys across tenants (e.g., Customer 42 in Tenant A ≠ Customer 42 in Tenant B).
  • Partitions data naturally, so Satellites contain subsets per tenant without overwriting each other’s records.
  • Provides a straightforward way to filter or secure records by tenant.

By combining the local business key with the Tenant ID, you create a unique enterprise-wide business key. This guarantees data integrity while simplifying downstream querying and reporting.
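
As a minimal sketch of this idea, the snippet below builds a Hub hash key from the Tenant ID and the local business key. The delimiter, the normalization (trim and upper-case), and the choice of MD5 are assumptions made for illustration, not something the approach prescribes; the function names are hypothetical.

```python
import hashlib

DELIMITER = "||"  # assumed separator between key parts


def normalize(part: str) -> str:
    """Trim and upper-case a key part so equal keys hash identically."""
    return part.strip().upper()


def hub_hash_key(tenant_id: str, business_key: str) -> str:
    """Hash the Tenant ID together with the local business key."""
    composite = DELIMITER.join(normalize(p) for p in (tenant_id, business_key))
    return hashlib.md5(composite.encode("utf-8")).hexdigest()


# Customer 42 exists in two tenants but yields two distinct Hub keys.
key_a = hub_hash_key("TENANT_A", "42")
key_b = hub_hash_key("TENANT_B", "42")
```

Because the Tenant ID is part of the hashed input, identical local keys from different tenants can never collide on the same Hub record.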

Where to Add the Tenant ID

In multi-tenant designs, the Tenant ID should ideally appear in:

  • Hubs: As part of the business key or alternate key, ensuring uniqueness across tenants.
  • Links: As part of the Hub references, ensuring uniqueness in combined relationships.
  • Satellites: As a payload field for convenience, even if the hash key already includes the Tenant ID.

With this approach, every record in the Raw Vault can always be traced back to a specific tenant, which simplifies not only modeling but also governance and security.
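
The placement described above could look roughly like the following sketch. It assumes a Link hash key is derived from the Tenant ID plus the business keys of the referenced Hubs, and that the Satellite row carries the Tenant ID as an ordinary payload column. All function and column names are hypothetical.

```python
import hashlib


def hash_parts(*parts: str) -> str:
    """Hash normalized key parts joined by an assumed '||' delimiter."""
    joined = "||".join(p.strip().upper() for p in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def link_hash_key(tenant_id: str, *business_keys: str) -> str:
    """Link key: Tenant ID plus the business keys of all referenced Hubs."""
    return hash_parts(tenant_id, *business_keys)


def satellite_row(tenant_id: str, business_key: str, payload: dict) -> dict:
    """Satellite row: parent hash key, Tenant ID as payload, then attributes."""
    return {
        "hub_hash_key": hash_parts(tenant_id, business_key),
        "tenant_id": tenant_id,  # redundant with the hash key, kept for filtering
        **payload,
    }


# Same candidate and requisition keys, different tenants, different Link keys.
link_a = link_hash_key("TENANT_A", "CAND-42", "REQ-7")
link_b = link_hash_key("TENANT_B", "CAND-42", "REQ-7")
row = satellite_row("TENANT_A", "CAND-42", {"status": "open"})
```

Keeping the Tenant ID as a readable column in the Satellite means a query can filter by tenant without recomputing or joining on hash keys.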

Defining the Tenant ID

A natural question arises: what exactly is a “tenant”? The answer depends on your business context:

  • It could be a client organization you serve.
  • It could be a business unit, country, or factory in large enterprises.
  • It might also be defined by data ownership, that is, by who is responsible for the dataset.

In some cases, you may also need a reserved Tenant ID for global or shared data that is not owned by any specific tenant. This ensures consistency and supports role-based access control.

Staging and Tenant Assignment

The Tenant ID is typically introduced as early as the staging layer. How it is assigned depends on the source system:

  • Tenant-dedicated systems: Assign a constant Tenant ID for all data from that system.
  • Multi-tenant systems (e.g., SAP, Salesforce): Extract and map the Tenant ID from existing fields (e.g., business unit, org ID).
  • Global systems: Use a reserved Tenant ID (e.g., “GLOBAL”) when ownership is shared or unclear.

In every case, the assignment is a hard rule fixed per source system (a constant or a direct mapping of an existing field), not a conditional business transformation. This keeps loads repeatable and traceable.
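
The three assignment cases could be sketched as follows. The system names, the `org_id` field, and the lookup tables are hypothetical placeholders; in practice they would come from your own staging metadata.

```python
GLOBAL_TENANT = "GLOBAL"  # reserved Tenant ID for shared data

# Hypothetical metadata: dedicated systems map to one constant Tenant ID.
DEDICATED_SYSTEMS = {"client_a_ats": "TENANT_A"}

# Hypothetical metadata: multi-tenant systems name the field holding the org.
MULTI_TENANT_SYSTEMS = {"salesforce": "org_id"}
ORG_ID_TO_TENANT = {"org-001": "TENANT_A", "org-002": "TENANT_B"}


def assign_tenant_id(source_system: str, record: dict) -> str:
    """Return the Tenant ID for a staged record based on its source system."""
    if source_system in DEDICATED_SYSTEMS:
        return DEDICATED_SYSTEMS[source_system]
    if source_system in MULTI_TENANT_SYSTEMS:
        field = MULTI_TENANT_SYSTEMS[source_system]
        # An unmapped org raises KeyError instead of silently becoming GLOBAL.
        return ORG_ID_TO_TENANT[record[field]]
    return GLOBAL_TENANT


dedicated = assign_tenant_id("client_a_ats", {"id": 1})
mapped = assign_tenant_id("salesforce", {"id": 2, "org_id": "org-002"})
shared = assign_tenant_id("reference_data", {"id": 3})
```

Failing on an unmapped organization is a design choice in this sketch: it surfaces a missing mapping during the load rather than hiding tenant data under the shared identifier.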

Business Vault in Multi-Tenant Contexts

Once Tenant IDs are embedded in the Raw Vault, the Business Vault becomes much easier to design. Business rules can be applied consistently across tenants, while preserving tenant-specific contexts.

  • Same-as Links: Crucial for resolving duplicate entities across tenants (e.g., the same customer appearing in different client systems).
  • Custom Satellites: Standardize where possible, but add Satellites for tenant-specific customizations.
  • Wide PIT Tables: Expect point-in-time (PIT) tables to grow wide, because multiple tenants and diverse source systems naturally lead to broader structures.

At this stage, the goal is harmonization without oversimplification. A balance must be struck between common modeling and tenant-specific flexibility.

Designing the Information Mart

The Information Mart is where tenants, or the enterprise as a whole, derive insights. The challenge is to provide both:

  • Enterprise-wide views: Merging data from all tenants for global reporting.
  • Tenant-specific views: Allowing clients or business units to see only their data.

Common Mart Design

A single common dimensional model for all tenants reduces development overhead and supports standardized reporting. By including the Tenant ID in dimensions and facts, you can apply row-level security to restrict access per tenant.
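
To illustrate the filtering logic only, here is a small sketch of tenant-based row-level security over a shared fact table. In a real platform this would normally be enforced by the database or BI tool rather than in application code; the `FactRow` structure and its columns are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactRow:
    tenant_id: str
    requisition_id: str
    open_positions: int


def visible_rows(rows: list[FactRow], allowed_tenants: set[str]) -> list[FactRow]:
    """Return only the fact rows whose tenant the user may see."""
    return [row for row in rows if row.tenant_id in allowed_tenants]


facts = [
    FactRow("TENANT_A", "REQ-1", 3),
    FactRow("TENANT_B", "REQ-2", 5),
    FactRow("GLOBAL", "REQ-3", 1),
]

# A client user sees its own rows; an enterprise analyst sees every tenant.
client_view = visible_rows(facts, {"TENANT_A"})
enterprise_view = visible_rows(facts, {"TENANT_A", "TENANT_B", "GLOBAL"})
```

The same model serves both audiences; only the set of allowed Tenant IDs changes per user or role.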

When Separate Marts Are Needed

In some cases, specific tenants may require custom Information Marts. This is typically justified when:

  • Unique KPIs or processes cannot be expressed in the common model.
  • Legal or contractual reasons require strict separation of data.

However, these should remain exceptions. A well-designed common mart, filtered by Tenant ID, is usually sufficient for most tenants.

To unify data across systems and tenants, Same-as Links are critical. These resolve entity duplicates across different tenants and systems (e.g., a product appearing under different codes in SAP and Salesforce).

Same-as Links can be sourced from:

  • Raw data: Mapping tables provided by business or source systems.
  • Calculated logic: Fuzzy matching, soundex, or other deduplication algorithms.

This harmonization enables the creation of enterprise-wide dimensions that span multiple tenants.
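
Both sourcing options can be sketched in a few lines. The mapping-table columns, the key format, and the similarity threshold are assumptions for illustration, and the standard-library `difflib.SequenceMatcher` stands in here for whatever fuzzy-matching or soundex logic you actually use.

```python
from difflib import SequenceMatcher


def same_as_from_mapping(mapping_rows: list[dict]) -> list[tuple[str, str]]:
    """Build (duplicate_key, master_key) pairs from a delivered mapping table."""
    return [(row["duplicate_key"], row["master_key"]) for row in mapping_rows]


def same_as_from_fuzzy(names: dict[str, str], threshold: float = 0.9) -> list[tuple[str, str]]:
    """Build (duplicate_key, master_key) pairs from name similarity.

    The key that sorts first is treated as the master (an arbitrary choice).
    """
    keys = sorted(names)
    pairs = []
    for i, master in enumerate(keys):
        for candidate in keys[i + 1:]:
            score = SequenceMatcher(
                None, names[master].lower(), names[candidate].lower()
            ).ratio()
            if score >= threshold:
                pairs.append((candidate, master))
    return pairs


mapped_pairs = same_as_from_mapping(
    [{"duplicate_key": "SFDC|P-9", "master_key": "SAP|1001"}]
)
fuzzy_pairs = same_as_from_fuzzy(
    {"A|42": "Acme Corporation", "B|7": "ACME Corporation", "B|9": "Globex"}
)
```

Pairs from a mapping table belong to the raw data, while calculated pairs are the result of business logic and should be reviewed before they drive enterprise-wide dimensions.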

Security and Governance in Multi-Tenant Data Vaults

By embedding Tenant IDs throughout the model, row-level security becomes straightforward. Each record can be tied to a tenant, and access can be granted or denied accordingly. This simplifies compliance with data privacy regulations and contractual obligations.

Governance practices should also establish clear rules for:

  • Defining and maintaining Tenant IDs.
  • Managing ownership of global vs. tenant-specific data.
  • Regular audits of access controls and Same-as Links.

Best Practices for Multi-Tenant Data Vaults

  1. Add Tenant IDs early: Introduce them in staging to ensure consistency across the pipeline.
  2. Unify where possible: Standardize Satellites for common structures and customize only when necessary.
  3. Reserve global IDs: Create special identifiers for data with shared or unclear ownership.
  4. Secure with Tenant IDs: Use row-level security tied directly to the Tenant ID field.
  5. Leverage Same-as Links: Resolve duplicates to support enterprise-wide reporting.
  6. Design one common mart: Rely on row-level filtering instead of duplicating models per tenant.
  7. Scale incrementally: Start with an MVP and refine the model as you onboard new tenants.

Conclusion

Multi-tenant Data Vault design requires careful thought about uniqueness, ownership, and harmonization. By embedding Tenant IDs consistently across Hubs, Links, and Satellites, you not only preserve data integrity but also simplify governance and security. The Business Vault and Information Mart can then be designed to support both tenant-specific and enterprise-wide perspectives.

As organizations grow and onboard more clients or business units, this approach ensures scalability without overwhelming complexity. With clear governance, Same-as Links, and standardized mart designs, you can build a robust multi-tenant data warehouse that serves diverse needs while staying maintainable and secure.

Meet the Speaker

Michael Olschimke

Michael has more than 15 years of experience in Information Technology. During the last eight years he has specialized in Business Intelligence topics such as OLAP, Dimensional Modelling, and Data Mining. Challenge him with your questions!
