Defining the Error Mart

When working with data platforms that follow the Data Vault methodology, one often hears about components like the Raw Vault, Business Vault, and Information Marts. But among these well-known layers is a lesser-discussed yet critical structure: the Error Mart.

In this blog post, we take a comprehensive look at what an Error Mart is, what its main objectives are, and the best practices for designing one. The insights are based on a session led by Michael Olschimke, CEO of Scalefree, during a recent Data Vault Friday.



What is an Error Mart?

In traditional data warehousing approaches like Kimball, an Error Mart is used to store metrics about errors—for example, how many ETL jobs failed or which tables didn’t load successfully. These are primarily KPIs used for monitoring and are typically stored in what’s known as a Metrics Mart.

However, in the context of Data Vault 2.0, the Error Mart has a different, more tactical role: it acts as a catch-all for rejected records that fail to load during any of the staging or integration processes.

This could be due to a mismatch in expected data types, missing columns, or unexpected structural changes in the source data. These issues most frequently arise during:

  • Initial data ingestion from files, APIs, or real-time feeds
  • Loading data into the staging area or raw Data Vault
  • Applying hard rules based on schema assumptions
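To make this concrete, here is a minimal sketch in Python of a hard-rule check during staging that routes non-conforming records to the Error Mart. The expected schema, source name, and job name are illustrative assumptions, not part of the original session:

```python
# Minimal sketch of a hard-rule check during staging.
# EXPECTED_COLUMNS, the record source, and the process name are assumptions.
import json
from datetime import datetime, timezone

EXPECTED_COLUMNS = {"customer_id": int, "email": str}  # assumed target schema

def stage_record(record: dict, staged: list, error_mart: list) -> None:
    """Route a record to staging, or to the Error Mart if a hard rule fails."""
    for column, expected_type in EXPECTED_COLUMNS.items():
        if column not in record or not isinstance(record[column], expected_type):
            error_mart.append({
                "payload": json.dumps(record),           # raw record as JSON
                "load_date": datetime.now(timezone.utc).isoformat(),
                "record_source": "crm.customers",        # assumed source
                "process_id": "stg_customers_load",      # assumed job name
                "reason": f"hard rule failed on column '{column}'",
            })
            return
    staged.append(record)
```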

The Main Goal of an Error Mart

The primary goal of the Error Mart is to ensure that all incoming data—the good, the bad, and the ugly—is captured and traceable, even if it can’t immediately be loaded into the intended layer (such as the Raw Vault).

It’s a technical safety net that provides:

  • A secure location for rejected records
  • The ability to analyze and correct issues manually
  • A reprocessing workflow that ensures full data capture

The Error Mart is not meant for business logic errors (e.g., someone underage purchasing a product); rather, it handles technical discrepancies that prevent data from moving through the pipeline.

How Is It Structured?

Traditionally, one might create a separate error table for each data model. However, Michael Olschimke recommends a single flexible structure: a table that stores rejected records as JSON strings. This allows you to capture arbitrary, unexpected formats without predefined schemas.

Each record should be accompanied by key metadata:

  • Load date – Timestamp of ingestion
  • Record source – Source system or interface
  • Process identifier – The job or transformation that failed

This setup ensures that every error is auditable, traceable, and eventually resolvable.
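In Python terms, one possible shape for a row in that single error table looks like the sketch below; the column names, and the status flag that anticipates the reprocessing log discussed later, are illustrative rather than a prescribed standard:

```python
# One possible row shape for a single, generic error table (illustrative).
from dataclasses import dataclass

@dataclass
class ErrorMartRecord:
    payload: str        # the rejected record, serialized as a JSON string
    load_date: str      # timestamp of ingestion (ISO 8601, UTC)
    record_source: str  # source system or interface, e.g. "crm.customers"
    process_id: str     # the job or transformation that rejected the record
    status: str = "UNPROCESSED"  # lifecycle flag used later for reprocessing
```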

Best Practices for Designing an Error Mart

Here are some key considerations when building your Error Mart:

1. Flexibility in Structure

Since rejected data often doesn’t conform to expected schemas, use a structure that can handle variability. A single table using JSON or Parquet formats offers great flexibility, especially when stored in a data lake.
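As an example of the data lake option, a batch of rejected records could be persisted as a single Parquet file with pyarrow; the target path and row shape below are assumptions for illustration:

```python
# Sketch: persisting a batch of rejected records as Parquet in a data lake.
import pyarrow as pa
import pyarrow.parquet as pq

def write_error_batch(rows: list[dict], path: str) -> None:
    """Write rejected records (dicts with JSON payloads) to one Parquet file."""
    table = pa.Table.from_pylist(rows)  # infers a schema from the dicts
    pq.write_table(table, path)

# Example (hypothetical path):
# write_error_batch(error_mart, "lake/error_mart/2024-06-01.parquet")
```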

2. Avoid Over-Engineering

There’s no need to create one table per error type. One well-documented and meta-tagged table is usually sufficient.

3. Logging and Auditing

Implement a companion log table or file to track which records have been reprocessed. Instead of deleting processed error records, use a status flag or separate tracking log to preserve data lineage and maintain transparency.
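A minimal sketch of this flag-instead-of-delete pattern follows, using the standard-library sqlite3 driver as a stand-in for your warehouse; the table and column names are assumed:

```python
# Flag reprocessed error records instead of deleting them (sqlite3 stand-in).
import sqlite3

def mark_reprocessed(conn: sqlite3.Connection, load_date: str, process_id: str) -> None:
    """Set a status flag on error records instead of deleting them."""
    conn.execute(
        "UPDATE error_mart SET status = 'REPROCESSED' "
        "WHERE load_date = ? AND process_id = ?",
        (load_date, process_id),
    )
    conn.commit()
```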

4. Trigger Monitoring and Alerts

Your system should monitor the Error Mart for unprocessed records. Set up alerts via email, log monitoring tools such as CloudWatch or Graylog, or build dashboards that notify the data team when action is required.
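One simple implementation is a scheduled query that counts unprocessed records and raises an alert when any are found; the print call below is a placeholder for your actual alerting channel:

```python
# Sketch: scheduled check for unprocessed error records (sqlite3 stand-in).
import sqlite3

def check_error_mart(conn: sqlite3.Connection) -> int:
    """Count unprocessed error records and alert if any exist."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM error_mart WHERE status = 'UNPROCESSED'"
    ).fetchone()
    if count > 0:
        # Placeholder: replace with email, CloudWatch, Graylog, or a dashboard hook.
        print(f"ALERT: {count} unprocessed records in the Error Mart")
    return count
```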

5. Make It the Data Team’s Responsibility

A critical mindset shift: processing records in the Error Mart is not a business responsibility; it belongs to the data engineering team. Do not offload this to end users.

6. Reprocessing Workflow

Once the technical root cause is identified (e.g., an overly strict field length), update the hard rules, reload the rejected data from the Error Mart into the target layer, and mark it as processed in your log.
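A sketch of that loop under the same assumed table layout: read the unprocessed payloads, push them through the corrected load logic, and flag them as processed. The loader function is a hypothetical stand-in for your target-layer load:

```python
# Sketch: reprocess rejected records after the hard rule has been fixed.
import json
import sqlite3

def reprocess_errors(conn: sqlite3.Connection, load_to_target) -> None:
    """Reload unprocessed error records and flag them as reprocessed."""
    rows = conn.execute(
        "SELECT rowid, payload FROM error_mart WHERE status = 'UNPROCESSED'"
    ).fetchall()
    for rowid, payload in rows:
        record = json.loads(payload)
        load_to_target(record)  # hypothetical: insert into the target layer
        conn.execute(
            "UPDATE error_mart SET status = 'REPROCESSED' WHERE rowid = ?",
            (rowid,),
        )
    conn.commit()
```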

7. Error Mart in Every Layer

While most errors occur in the initial stages (staging, Raw Vault), you should prepare to capture errors at every layer—Business Vault and Information Mart included.

8. Binary Data Considerations

If your incoming data includes binary (BLOB) fields, you can MIME-encode them (Base64, for example) and store them alongside the error JSON, or separately in the data lake.
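For instance, a binary field can be Base64-encoded so that it survives inside the JSON payload; the field names here are illustrative:

```python
# Sketch: Base64-encode a binary field so it fits in the JSON payload.
import base64
import json

def encode_blob_record(record: dict, blob_field: str) -> str:
    """Return a JSON string with the given binary field Base64-encoded."""
    record = dict(record)  # copy, to avoid mutating the caller's dict
    record[blob_field] = base64.b64encode(record[blob_field]).decode("ascii")
    return json.dumps(record)
```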

Why the Error Mart Matters in Data Vault Architecture

Data Vault is built on the premise of complete and auditable data capture. To meet this principle, you must have a strategy for handling unexpected or failed data loads. The Error Mart acts as that strategy.

It’s not just a dumping ground for bad records—it’s a crucial feedback mechanism that helps you refine your ingestion and transformation rules, ensuring every piece of data, no matter how ugly, makes it into the platform.

Without an Error Mart, you risk data loss, broken lineage, and ultimately, lower trust in your data platform.

Conclusion

In summary, the Error Mart is an essential part of a resilient Data Vault architecture. It gives your data team the tools to identify, correct, and reprocess problematic data while maintaining auditability and trustworthiness.

If you’re implementing a Data Vault, don’t treat the Error Mart as an afterthought. Design it with flexibility, transparency, and process integration in mind. And remember: it’s your job to make sure no record gets left behind.

Meet the Speaker

Michael Olschimke

Michael has more than 15 years of experience in Information Technology. During the last eight years, he has specialized in Business Intelligence topics such as OLAP, Dimensional Modelling, and Data Mining. Challenge him with your questions!

