All Posts By

Sandra Krenn

Multi-Temporality in Data Vault 2.0 – Part 1

By Scalefree Newsletter No Comments

Part 1 – Theoretical understanding of multiple timelines in a data warehouse.

What is “Multi-Temporality” in a Data Warehouse?

Before we talk about multi-temporality, let’s first define the term bi-temporality, because a common misconception is that Data Vault 2.0 is only bi-temporal (which is wrong):

“Bitemporal Modeling is a specific case of Temporal database information modeling technique designed to handle historical data along two different timelines. This makes it possible to rewind the information to “as it actually was” in combination with “as it was recorded” at some point in time.” (According to: https://en.wikipedia.org/wiki/Bitemporal_Modeling)

Bi-temporality just addresses two timelines, which are commonly referred to as “System Time” (the technical timeline) and “Valid Time” (the business timeline). Data Vault Satellites, Point-in-Time tables (PIT) and Bridge tables are able to address multiple active timelines in the same record. Let’s categorise just some of them: Read More
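To make the two timelines named above more concrete, here is a minimal Python sketch of a bi-temporal lookup. It is a generic illustration, not the Data Vault Satellite or PIT structure the post goes on to discuss: the `PriceRecord` layout, the sample `HISTORY` rows and the `price_as_of` helper are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class PriceRecord:
    # Hypothetical record layout, for illustration only.
    product: str
    price: float
    valid_from: date   # "Valid Time": when the price applies in the business
    recorded_on: date  # "System Time": when the warehouse recorded the row


# Made-up sample data: the second row is a late-arriving correction that
# was recorded in March but applies retroactively from February.
HISTORY = [
    PriceRecord("widget", 10.0, valid_from=date(2021, 1, 1), recorded_on=date(2021, 1, 1)),
    PriceRecord("widget", 12.0, valid_from=date(2021, 2, 1), recorded_on=date(2021, 3, 15)),
]


def price_as_of(
    history: List[PriceRecord], product: str, valid_at: date, known_at: date
) -> Optional[float]:
    """Price valid at `valid_at`, using only rows recorded on or before `known_at`."""
    candidates = [
        r
        for r in history
        if r.product == product
        and r.valid_from <= valid_at
        and r.recorded_on <= known_at
    ]
    if not candidates:
        return None
    # Latest business validity wins; ties are broken by the most recent recording.
    best = max(candidates, key=lambda r: (r.valid_from, r.recorded_on))
    return best.price


# "As it was recorded" on 20 February: the correction is not yet known.
as_recorded = price_as_of(HISTORY, "widget", date(2021, 2, 10), known_at=date(2021, 2, 20))
# "As it actually was", asked after the correction was loaded.
as_it_was = price_as_of(HISTORY, "widget", date(2021, 2, 10), known_at=date(2021, 4, 1))
```

The same business date yields two different answers depending on the system date used for the query, which is exactly the distinction between the two timelines.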

Open Source Production Grade Data Integration – Part 2

By Scalefree Newsletter No Comments

Meltano in action

In our last overview, we talked about Meltano and its architecture. Now, we would like to illustrate how easily you can use Meltano to create a data integration pipeline.
Before we start, please ensure that you have already installed Meltano on your machine. If you haven’t yet, you can follow Meltano’s official installation guide.

First, we will initialize a Meltano project.
Create the new project in a directory of your choice by using “meltano init”.


Open Source Production Grade Data Integration – Part 1

By Scalefree Newsletter No Comments
In a previous blog post, we introduced Singer, an open source framework for ELT processes. Singer can be wrapped by another open source tool, Meltano, which adds features such as installation, environment setup, monitoring, scheduling and orchestration. At Scalefree, we moved all of our ELT pipelines into this framework on AWS and are pleased with the results.

 

There are many platforms for managing data integration, but robust, easy-to-use, free open source solutions are scarce. The Meltano project aims to fill that gap. Meltano is a full-package data integration platform that challenges the most established players in the data space. It is built on top of the best open source tools for data integration and infuses them with DataOps best practices.

Meltano is the easiest way to build, run and orchestrate ELT pipelines made up of Singer taps, targets and dbt models. It is open source, self-hosted and version controlled as well as containerized.
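For readers new to Singer, the tap and target roles mentioned above can be sketched in a few lines of Python. This is a toy illustration of the Singer message exchange (JSON lines of type SCHEMA, RECORD and STATE), not Meltano's API: the `run_tap` and `run_target` functions, the `users` stream and the sample rows are all hypothetical, and an in-memory buffer stands in for the pipe that normally connects a real tap process to a real target process.

```python
import io
import json


def run_tap(out):
    """Toy tap: writes Singer-style SCHEMA, RECORD and STATE messages as JSON lines."""
    messages = [
        {
            "type": "SCHEMA",
            "stream": "users",
            "schema": {
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                }
            },
            "key_properties": ["id"],
        },
        {"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "Ada"}},
        {"type": "RECORD", "stream": "users", "record": {"id": 2, "name": "Grace"}},
        # STATE lets a later run resume where this one stopped.
        {"type": "STATE", "value": {"users": 2}},
    ]
    for message in messages:
        out.write(json.dumps(message) + "\n")


def run_target(inp):
    """Toy target: collects records per stream and remembers the last state seen."""
    loaded, state = {}, None
    for line in inp:
        message = json.loads(line)
        if message["type"] == "RECORD":
            loaded.setdefault(message["stream"], []).append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
    return loaded, state


# Stand-in for the pipe between the two processes.
buffer = io.StringIO()
run_tap(buffer)
buffer.seek(0)
loaded, state = run_target(buffer)
```

Because the two sides only share this message format, any tap can in principle be combined with any target, which is what a tool like Meltano builds on when it installs and runs them for you.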

Meltano’s open source model lets you easily adapt it to your own needs and reduces cost.