
Data Mining in the Data Vault Architecture

Watch the Video

In our ongoing series, CEO Michael Olschimke addresses a viewer’s question:

“We have a data mining model to be applied during information delivery. Where does it fit in the Data Vault 2.0 architecture?”

The viewer inquires about integrating a data mining model into the Data Vault 2.0 architecture specifically for information delivery. They seek guidance on where this data mining aspect fits within the broader Data Vault framework.

Michael delves into the topic of Data Mining in the context of Data Vault 2.0. He provides insights into the strategic placement of data mining models within the architecture, emphasizing their role in enhancing information delivery processes. Michael’s response sheds light on how organizations can effectively leverage data mining techniques to extract valuable insights while adhering to the principles of the Data Vault methodology.

This episode serves as a valuable resource for those navigating the intersection of data mining and Data Vault 2.0, offering practical guidance on seamlessly integrating data mining models into the architecture.

Data Vault Naming Conventions

Watch the Video

In our continuous series, CEO Michael Olschimke addresses a viewer’s question regarding naming conventions in Data Vault:

“What naming conventions do you recommend for the Data Vault model?”

The viewer seeks advice on the recommended naming conventions for structuring the Data Vault model. Recognizing the significance of clear and standardized naming in data modeling, the question focuses on eliciting practical insights and guidelines.

Michael shares his expertise on effective naming conventions tailored for Data Vault models. He emphasizes the importance of consistency, clarity, and meaningful names to enhance the comprehensibility and maintainability of the Data Vault structure. By providing practical recommendations, Michael aids viewers in establishing robust naming conventions aligned with best practices in Data Vault modeling.

This episode serves as a valuable resource for data professionals aiming to optimize their Data Vault models through well-defined and organized naming conventions.

Agile Data Warehousing: Addressing the Hard Problems

Watch the Webinar

The world moves at a rapid pace, and your organization must be able to respond to changing conditions. Your data warehouse (DW) team is being asked to help end users answer new questions to gain new insights.

These requests are coming in at an increasing pace and are increasingly complex. Your team(s) need to adopt an agile data warehousing strategy, but are struggling to address common challenges when trying to do so.

In this session, Scott Ambler addresses a series of difficult questions that DW practitioners need answers to if they are to learn how to work in an agile manner.

Watch Webinar Recording

Modelling the Date Dimension in Data Vault

Watch the Video

In our continuous series, CEO Michael Olschimke delves into a question from the audience about how to model the date dimension in Data Vault:

“In many data sources we get data with a DATE data type. In some cases we want to use a Time-Dimension for this fields. How would you model this in Data Vault:

  •  As Time-Hub in Raw Vault and referencing that Hub in a Link?
  • As Time-Reference Table and then joining that in the IM? Should the Time Dimension hold a Hash Key as Dimension Key for that, or the Business Key (date)?
  • Or both options?”

The viewer raises a pertinent query regarding the modeling of DATE data types from various sources within the Data Vault modeling framework. The focus is on incorporating a Time-Dimension for these date fields, presenting multiple options for consideration.

Michael explores potential solutions, shedding light on two prominent strategies:

Time-Hub in Raw Vault: Creating a dedicated Time-Hub in the Raw Vault and referencing it in a Link. This approach involves establishing a distinct hub for time-related data, providing a structured foundation for subsequent processing.

Time-Reference Table in Information Mart (IM): Alternatively, considering a Time-Reference Table in the IM and joining it as needed. The discussion delves into the nuances of choosing between a Hash Key and Business Key (date) for the Time Dimension, offering insights into the implications of each choice.
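
To make the key question concrete, the sketch below builds a small date reference table in Python and shows both candidate dimension keys side by side: the date itself (the business key) and a hash key derived from it. The function name `build_date_reference`, the column names, and the choice of MD5 over the ISO-formatted date are illustrative assumptions, not a recommendation taken from the video.

```python
import hashlib
from datetime import date, timedelta
from typing import Dict, List


def build_date_reference(start: date, end: date) -> List[Dict]:
    """Return one row per calendar day from start to end (inclusive).

    Hypothetical layout: each row carries the date as business key plus
    a hash key computed from its ISO string, so either one could serve
    as the dimension key in an information mart.
    """
    rows = []
    current = start
    while current <= end:
        iso = current.isoformat()
        rows.append({
            "date_bk": current,  # business key: the date itself
            "date_hk": hashlib.md5(iso.encode("utf-8")).hexdigest(),  # assumed hash key
            "year": current.year,
            "month": current.month,
            "day": current.day,
            "iso_weekday": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        })
        current += timedelta(days=1)
    return rows


rows = build_date_reference(date(2024, 2, 28), date(2024, 3, 1))
```

Because the date is already a compact, stable, human-readable value, a join on `date_bk` works without any lookup, whereas `date_hk` keeps the dimension key uniform with other hash-keyed dimensions. The sketch only shows that both keys can coexist in one table.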

Michael’s insights provide valuable guidance for navigating the complexities of modeling date dimensions within the Data Vault paradigm. By weighing the pros and cons of different approaches, viewers gain a deeper understanding of how to effectively integrate time-related data into their Data Vault architecture.

How to Get Rid of Data Vault Load End Date

Watch the Video

In our continuous series, CEO Michael Olschimke addresses a question from the audience regarding the Load End Date in Data Vault:

“What are the options to virtualize the load end date in Data Vault? We work on an embedded solution and the Window function (LEAD/LAG) is not fast enough.”

The viewer raises a critical concern regarding the virtualization of the load end date in a Data Vault environment, especially in the context of an embedded solution. The challenge lies in the performance limitations of certain window functions like LEAD/LAG, prompting the exploration of alternative options.

Michael delves into potential strategies and solutions for efficiently managing load end dates in Data Vault. The discussion encompasses various approaches to enhance virtualization, ensuring optimal performance without compromising speed or efficiency.

By sharing insights into the intricacies of load end date virtualization, Michael provides valuable guidance for organizations grappling with embedded solutions. The exploration of alternatives offers a nuanced perspective on optimizing this crucial aspect within the Data Vault framework.
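
For readers unfamiliar with the term, a virtualized load end date is computed at query time from the load dates in a satellite rather than stored and updated. The Python sketch below reproduces what a LEAD-style window function yields: each row's end date is the load date of the next row for the same hash key, and the latest row receives an open-ended sentinel. The row layout, the function name `derive_load_end_dates`, and the sentinel value are assumptions for illustration; this is not one of the alternatives discussed in the video.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Assumed sentinel marking the currently active satellite row.
END_OF_TIME = datetime(9999, 12, 31)


def derive_load_end_dates(rows):
    """Attach a computed load_end_date to each satellite row.

    rows: iterable of dicts with 'hash_key' and 'load_date' (hypothetical names).
    Returns new dicts; the input rows are left untouched.
    """
    ordered = sorted(rows, key=itemgetter("hash_key", "load_date"))
    result = []
    for _, group in groupby(ordered, key=itemgetter("hash_key")):
        versions = list(group)
        # Pair each version with its successor; the last one has none.
        successors = versions[1:] + [None]
        for row, nxt in zip(versions, successors):
            end = nxt["load_date"] if nxt else END_OF_TIME
            result.append({**row, "load_end_date": end})
    return result


satellite = [
    {"hash_key": "a1", "load_date": datetime(2024, 1, 1), "city": "Hanover"},
    {"hash_key": "a1", "load_date": datetime(2024, 3, 1), "city": "Berlin"},
    {"hash_key": "b2", "load_date": datetime(2024, 2, 1), "city": "Munich"},
]
timeline = derive_load_end_dates(satellite)
```

The sort over all rows is the expensive step here, which mirrors why window functions can become a bottleneck on constrained platforms.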

Salesforce Sales Cloud Standard Data Model: Exploring the Fundamentals

Organize Your Customers

Summary

Gain a comprehensive understanding of the Salesforce Sales Cloud Standard Data Model through our informative video guide. Delve into critical objects and their relationships using the Schema Builder, acquiring insights into the foundational structures that drive effective sales management. Tailored for Salesforce Sales Cloud beginners, this resource equips you with knowledge to optimize your data model, establishing a robust foundation for successful customer engagement.


Introduction to Salesforce Sales Cloud Standard Data Model

Embark on a journey into the core of Salesforce Sales Cloud as we unveil the intricacies of the Standard Data Model. This video meticulously highlights essential objects, emphasizing their crucial roles in constructing a coherent and scalable framework for managing sales processes.


Key Highlights

Objects Overview: Uncover the significance of pivotal objects such as Accounts, Contacts, Opportunities, and Leads in shaping your sales ecosystem.

Schema Builder Visualization: Experience the power of Schema Builder as we visually illustrate the interconnections between different objects, offering a clear and intuitive grasp of your data model.

Importance of Standard Data Model: Delve into the critical role played by the Sales Cloud Standard Data Model, providing a standardized and scalable approach to organizing customer information and sales activities.


Target Audience

This video is tailored for individuals entering the realm of Salesforce Sales Cloud. Whether you’re a newcomer or seeking a refresher, our content simplifies the complexities of the standard data model. Beginners will appreciate the step-by-step exploration, while seasoned users may gain new insights into optimizing their Salesforce setup.

Watch the Video

Mastering Salesforce Lead Conversion: A Quick Guide for Beginners

Your Lead Has Been Converted

Summary

Unlock the secrets of Salesforce Lead Conversion in our comprehensive video guide. Dive into essential activities, from call logging to task creation, and discover the pivotal role of effective lead conversion in maximizing sales efficiency. Tailored for beginners in Salesforce Sales Cloud, this resource empowers you to optimize your sales pipeline, enhance customer engagement, and propel your business to new heights. Elevate your strategy with our expert insights for sustained success.


Introduction to Lead Conversion Process in Salesforce

Welcome to an insightful exploration of the Lead Conversion process in Salesforce, a pivotal element in the realm of sales and customer relationship management. This video serves as a comprehensive guide, demystifying the intricacies of converting potential leads into valuable opportunities.


Overview of Lead Conversion

In the dynamic landscape of sales, converting leads is a critical step towards transforming prospects into customers. The Lead Conversion process in Salesforce is a strategic approach that streamlines this transition, ensuring that no valuable information is lost in the conversion journey. From initial contact to closing deals, every step is meticulously orchestrated to maximize efficiency and effectiveness.


Understanding Salesforce Activities

Our video delves into the various activities that play a crucial role in managing leads. From logging calls to creating tasks and tracking interactions, Salesforce offers a robust set of tools to streamline communication and ensure that every lead is nurtured effectively.


Why it Matters

Effective lead conversion is the lifeblood of successful sales operations. It is not merely a checkbox exercise but a dynamic process that fosters a deeper connection with your potential customers. By understanding and optimizing the Lead Conversion process, businesses can enhance their sales pipeline, improve customer engagement, and ultimately drive revenue growth.

We demystify the Lead Conversion process, empowering you to unlock the full potential of your sales endeavors within the Salesforce ecosystem. Elevate your sales strategy, engage leads effectively, and propel your business towards sustained success.


Target Audience

This video caters to a diverse audience, particularly those navigating the Salesforce Sales Cloud for the first time. Beginners in the Salesforce ecosystem will find valuable insights into the Lead Conversion process, gaining a solid foundation to harness the full potential of Salesforce Sales Cloud. Whether you’re a sales professional, business owner, or someone curious about optimizing lead management, this video provides a user-friendly entry point into the world of Salesforce.

Watch the Video

Column Store Compression vs Page Compression

Watch the Video

In our ongoing series, Michael Olschimke, our CEO, delves into a viewer’s question:

“During boot camp, Michael advises switching on page compression on the MS SQL server platform. Now when the platform supports column compression (via column store indexes) would you advise it instead? Do you have any project experience with it?”

The viewer seeks Michael’s insights on a specific query that emerged during a boot camp session. The question revolves around the recommendation to enable page compression on MS SQL Server and whether, with the advent of column compression via column store indexes, Michael would now advise using the latter. The viewer is particularly interested in understanding Michael’s project experiences related to this choice.

In response, Michael shares his expertise on the nuances of compression techniques in the context of MS SQL Server. He discusses the merits of both page compression and column store indexes, offering practical insights into their application based on project experiences. This session provides valuable guidance for optimizing storage and performance in the MS SQL Server platform, exploring the trade-offs and benefits of each compression approach.

For those navigating the complexities of compression choices within their SQL Server environment, Michael’s knowledge in this episode offers clarity and actionable recommendations, making it a valuable resource for Data Vault practitioners.

Agile Data Warehousing: An Introduction

Watch the Webinar

This webinar gives an overview of a disciplined, hybrid approach to agile data warehousing.

This proven methodology is mostly agile, adopting great ideas from lean and traditional ways of working (WoW) to address the shortcomings of a pure agile approach. Data warehouse (DW) teams work differently than software development teams because they face fundamentally different challenges.

As a result, methods that work well for agile software development prove insufficient for building and evolving DWs. Learn how agile DW works in practice, working through the full lifecycle from beginning to end and then back again.

Watch Webinar Recording

Capturing CDC Flags in Data Vault

Watch the Video

In our ongoing series, Michael Olschimke, our CEO, engages with a viewer’s question:

“If you get CDC data and insert/update/delete flag, then would you split it to separate satellite table or leave it on regular sat?”

The specific query revolves around handling CDC (Change Data Capture) data, particularly when faced with insert/update/delete flags. The viewer seeks guidance on whether to split this data into separate satellite tables or retain it within a regular satellite.

In response, Michael addresses the intricacies of managing CDC flags within the Data Vault framework. He explores the pros and cons of both approaches, providing valuable insights into the considerations that influence the decision-making process. This session offers practical advice on structuring satellite tables to effectively capture and utilize CDC flags in a Data Vault environment.
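
The two layouts in the question can be contrasted with a small Python sketch. `load_single_satellite` keeps the CDC flag as one more column on the regular satellite, while `load_split_satellites` routes the descriptive payload to one satellite and the flag to a separate one. The record layout, the flag values ("I", "U", "D"), and the function names are hypothetical; the sketch illustrates the choice without implying which option the video recommends.

```python
def load_single_satellite(cdc_records):
    """Option 1: keep the CDC flag alongside the descriptive attributes."""
    return [
        {"hash_key": r["hash_key"], "load_date": r["load_date"],
         "cdc_flag": r["op"], **r["payload"]}
        for r in cdc_records
    ]


def load_split_satellites(cdc_records):
    """Option 2: descriptive data and CDC flags go to separate satellites."""
    descriptive, flags = [], []
    for r in cdc_records:
        key = {"hash_key": r["hash_key"], "load_date": r["load_date"]}
        flags.append({**key, "cdc_flag": r["op"]})
        # Assumption: a delete carries no new attribute values, so it adds
        # a row to the flag satellite only.
        if r["op"] != "D":
            descriptive.append({**key, **r["payload"]})
    return descriptive, flags


cdc_records = [
    {"hash_key": "a1", "load_date": "2024-01-01", "op": "I", "payload": {"name": "Ann"}},
    {"hash_key": "a1", "load_date": "2024-02-01", "op": "U", "payload": {"name": "Anna"}},
    {"hash_key": "a1", "load_date": "2024-03-01", "op": "D", "payload": {}},
]
single = load_single_satellite(cdc_records)
descriptive, flags = load_split_satellites(cdc_records)
```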

For those grappling with the nuances of integrating CDC data seamlessly into their Data Vault setup, Michael’s expertise provides clarity and actionable recommendations, making this episode of Data Vault Insights a valuable resource.

The Potentials of Microsoft Fabric for Business Intelligence Solutions

Microsoft Fabric

Microsoft Fabric for Business Intelligence

Microsoft Fabric is an all-in-one cloud-based analytics platform that provides a unified environment for data professionals and business users to collaborate on data solutions. It is a powerful analytics platform that helps businesses automate workflows, improve productivity, and gain insight from their data.

In today’s data-driven world, businesses are increasingly turning to business intelligence (BI) solutions to gain insight from their data. BI solutions can help businesses improve their decision-making, optimize their operations, and gain a competitive edge.

The Potentials of Microsoft Fabric for Business Intelligence Solutions

Microsoft Fabric is a new data analytics platform that has the potential to change the way businesses make decisions. With Fabric, organizations can ingest, store, process, and analyze all of their data in one place, using a unified set of tools and services. This makes it possible to get insights from data faster and more easily.

In this webinar, we explore the potential of Microsoft Fabric for business intelligence solutions. We discuss how Fabric can be used to improve data quality, streamline analytics workflows, and deliver insights to users.

The webinar is aimed at everyone who wants to learn how to get more value from data with Microsoft Fabric, and especially at data engineers and analysts, business intelligence professionals, and IT decision-makers. Register now for this free webinar and learn how Microsoft Fabric can help you take your business intelligence solutions to the next level!

Watch webinar recording

What to expect

This newsletter will take you on a journey to discover the transformative power of Microsoft Fabric for business intelligence solutions. You will explore the various workloads and experiences that Microsoft Fabric offers, gaining a comprehensive understanding of its capabilities. You will also uncover the benefits that Microsoft Fabric brings to data-driven decision-making, enabling you to make informed choices that propel your business forward. Finally, you will delve into the potential that Microsoft Fabric holds for enhancing business intelligence solutions, empowering you to unlock new levels of insight and decision-making capabilities.

To dive even deeper, watch the free webinar recording on this topic.

What is Microsoft Fabric?

Microsoft Fabric is an all-in-one cloud-based analytics platform that provides a unified environment for data professionals and business users to collaborate on data solutions. Fabric offers a suite of integrated services that enable you to collect, store, process, and analyze data in a single platform, which is built on a foundation of Software-as-a-Service (SaaS).

Microsoft Fabric provides tools for people with all levels of data expertise and connects with the tools businesses use to make decisions.
Fabric itself is an umbrella for the following Microsoft cloud-based services, which together constitute the Microsoft analytics portfolio:

  • Azure Data Factory
  • Azure Event Hubs
  • Azure Data Explorer
  • Azure Artificial Intelligence
  • Azure Databricks
  • Azure Synapse Spark Pools
  • Azure Synapse Analytics
  • and Microsoft Power BI

The above items have been re-tooled and taken to the next level inside Microsoft Fabric.

Workloads of Microsoft Fabric

Fabric includes the following workloads or experiences:

  • Data integration
  • Data Engineering
  • Data Warehousing
  • Data Science
  • Real-time analytics
  • Business intelligence
  • Insight to action

The foundation of these experiences in Fabric is the data lake, which is known as OneLake. The below picture illustrates the concept of Microsoft Fabric.

[Figure: overview of Microsoft Fabric]

Let’s review every component of Fabric in a bit more detail.

Data Integration

Microsoft Fabric’s data integration workload, called Data Factory, brings data movement capabilities to both dataflows and data pipelines.

  • Dataflows offer flexible and user-friendly ways to transform data with over 300 transformations. They are built on the familiar Power Query experience, which is available in several Microsoft products and services, such as Excel, Power BI, Power Platform, and others.
  • Data pipelines let you create flexible data workflows to meet your organizational goals. You can use the built-in data orchestration features to refresh your dataflows, process large datasets, and define complex control flow pipelines.

Data Engineering

Data Engineering Experience or Synapse Data Engineering offers a top-tier Spark platform, fostering rich authoring experiences. This empowers data engineers to conduct extensive data transformations and facilitates widespread access to data via the lakehouse.

Microsoft Fabric provides various data engineering capabilities to ensure that your data is easily accessible, well-organized, and of high quality. From the data engineering homepage, you have the following options:

  • Lakehouse
    Create and manage lakehouses. A lakehouse is a logical location in OneLake where you store and manage structured and unstructured data using various tools and frameworks. You can even mount an external storage account into your lakehouse with the Shortcut feature.
    You can use the SQL Endpoint to query lakehouse tables, but it supports read-only queries only.
  • Notebook
    Write and run code in popular programming languages, like Python, R, and Scala. Leverage notebooks for data ingestion, transformation, analysis, and other data processing tasks.
  • Environment
    Within an environment, you have the flexibility to choose from a variety of Spark runtimes, configure your computational resources, and incorporate libraries – either from public repositories or by uploading locally-built custom libraries. Attaching these environments to your notebooks and Spark job definitions is a seamless process.
  • Spark Job Definition
    Define, schedule, and manage Spark jobs to process big data in your lakehouse, apply transformation logic to the data, and more.
  • Data Pipeline
    Design and orchestrate pipelines to copy data into your lakehouse, schedule Spark jobs and notebooks to process high-volume data, and automate data workflows via integration with Data Factory.

Data Warehousing

The data warehouse, or Synapse Data Warehouse, is a lake-centric repository for storing and analyzing structured data, built on a distributed processing engine. The data warehousing workload benefits from the rich capabilities of the SQL engine over the open Delta Lake format: data is stored in OneLake as Parquet files accompanied by Delta Lake logs, which enable ACID transactions. A warehouse can contain only structured data. Here you can not only read data with SQL but also execute inserts and updates.

Data Science

The Data Science Experience, or Synapse Data Science, allows for the creation, deployment, and operationalization of machine learning models. Data scientists are empowered to enrich organizational data with predictions and allow business analysts to integrate those predictions into their BI reports. Data Science in Fabric provides you with the following options:

  • ML model
    Leverage machine learning models to forecast results and identify irregularities within datasets.
  • Experiment
    Engage in the experimentation phase by generating, executing, and monitoring the evolution of various models for validating hypotheses.
  • Notebook
    Utilize the Notebook feature to delve into data exploration and construct machine learning solutions through Apache Spark applications.
  • Environment
    This option has the same purpose as in the Data Engineering experience.

Real-Time Analytics

Real-Time Analytics in Fabric, or Synapse Real-Time Analytics, is a fully managed big data analytics platform optimized for streaming and time-series data. It is queried with the Kusto Query Language (KQL), and its engine delivers exceptional performance when searching structured, semi-structured, and unstructured data. Real-Time Analytics is fully integrated with the entire suite of Fabric products for data loading, data transformation, and advanced visualization scenarios. Real-Time Analytics in Fabric has three components:

  • KQL Database: the place to store streaming data. It uses OneLake as the underlying storage system.
  • KQL Queryset: run queries on your data to produce shareable tables and visuals. Save, manage, export, and share KQL queries. A KQL Queryset plays a role similar to that of SSMS for a SQL database.
  • Eventstream: capture, transform, and route real-time event streams to various destinations in the desired format with a no-code experience. It is a hub for streaming data, where multiple sources (including Event Hubs) and various destinations (including a KQL database) can be configured.

Business Intelligence

From the BI side, Fabric has Power BI, one of the world’s leading Business Intelligence platforms. It represents a set of services, tools, and connectors that turn your data into interactive visual reports and dashboards. In Fabric, there are some new features and enhancements to work with Fabric objects and experiences. Among these are:

  • Direct Lake mode, a connection type that loads Parquet-formatted files directly from the data lake without the need to import the data into a dataset.
  • Integration with Synapse Real-Time Analytics to produce real- or near-real-time reports.
  • Semantic link, which allows loading data from Power BI datasets into the Data Science experience and others.

Insight to Action

Insight to action in Fabric offers Data Activator, which is an experience for automatically triggering actions when certain conditions are met in the incoming data. It works with Eventstreams for real-time data and Power BI for batch data. There are three possible actions that can be configured after the conditions are detected:

  • Email: get notified by email.
  • Teams message: send a notification to an individual or a channel in Teams.
  • Custom action: call a Power Automate workflow.
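
The condition-then-action pattern described above can be pictured with a plain Python sketch. This is not Data Activator's API: the `Rule` class, the `evaluate` function, and the event fields are hypothetical stand-ins, and the action only builds the text of a notification rather than sending one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical rule: when condition(event) is true, run action(event)."""
    name: str
    condition: Callable[[Dict], bool]
    action: Callable[[Dict], str]


def evaluate(rules: List[Rule], events: List[Dict]) -> List[str]:
    """Check every incoming event against every rule and collect fired actions."""
    fired = []
    for event in events:
        for rule in rules:
            if rule.condition(event):
                fired.append(rule.action(event))
    return fired


rules = [
    Rule(
        name="freezer too warm",
        condition=lambda e: e["temperature_c"] > -15,
        action=lambda e: f"email: {e['sensor']} reads {e['temperature_c']} C",
    ),
]
events = [
    {"sensor": "freezer-1", "temperature_c": -18},
    {"sensor": "freezer-2", "temperature_c": -9},
]
notifications = evaluate(rules, events)
```

Only the second event crosses the assumed threshold, so a single notification is produced for `freezer-2`.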

Data Lake

OneLake is the central component of all Fabric services. It is a built-in, unified data lake that stores all organizational data and is used by all Fabric experiences. OneLake uses ADLS Gen2 as its underlying storage.

To simplify management across the organization, OneLake is organized hierarchically. Each tenant has only one OneLake instance, which provides a single namespace that extends over users, regions, and even clouds. For easy handling, data in OneLake is divided into manageable containers.

Similar to Microsoft OneDrive, any developer or business unit in the tenant can create their own workspaces in OneLake. They can ingest data into their own lakehouses, and start processing, analyzing, and collaborating on the data. All Fabric experiences are bound to and operate on top of OneLake.

So What Is Microsoft Fabric’s Potential?

Microsoft Fabric has the potential to revolutionize the way that businesses build and deploy business intelligence solutions. Here are some of the key benefits:

  • Unified data platform: Microsoft Fabric provides a unified data platform for storing and analyzing all types of data, including structured, semi-structured, and unstructured data. This means that businesses can use a single set of tools and services to manage all aspects of their data pipeline, from data ingestion and preparation to data analysis and visualization. This can help to reduce complexity and improve efficiency.
  • End-to-end analytics: Microsoft Fabric provides a complete set of capabilities for building and deploying end-to-end analytics solutions. This includes data engineering, data science, data warehousing, and business intelligence. This simplifies the development and deployment of analytics solutions and reduces the need for specialized expertise.
  • AI-powered insights: Microsoft Fabric embeds AI and machine learning capabilities throughout the platform. This can be used to automate tasks, generate insights, and improve the accuracy of predictions. This can help businesses to make better decisions faster.
  • A unified platform accommodating diverse expertise: Data professionals with varying backgrounds can leverage their skill sets to collaborate on a singular data solution. This inclusive environment allows individuals from different disciplines and proficiency levels to contribute their expertise to a common data project.
  • Reduced data silos: Fabric eliminates the need to move data between various systems, making it easier to get a complete view of your business.
  • Reduced costs: Microsoft Fabric can help businesses reduce the costs associated with building and deploying business intelligence solutions. This is because it eliminates the need to purchase and maintain multiple systems and it provides a number of features that can automate tasks and improve efficiency.

In essence, Microsoft Fabric promises to make the development and deployment of robust business intelligence solutions simpler, more accessible, and more cost-effective for businesses of any size.

Conclusion

Microsoft Fabric represents a paradigm shift in the realm of business intelligence solutions. With its unified data platform, end-to-end analytics capabilities, and AI-driven insights, Fabric streamlines data management from collection to visualization. The platform’s ability to cater to diverse skill sets and backgrounds, reduce data silos, and automate tasks illustrates its potential to revolutionize the industry.

Through its components, such as Data Factory, Data Engineering, Data Warehousing, Data Science, Real-Time Analytics, Power BI, Data Activator, and the robust foundation of OneLake, Microsoft Fabric promises a future where businesses can efficiently harness the power of data to make informed decisions and drive innovation. Ultimately, it has the potential to democratize powerful business intelligence solutions, making them more accessible and cost-effective for enterprises of all scales.
