
dbt Fusion Demo

This is the second installment in our dbt Expert Series. In the first video, we introduced dbt Fusion, explored what it is, why it matters, and highlighted its core capabilities: dialect error validation, state-based orchestration, and efficient testing. If you have not watched that video yet, we recommend doing so before continuing here.

In this session, Scalefree principal consultant Metropolis takes those concepts into a live demo environment to show how they behave in practice. The result is a clear picture of how dbt Fusion can reduce costs, eliminate redundant work, and catch errors before they become expensive problems.



The demo project: Hub Speak Base

To ground the demo in something realistic, Metropolis built a project called Hub Speak Base. It includes four data sources — customer, line item, orders, and part — along with seven models: one staging model per source, two dimension models (customer and part), and a fact orders model. Unique and not-null tests are configured on both the sources and the staging models, and the mart models include unit tests defined in a star schema YAML file. This gives a solid foundation for demonstrating all three major Fusion capabilities without abstracting away the messiness of real-world pipelines.

Dialect error validation in the IDE

One of the most immediately useful features of dbt Fusion is its ability to catch SQL errors before a query ever leaves the IDE. This is what Fusion calls dialect error validation, and the demo shows it working in two distinct scenarios.

First, Metropolis demonstrates column reference checking. In the fact orders model, he intentionally references a column called l_order_keys instead of the correct l_order_key. Fusion flags this instantly: hovering over the incorrect reference surfaces an error message explaining that the column l_order_keys cannot be found. The same error appears in the problems panel below the editor, making it hard to overlook.

Second, he tests function name validation by changing the round function to a fictitious roundy. Fusion flags this as well. When he attempts to build the model anyway, Fusion does not even send the query to Snowflake. Instead, it stops during static analysis and throws an error immediately — saving both time and warehouse compute.
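The idea of stopping a build during static analysis can be illustrated with a small Python sketch. This is a toy, not how Fusion's engine works: the schema, the function catalog, the table name stg_line_item, and the helper names validate and build are all assumptions made for illustration, and the "parser" is a simple regular expression that only handles a flat SELECT.

```python
import re

# Hypothetical catalog for this sketch; a real engine would derive it from
# the warehouse schema and the SQL dialect's function list.
KNOWN_COLUMNS = {"stg_line_item": {"l_order_key", "l_extended_price", "l_discount"}}
KNOWN_FUNCTIONS = {"round", "sum"}
KEYWORDS = {"select", "from", "as", "group", "by"}


def validate(sql, table):
    """Return error messages for unknown functions and columns in a flat SELECT."""
    errors = []
    previous = ""
    for match in re.finditer(r"[A-Za-z_][A-Za-z0-9_]*", sql):
        name = match.group().lower()
        rest = sql[match.end():].lstrip()
        if rest.startswith("("):
            # An identifier followed by "(" is treated as a function call.
            if name not in KNOWN_FUNCTIONS:
                errors.append(f"unknown function: {name}")
        elif name in KEYWORDS or name == table or previous == "as":
            pass  # keyword, table name, or output alias: nothing to check
        elif name not in KNOWN_COLUMNS[table]:
            errors.append(f"column not found: {name}")
        previous = name
    return errors


def build(sql, table, run_on_warehouse):
    """Send the query to the warehouse only if static validation passes."""
    errors = validate(sql, table)
    if errors:
        return errors  # stop here: no warehouse compute is spent
    run_on_warehouse(sql)
    return []


sent = []
bad = "select l_order_keys, roundy(l_extended_price, 2) as price from stg_line_item"
print(build(bad, "stg_line_item", sent.append))
# ['column not found: l_order_keys', 'unknown function: roundy']
print(sent)  # [] because the query never left the client
```

The point of the sketch is the ordering: validation runs first, and the warehouse callback is reached only when the error list is empty.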

It is worth noting a current limitation: as of the time of recording, not all SQL errors are caught by Fusion’s static analysis engine. For example, a call to the round function with a third argument that Snowflake rejects is not yet flagged locally; the query is sent to Snowflake, where it fails at runtime. Since Fusion is still in preview, this behavior is expected to improve over time.

Master dbt for Scalable Data Platforms

Learn how to design, build, and operate production-ready dbt projects with proven best practices.

Join the dbt Training

State-based orchestration: only rebuild what changed

State-based orchestration is where dbt Fusion offers some of its most compelling cost savings. The idea is simple: instead of rebuilding every model on every run, Fusion tracks the state of each model in the database and only rebuilds the ones that have changed — whether due to code updates or upstream data changes.

To enable this, navigate to the Orchestration settings for your production environment and toggle on the Fusion cost optimization features, which include both state orchestration and efficient testing.

The demo makes the behavior concrete. Metropolis drops all tables and views from the production schema, then triggers a job. Every model is rebuilt from scratch because Fusion detects that nothing exists yet. On the second run, with no changes made, every model is marked as reused. The logs confirm this clearly: no new changes on any upstream model.

Then things get interesting. Metropolis inserts one row into the customer source table and one row into the line item source table. The expected behavior — that only the models downstream of those sources would be refreshed — is exactly what happens. The staging customer model and the dimension customer model are rebuilt. The staging line item model and the fact orders model are rebuilt. Everything else is reused. Fusion is detecting data changes at the row level, without any update timestamps configured. The orchestration works automatically out of the box.
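The rebuild-or-reuse decision in this run can be sketched as a walk over the project's lineage. The Python below is a simplified model of the observed behavior, not Fusion's implementation. The model names and the src_ prefix are hypothetical, and the lineage is partly assumed: the demo confirms that fact orders sits downstream of line item, while its dependency on staging orders is a guess.

```python
# Assumed lineage for the demo project, listed in dependency order
# (every model appears after its parents). Names are hypothetical.
PARENTS = {
    "stg_customer": ["src_customer"],
    "stg_line_item": ["src_line_item"],
    "stg_orders": ["src_orders"],
    "stg_part": ["src_part"],
    "dim_customer": ["stg_customer"],
    "dim_part": ["stg_part"],
    "fact_orders": ["stg_orders", "stg_line_item"],  # stg_orders is an assumption
}


def plan(changed_sources):
    """Mark a model 'rebuilt' if any of its ancestors changed, else 'reused'."""
    stale = set(changed_sources)
    decisions = {}
    # One pass is enough because PARENTS is in dependency order.
    for model, parents in PARENTS.items():
        if any(parent in stale for parent in parents):
            stale.add(model)  # staleness propagates downstream
            decisions[model] = "rebuilt"
        else:
            decisions[model] = "reused"
    return decisions


# One new row in customer and one in line item, as in the demo.
for model, decision in plan({"src_customer", "src_line_item"}).items():
    print(f"{model}: {decision}")
```

With those two sources marked as changed, the sketch rebuilds the staging customer, dimension customer, staging line item, and fact orders models and reuses the other three, which matches the job results described above. How Fusion actually detects that a source has new rows is not covered in the demo and is not modeled here.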

One particularly useful aspect of Fusion’s state orchestration is that state is shared across jobs within the same environment. The demo includes a second production job configured to refresh fact orders and its upstream dependencies. When this job runs after the first, it finds all models already built and up to date — so everything is reused. Teams running multiple jobs against the same environment avoid paying twice for the same compute.

Efficient testing: stop running the same tests twice

The third capability demonstrated is efficient testing. When using the build command in a Fusion-enabled job, dbt Fusion tracks which tests have already run and reuses their results for downstream models within the same job execution — rather than re-running identical tests multiple times.

In the demo, after switching the job command from run to build, the results show tests on the sources executing as expected. But the equivalent tests defined on the staging and mart models — tests that reference the same underlying data — show their results as reused. This avoids redundant warehouse queries and can meaningfully reduce both execution time and compute cost on larger projects.

The current limitation here is scope: as of the time of recording, test result reuse only happens within the context of a single job run. Results are not carried over to subsequent runs or shared across different jobs. This may change in future versions of Fusion.
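A result cache scoped to one job run captures both the reuse and its current limit. The Python sketch below is only an illustration: the JobRun class, the cache key, and the data_fingerprint argument are assumptions, since the demo does not describe how Fusion decides that two tests cover the same underlying data.

```python
class JobRun:
    """One job execution. The result cache lives only as long as this object."""

    def __init__(self):
        self._results = {}
        self.queries_sent = 0

    def run_test(self, test, column, data_fingerprint, execute):
        """Run a test once per (test, column, data) combination within this run."""
        key = (test, column, data_fingerprint)  # hypothetical equivalence key
        if key in self._results:
            return self._results[key], "reused"
        self.queries_sent += 1  # only cache misses reach the warehouse
        self._results[key] = execute()
        return self._results[key], "executed"


run = JobRun()
# The same not-null check on the same data, defined on a source and on its staging model.
print(run.run_test("not_null", "customer_key", "fp-1", lambda: True))  # (True, 'executed')
print(run.run_test("not_null", "customer_key", "fp-1", lambda: True))  # (True, 'reused')
print(run.queries_sent)  # 1

# A new job run starts with an empty cache, so the test executes again.
print(JobRun().run_test("not_null", "customer_key", "fp-1", lambda: True))  # (True, 'executed')
```

Because the cache is an attribute of the run rather than of the environment, nothing carries over to the next run or to another job, which mirrors the scope limitation described above.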

What this means for your dbt workflows

Taken together, these three capabilities address real pain points that dbt teams encounter as their projects scale. IDE-level error validation shortens the feedback loop between writing SQL and knowing it works, without requiring a round-trip to the warehouse. State-based orchestration dramatically reduces unnecessary compute by treating rebuilds as the exception rather than the rule. And efficient testing ensures that the tests you have invested time in writing do not become a bottleneck in CI/CD by running redundantly.

dbt Fusion is still in preview, and some capabilities are still being refined. But based on this demo, the direction is clear: Fusion is designed to make the full dbt development and deployment cycle faster, cheaper, and more intelligent.

Future videos in this series will cover more advanced configuration options for state orchestration, as well as additional Fusion features as they become available. Make sure you are subscribed so you do not miss them.

