Most dbt tutorials show you how to add a not_null or unique test. That’s great, but what happens when those tests fail?
In this post, I’ll walk through how I approached testing my dbt models, how to deal with failing tests, and how to actually make them useful instead of just noise in your terminal.
What Types of Tests Are Available in dbt?
dbt tests fall into three main types:
1. Schema Tests (built-in)
These are the ones you define in your .yml files — like not_null, unique, accepted_values, and relationships.
columns:
  - name: campaign_id
    tests:
      - not_null
      - unique
These help catch things like missing IDs, duplicate records, or invalid values.
2. Custom Data Tests (written as SQL)
These are great when you want to check for logic errors that aren’t covered by schema tests, like filtering for bad dates or checking metric ratios.
-- tests/no_future_orders.sql
SELECT *
FROM {{ ref('orders') }}
WHERE order_date > CURRENT_DATE
Example: no future dates in your orders
3. Source Freshness Tests
These check whether your source data is being updated as expected. They live in your .yml files and run with dbt source freshness. Useful for making sure your pipeline is running on schedule.
sources:
  - name: mailchimp
    tables:
      - name: campaigns
        freshness:
          warn_after: { count: 24, period: hour }
          error_after: { count: 48, period: hour }
Why Do dbt Tests Fail?
A failing test isn’t necessarily bad; it means your test is doing its job. The key is knowing what to do next.
Here's what a failing test could be telling you:
- not_null: a missing or null value in staging. Maybe that column doesn’t always exist upstream.
- unique: duplicate records, often due to joins or bad deduping logic.
- relationships: a foreign key mismatch. The referenced table might be incomplete or cleaned differently.
- accepted_values: your lookup table might be missing a value, or the source added something new.
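For reference, an accepted_values test is declared in your .yml file like the other schema tests. A sketch — the column name and status values below are illustrative, not Mailchimp’s actual schema:

```yaml
columns:
  - name: status
    tests:
      - accepted_values:
          values: ['sent', 'draft', 'scheduled']
```

When the source starts emitting a new value, this test fails until you add it to the list, which is exactly the early warning you want.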
What To Do When a Test Fails
Don't just delete the test. Take a second to figure out where in your pipeline the fix actually belongs. Here’s a better way to think through it:
Example Failure:
columns:
  - name: campaign_id
    tests:
      - not_null
But the test fails: campaign_id has nulls. So do you fix the issue in the source config? In your staging model? Or somewhere else?
Fix it in staging (the most common place for fixes) if:
- The column is present, but some records have nulls
- You need to filter out irrelevant records (ex: drafts, test campaigns)
- You need to rename, cast, or restructure the column
Example fix:
-- stg_mailchimp_campaigns.sql
SELECT
    id AS campaign_id,
    ...
FROM {{ source('mailchimp', 'campaigns') }}
WHERE id IS NOT NULL -- Exclude broken records
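Staging is also where I’d fix a failing unique test. A sketch using ROW_NUMBER() to keep one row per ID — the updated_at ordering column is an assumption about the source schema:

```sql
-- stg_mailchimp_campaigns.sql (sketch: dedupe so the `unique` test passes)
WITH ranked AS (
    SELECT
        id AS campaign_id,
        -- Keep the most recent copy of each duplicated id
        ROW_NUMBER() OVER (
            PARTITION BY id
            ORDER BY updated_at DESC -- assumed timestamp column
        ) AS row_num
    FROM {{ source('mailchimp', 'campaigns') }}
)
SELECT campaign_id
FROM ranked
WHERE row_num = 1
```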
Fix it in the source definition if:
- The nulls are coming straight from your upstream source (ex: an API or CSV file)
- You want to surface the issue early, before it even hits staging
- You own or control the source pipeline and want to improve extraction
Tip: This is where you'd add freshness checks, or document expected quirks in the data.
Example fix:
sources:
  - name: mailchimp
    tables:
      - name: campaigns
        columns:
          - name: id
            tests:
              - not_null
Remove the test if:
- The nulls are valid and expected
- You reviewed it and confirmed it’s not actually an issue
- The test was added automatically or without context
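One middle ground before deleting a test outright: dbt lets you downgrade a test’s severity so failures show up as warnings instead of errors. A sketch, using the config syntax supported in recent dbt versions:

```yaml
columns:
  - name: campaign_id
    tests:
      - not_null:
          config:
            severity: warn
```

This keeps the signal in your logs without blocking your runs while you investigate.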
Using --store-failures to See What’s Actually Breaking
One of the most useful (and underused) dbt features is:
dbt test --store-failures
Instead of just telling you something broke, it stores the actual failing rows in your database so you can inspect them.
Example (the audit schema is your target schema with a dbt_test__audit suffix, and the table is named after the test):
SELECT *
FROM your_schema_dbt_test__audit.not_null_my_model_my_column
This helped me find a handful of edge-case records that were breaking my tests, things I’d never have seen in the raw logs alone.
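If you want failures stored on every run instead of passing the flag each time, you can also turn it on in dbt_project.yml. A sketch — the project name here is illustrative:

```yaml
# dbt_project.yml
tests:
  my_project:
    +store_failures: true
```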