The dbt tests I write first for business-critical models

I start here only after two earlier questions are settled: the source contract exists, and the source freshness check is green.

Then the question narrows: which model-level dbt tests would change my first response if the model started lying?

Freshness is a separate source check, not one of dbt’s four built-in generic data tests.

Once that gate is green, I want the smallest model-level test set that blocks the failures most likely to break joins, counts, statuses, or planner-facing quantities.

On a business-critical model, the first ladder should catch duplicate rows, broken parent joins, invalid states, and one model-specific rule before the dashboard conversation starts.

Problem

Imagine fct_purchase_order_lines feeds an operations dashboard that planners use to chase late supplier deliveries. The source lands on time. The model still builds. The dashboard still renders.

A source-system fix quietly changes three things at once.

A retry path duplicates some purchase_order_line_id values, some rows keep a purchase_order_id missing from the header model, and one status mapping starts writing reopened.

None of those changes breaks the DAG, yet each one creates a business problem.

The failure mode I see is treating tests like a generic checklist instead of ordering them around the next decision.

On a business-critical model, the first tests should tell me quickly whether counts, joins, and decision states are still safe.

Default approach

Start after the source freshness check is green.
Add the uniqueness check first, because it enforces the declared grain. If the grain is composite, expose a stable surrogate key and test that, or write a singular data test against the full key.
Add not_null only on fields that would break a real decision path if they disappeared, such as the parent key, quantity, effective date, or business status.
Add relationships where an orphaned record would create a business-facing mismatch between the model and the parent entity.
Add one accepted_values check or one custom singular data test for the business-state or range rule most likely to drift without breaking the SQL.
Add one custom business-rule test for the highest-risk scenario the built-ins still miss, then stop before the suite turns into noise.
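
For the composite-grain case in the uniqueness step, a singular data test against the full key could look like the sketch below. The file name and the purchase_order_line_number column are assumptions for illustration; substitute whatever columns make up the declared grain.

```sql
-- tests/assert_po_line_grain_is_unique.sql  (hypothetical file name)
-- Singular data test: dbt marks the test as failed if this query returns rows.
-- Assumes the declared grain is (purchase_order_id, purchase_order_line_number);
-- purchase_order_line_number is an illustrative column name.
select
    purchase_order_id,
    purchase_order_line_number,
    count(*) as row_count
from {{ ref('fct_purchase_order_lines') }}
group by
    purchase_order_id,
    purchase_order_line_number
having count(*) > 1
```

Each returned row is one duplicated key with its count, which points straight at the rows to inspect.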

Example

This is the compact test ladder I would want on a planner-facing purchase-order-line model after the source freshness gate is already green:

models:
  - name: fct_purchase_order_lines
    columns:
      - name: po_line_grain_key
        data_tests:
          - unique
          - not_null

      - name: purchase_order_id
        data_tests:
          - not_null
          - relationships:
              arguments:
                to: ref('fct_purchase_orders')
                field: purchase_order_id

      - name: line_status
        data_tests:
          - not_null
          - accepted_values:
              arguments:
                values: ['open', 'partial', 'closed', 'cancelled']

      - name: open_quantity
        data_tests:
          - not_null

That order is deliberate.

unique on po_line_grain_key goes first because one duplicated line can inflate open quantity, fan out downstream joins, and make planners think more material is still outstanding than it is.
not_null on purchase_order_id, line_status, and open_quantity comes next because those fields decide whether the row can be joined, interpreted, or acted on.
relationships on purchase_order_id earns a slot because orphaned lines create a mismatch between the line model and the header view the business also reads.
relationships excludes NULL values by design, so I only trust it after I have decided whether nulls should fail separately.
accepted_values on line_status comes before a softer shape check because one invalid state can drive the wrong operational response even when the row count still looks normal.
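
To make the NULL behavior concrete, the relationships test is roughly equivalent to the anti-join sketched below. This is an approximation written for illustration, not dbt's exact generated SQL.

```sql
-- Rough equivalent of relationships on purchase_order_id (illustrative only).
-- Returned rows are child keys with no matching parent.
select child.purchase_order_id
from {{ ref('fct_purchase_order_lines') }} as child
left join {{ ref('fct_purchase_orders') }} as parent
    on child.purchase_order_id = parent.purchase_order_id
where child.purchase_order_id is not null   -- NULL child keys are skipped, never flagged
  and parent.purchase_order_id is null      -- no parent row matched
```

Because of the first filter, a line with a NULL purchase_order_id passes this check, which is why the ladder pairs it with not_null.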

Then I add one model-specific rule the built-ins will not catch:

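-- tests/open_quantity_never_negative_for_active_lines.sql
-- Singular data test: dbt fails the test if this query returns any rows.
select *
from {{ ref('fct_purchase_order_lines') }}
where line_status != 'cancelled'  -- cancelled lines are exempt from the rule
  and open_quantity < 0           -- open quantity should never drop below zero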

I keep this first set small on purpose.

If unique fails, I inspect retries, merge logic, or a bad intermediate join. If relationships fails, I inspect parent load timing or the ref boundary. If accepted_values fails, I inspect the latest status-mapping change.

Each early test should narrow the first investigation step.
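
As one example of a narrowing query, when accepted_values fails I might run something like the following to see which unexpected statuses appeared and how many rows each one affects. It is a sketch for ad hoc triage (for example as a compiled analysis), not part of the test suite.

```sql
-- Ad hoc triage query (illustrative): unexpected line_status values and their row counts.
select
    line_status,
    count(*) as row_count
from {{ ref('fct_purchase_order_lines') }}
where line_status not in ('open', 'partial', 'closed', 'cancelled')
group by line_status
order by row_count desc
```

A single new value such as reopened with a sudden row count points at the status mapping rather than at the load.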

Tradeoffs

Breaks when the model has no declared grain or stable key yet → Mitigation: go back to the explicit grain note first, then let the first unique test enforce that row meaning.
Breaks when the real grain is composite and the suite checks one convenient column → Mitigation: expose a stable surrogate key or add a model-level assertion that matches the declared grain.
Breaks when teams copy the same null and relationships tests onto every field → Mitigation: keep only the tests that would change the first response on this model.
Breaks when relationships sits on optional or noisy foreign keys and creates alert churn → Mitigation: reserve it for joins where orphaned records create a real business mismatch, and pair it with not_null only when nulls should fail.
Breaks when the built-ins all pass but the model still violates a business rule → Mitigation: add one custom test for the highest-risk scenario, such as negative open quantity or an impossible state transition.
Breaks when the suite grows into dozens of low-value checks because the model is important → Mitigation: rank tests by failure cost and response path, then add depth only where the business risk justifies it.

Close

Next step: Pick one business-critical model, confirm the source freshness gate is green, and write the first four or five dbt tests that would change your first response if the model started lying tomorrow morning.

I’d like to compare notes on any business-critical model where the test suite keeps growing but the first investigation step still isn’t clear.
