Row counts are not enough: the checks I add before I trust a pipeline

Row counts are a smoke test, not a pass. I have seen models land at the usual volume while a join key turns null, duplicates creep in, or one business category disappears without tripping the row-count alert.

Before I trust a published model, I want five checks inside the publish path: freshness, row counts, uniqueness, null rate, and one shape check tied to business risk.

These five checks live inside the publish path, so they are a different job from the operating view I watch when a run is already late or stale. If these five checks pass and a number still moves where the business can see it, I switch to When a dashboard number changes, I check these four things first.

If a check cannot change a response, it usually does not earn a slot.

Problem

Imagine fct_shipments still lands about 1.2 million rows after an upstream schema change. At first glance the pipeline looks healthy.

But the source team renamed a warehouse mapping field, the transform still runs, and 22% of rows now carry a null warehouse_id. The row count did its job; it just did not protect the downstream metric.

That kind of miss shows up later as a regional fill-rate problem, a missing warehouse view, or a dashboard nobody trusts. I would rather catch it in the pipeline than explain it in a meeting.

Default approach

Check freshness first: did the extract land on time, did the model finish when I expected, and is the latest successful publish recent enough to trust?
Check row counts next to rule out a big volume shift. If the count looks normal, keep going. Stable volume is not a pass when keys can go null, duplicates can creep in, or one category can quietly disappear.
Check uniqueness on the keys that drive downstream joins, mappings, or counts.
Check null rate on fields that would break joins, mappings, or business logic if they go missing.
Check one accepted-values or distribution rule on a high-risk column so I catch shape changes, not just missing rows.

After those checks, each one needs an owner and a first response step.

Example

Here is the kind of failure I want the pipeline to catch before a dashboard does:

Field	Value
Model	fct_shipments

Record	row count	shipment_id	warehouse_id null rate	status values	regional split
Expected daily shape:	~1.2M	unique	< 0.5%	shipped, cancelled, corrected	east 34%, central 29%, west 37%
After upstream schema change:	1.19M	still unique	22.4%	unchanged	west volume collapses because null warehouse_id cannot map to region

I keep the first check query simple on purpose:

select
  count(*) as row_count,
  count(distinct shipment_id) as distinct_shipment_id,
  round(100.0 * avg(case when warehouse_id is null then 1 else 0 end), 2) as warehouse_id_null_pct
from fct_shipments;

On very large tables, I run the distinct check on the newest partition or a rolling window, then schedule a deeper scan less often.

Then I add one quick shape check next to it:

select
  region,
  count(*) as shipments
from fct_shipments
group by 1
order by 2 desc;

If the row count is stable but one region disappears, I do not need a long debate about whether the pipeline is healthy. I already know which mapping layer to inspect.

If these checks pass and two dashboards still disagree, I stop debugging the pipeline and move to definition ownership. The definition card I use to stop metric drift across dashboards is the handoff I make next.

I want a small number of business-aware checks, not a long catalog of generic ones. A null-rate check on warehouse_id earns its place. Ten low-risk checks on columns nobody uses usually do not earn a slot.

Tradeoffs

Breaks when: every column gets the same battery of checks → Mitigation: start with fields tied to joins, filters, revenue, inventory, or service-level reporting.
Breaks when: thresholds are copied from another model with different behavior → Mitigation: set baselines from recent history and review them when the source changes.
Breaks when: alerts fire but nobody knows the first investigation step → Mitigation: attach each check to an owner and a short runbook question such as “freshness, join, or mapping?”
Breaks when: teams assume schema tests cover everything important → Mitigation: add one or two data-shape checks that reflect real business failure modes, not just table structure.
Breaks when: full-table uniqueness checks are too expensive to run on every build → Mitigation: check uniqueness on the newest partition or a rolling window, and run a deeper scan on a slower cadence.

Close

Next step: Pick one business-critical model and add one freshness check, one volume check, one uniqueness check, one null-rate check, and one business-shape check before the next source change lands.

If a row-count check already passes while the business still distrusts the output, use one failed shape check from that model to decide which test deserves ownership first.

Row counts are not enough: the checks I add before I trust a pipeline

Problem

Default approach

Example

Tradeoffs

Close

Continue reading

When a dashboard number changes, I check these four things first

The definition card I use to stop metric drift across dashboards