> For the complete documentation index, see [llms.txt](https://v2.dataos.info/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://v2.dataos.info/build/stage-2-productize/define-the-contract/assertions.md).

# Assertions

Assertions are blocking data quality assertions that run automatically every time a model materializes. If an assertion finds bad data (the assertion query returns rows), Vulcan stops everything; no bad data reaches production.

***

## How assertions work

An assertion is a SQL query that should return **zero rows**. If it returns any rows, those are rows that failed the check, and the assertion fails.

The execution sequence for every model run:

1. Vulcan evaluates the model SQL and inserts data.
2. Vulcan runs each attached assertion query against the newly inserted data.
3. If any assertion returns rows, execution stops and the error is reported.
4. If all assertions return zero rows, the model is promoted.

***

## Assertions vs assertions

Two terms come up frequently:

* **ASSERTION**: the validation rule itself (the SQL query).
* **ASSERTION**: attaching an assertion to a model (declaring the model must pass that rule).

You may see older code using `audits` in the `MODEL` block. Both `audits` and `assertions` work identically, but `assertions` is the preferred term in current Vulcan versions.

***

## Inline assertions (recommended)

The simplest way to add assertions is through built-in functions in the `MODEL` block's `assertions` property:

```sql
MODEL (
  name analytics.daily_sales,
  kind FULL,
  grains (order_date),
  assertions (
    not_null(columns := (order_date, total_revenue)),
    unique_values(columns := (order_date)),
    accepted_range(column := total_revenue, min_v := 0)
  )
);
```

These built-in functions are the fastest way to add standard checks. No SQL file is needed.

***

## Built-in assertion functions

### NULL and row count

| Function                                               | Description                                               |
| ------------------------------------------------------ | --------------------------------------------------------- |
| `not_null(columns := (col1, col2))`                    | Ensures none of the specified columns have NULL values.   |
| `at_least_one(column := col)`                          | Ensures at least one non-NULL value exists in the column. |
| `not_null_proportion(column := col, threshold := 0.8)` | Ensures at least 80% of values are non-NULL.              |
| `number_of_rows(threshold := 10)`                      | Ensures the model has more than N rows.                   |

### Uniqueness

| Function                                                 | Description                                               |
| -------------------------------------------------------- | --------------------------------------------------------- |
| `unique_values(columns := (col1, col2))`                 | Ensures each column individually has no duplicate values. |
| `unique_combination_of_columns(columns := (col1, col2))` | Ensures the combination of columns is unique across rows. |

### Value validation

| Function                                                  | Description                                                 |
| --------------------------------------------------------- | ----------------------------------------------------------- |
| `accepted_values(column := col, is_in := ('A', 'B'))`     | Ensures values are within an allowed set.                   |
| `not_accepted_values(column := col, is_in := ('X', 'Y'))` | Ensures values are NOT in a disallowed set.                 |
| `accepted_range(column := col, min_v := 0, max_v := 100)` | Ensures values are within a numeric range.                  |
| `not_constant(column := col)`                             | Ensures a column has at least two distinct non-NULL values. |
| `forall(criteria := (price > 0, LENGTH(name) > 0))`       | Flexible: write any boolean SQL expression.                 |
| `sequential_values(column := col, interval := 1)`         | Ensures values are sequential (no gaps).                    |

### String format

| Function                                                        | Description                                      |
| --------------------------------------------------------------- | ------------------------------------------------ |
| `not_empty_string(column := col)`                               | Ensures no values are empty strings.             |
| `string_length_equal(column := col, v := 5)`                    | Ensures all string values have the exact length. |
| `string_length_between(column := col, min_v := 5, max_v := 50)` | Ensures string lengths are within a range.       |
| `valid_uuid(column := col)`                                     | Validates UUID format.                           |
| `valid_email(column := col)`                                    | Validates email format.                          |
| `valid_url(column := col)`                                      | Validates URL format.                            |

### Statistical

| Function                                                 | Description                                                            |
| -------------------------------------------------------- | ---------------------------------------------------------------------- |
| `mean_in_range(column := col, min_v := 21, max_v := 50)` | Ensures column average is within a range.                              |
| `z_score(column := col, threshold := 3)`                 | Flags statistical outliers (values > N standard deviations from mean). |

***

## Custom user-defined assertions

Write custom assertion SQL files in the `audits/` directory for checks that built-in functions cannot express. These are SQL queries that should return zero rows. Any row returned is a failing row.

### Example 1: Checking metric consistency

`audits/daily_sales_metric_consistency.sql` from `orders-analytics` checks that the derived metrics in `silver.fct_daily_sales` are internally consistent. It flags rows where `total_orders` is zero or negative, any metric is negative, the shipment rate is out of range, or the computed average order value does not match the stored one:

```sql
AUDIT (
  name daily_sales_metric_consistency,
  blocking true
);

SELECT
  order_date,
  region_id,
  customer_id,
  product_id,
  total_orders,
  total_revenue,
  avg_order_value,
  shipment_rate,
  'Daily sales derived metrics are inconsistent' AS issue_type
FROM @this_model
WHERE total_orders <= 0
   OR total_items_sold < 0
   OR total_revenue < 0
   OR total_shipments < 0
   OR shipment_rate < 0
   OR shipment_rate > 1
   OR ROUND(total_revenue / NULLIF(total_orders, 0), 2) <> avg_order_value;
```

This assertion is attached to the model via the `assertions` block:

```sql
MODEL (
  name silver.fct_daily_sales,
  ...
  assertions (
    not_null(columns := (order_date, region_id, ...)),
    forall(criteria := (total_orders >= 0, ...)),
    daily_sales_metric_consistency()
  )
);
```

### Example 2: Validating RFM scores

`audits/rfm_score_consistency.sql` checks that every customer in `gold.rfm_customer_segmentation` has RFM component scores within the valid 1-5 range, that the concatenated `rfm_score` string matches the components, and that the `rfm_segment` label is one of the nine allowed values:

```sql
AUDIT (
  name rfm_score_consistency,
  blocking true
);

SELECT
  customer_id,
  recency_score,
  frequency_score,
  monetary_score,
  rfm_score,
  rfm_segment,
  'RFM score or segment is inconsistent' AS issue_type
FROM @this_model
WHERE recency_score NOT BETWEEN 1 AND 5
   OR frequency_score NOT BETWEEN 1 AND 5
   OR monetary_score NOT BETWEEN 1 AND 5
   OR rfm_score <> recency_score::TEXT || frequency_score::TEXT || monetary_score::TEXT
   OR rfm_segment NOT IN (
     'Champions', 'Loyal Customers', 'Potential Loyalists', 'Recent Customers',
     'Promising', 'Customers Needing Attention', 'At Risk', 'Hibernating', 'Lost'
   );
```

### Example 3: Referential integrity via a seed

`audits/order_status_lookup_consistency.sql` checks that every order status in `bronze.orders` exists in the `bronze.order_status_lookup` seed. This blocks new or unexpected status values from making it past the bronze layer:

```sql
AUDIT (
  name order_status_lookup_consistency,
  blocking true
);

SELECT
  orders.order_id,
  orders.order_status,
  'Order status is missing from bronze.order_status_lookup' AS issue_type
FROM @this_model AS orders
LEFT JOIN bronze.order_status_lookup AS lookup
  ON orders.order_status = lookup.order_status
WHERE lookup.order_status IS NULL;
```

`@this_model` refers to the model being audited (`bronze.orders` in this case). Attaching it to the model:

```sql
MODEL (
  name bronze.orders,
  ...
  assertions (
    unique_values(columns := (order_id)),
    not_null(columns := (order_id, customer_id, order_date, order_status)),
    accepted_values(column := order_status, is_in := ('Confirmed', 'Shipped', 'Cancelled')),
    order_status_lookup_consistency()
  )
);
```

`@this_model` refers to the model being audited. For incremental models, Vulcan adds an automatic time range filter so the assertion only checks the processed interval, not the entire table.

***

## Additional assertion patterns

Vulcan supports parameterized (generic) assertions for reuse across multiple models, global assertions applied to all models via `modelDefaults` in `config.yaml`, and inline assertions defined in the same file as the model. `orders-analytics` does not use these patterns. For the full reference, see[ Assertions](/concepts/resources/vulcan/components/assertions.md) in the Vulcan book.

***

## plan vs run behavior

| Command           | Assertion failure behavior                                                                                                                          |
| ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
| `vulcan plan dev` | Assertion failure leaves production untouched. Bad data stays in the isolated dev table.                                                            |
| `vulcan run prod` | Assertion failure stops execution, but the invalid data is already in the production table. Downstream models will not be built using the bad data. |

Use `vulcan plan` for production changes. It is the safer option.

***

## Running assertions manually

```bash
vulcan audit --start 2024-01-01 --end 2024-01-31
```

Use `--verbose` to see the exact query and failing rows:

```bash
vulcan audit --start 2024-01-01 --end 2024-01-31 --verbose
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://v2.dataos.info/build/stage-2-productize/define-the-contract/assertions.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
