> For the complete documentation index, see [llms.txt](https://v2.dataos.info/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://v2.dataos.info/concepts/resources/nilus/batch/custom-sources/custom-query-sql-sources.md).

# Custom Query (SQL)

Custom Query sources let a batch pipeline read from the result of a SQL query instead of reading one table directly. Use this when the source is SQL-compatible but the required dataset needs filtering, joins, computed columns, or a governance-friendly query surface that cannot be modeled as a simple `source_table`.

Custom queries are an advanced batch-source path. Prefer a first-party table connector or a database view when that gives the same result with simpler operational behavior.

## Requirements

Connectivity and credentials must both be in place before the pipeline can run.

### Connectivity

* The Nilus runtime must reach the SQL source through a supported SQL connector or DataOS depot.
* The source role/user must be allowed to run the query and read every referenced table or view.
* The query should be tested directly against the source database before it is added to a Nilus manifest.

### Query requirements

| Requirement           | Description                                                                                                         |
| --------------------- | ------------------------------------------------------------------------------------------------------------------- |
| `query:` prefix       | `source.options.source_table` must start with `query:` so Nilus treats the value as SQL rather than a table name.   |
| Stable column names   | The query result should return stable, explicitly named columns. Avoid ambiguous duplicate names from joins.        |
| Incremental column    | For incremental loads, the query must return the column named in `incremental_key`.                                 |
| Source-side filtering | Nilus does not rewrite or optimize the SQL. Put any filters, joins, and incremental predicates in the query itself. |

## Source options

| Option            | Required | Description                                                                                       |
| ----------------- | -------- | ------------------------------------------------------------------------------------------------- |
| `source_table`    | Yes      | SQL string prefixed with `query:`.                                                                |
| `incremental_key` | No       | Timestamp or numeric column returned by the query and used to identify newly visible rows.        |
| `interval_start`  | No       | Optional lower bound passed into the extraction window.                                           |
| `interval_end`    | No       | Optional upper bound passed into the extraction window.                                           |
| `sql_limit`       | No       | Caps total rows extracted per run. Useful for sampling and validation.                            |
| `type_hints`      | No       | Object map of `column_name: <type>` to override inferred types when query inference is ambiguous. |

## Sample Nilus configs

Each example below is self-contained and uses the current Nilus pipeline shape.

### Append from a join query

```yaml
name: custom-query-orders
version: v1alpha
type: nilus
spec:
  type: batch
  compute: runnable-default
  source:
    address: dataos://postgres-source?purpose=rw
    options:
      source_table: >
        query:select
          oi.id,
          oi.order_id,
          oi.sku,
          oi.quantity,
          o.updated_at
        from order_items oi
        join orders o
          on oi.order_id = o.id
  sink:
    address: dataos://analytics-lakehouse?purpose=rw
    options:
      dest_table: sales.order_items_enriched
      incremental_strategy: append
```

### Incremental custom query

```yaml
name: custom-query-orders-incremental
version: v1alpha
type: nilus
spec:
  type: batch
  compute: runnable-default
  source:
    address: dataos://postgres-source?purpose=rw
    options:
      source_table: >
        query:select
          oi.id,
          oi.order_id,
          oi.sku,
          oi.quantity,
          o.updated_at
        from order_items oi
        join orders o
          on oi.order_id = o.id
        where o.updated_at > :interval_start
      incremental_key: updated_at
      primary_key: id
  sink:
    address: dataos://analytics-lakehouse?purpose=rw
    options:
      dest_table: sales.order_items_enriched
      incremental_strategy: merge
```

## Behavior and capabilities

* **Pipeline mode**: custom queries run as `spec.type: batch` pipelines.
* **Query execution**: Nilus sends the SQL to the source as authored. Database-specific SQL syntax applies.
* **Schema inference**: Nilus infers the output schema from the query result.
* **Incrementality**: incremental state only works when the query returns `incremental_key` and filters with the current interval.
* **CDC support**: custom queries are not supported for CDC sources.

## Troubleshooting

| Symptom                              | Likely cause                                                                                             | Resolution                                                                                      |
| ------------------------------------ | -------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| Query fails before extraction        | SQL syntax is invalid for the source database, or referenced objects are not visible to the source role. | Test the query directly in the database using the same role/user.                               |
| Incremental run reprocesses old rows | The query does not filter with `:interval_start`, or the returned `incremental_key` is not stable.       | Return the incremental column and add an explicit interval predicate to the query.              |
| Schema changes unexpectedly          | The query uses `SELECT *`, ambiguous aliases, or source tables changed shape.                            | Select explicit columns and alias computed expressions.                                         |
| Extraction is slow                   | Query plan is expensive or lacks indexes on join/filter columns.                                         | Review the source database query plan and index the incremental/filter columns.                 |
| CDC manifest rejects the query       | Custom queries are batch-only.                                                                           | Use a batch pipeline for custom queries, or use a CDC source page for log-based change capture. |

## Best practices

* Prefer database views when the query is stable and can be governed upstream.
* Keep query logic deterministic; avoid non-deterministic functions such as `NOW()` or `RANDOM()` in incremental predicates.
* Explicitly alias computed columns.
* Validate row counts and schema in a non-production sink before scheduling production runs.
* Keep queries narrow and bounded. Large cross-source joins should be modeled upstream or in a transformation layer.

## Related docs

* [Custom sources](/concepts/resources/nilus/batch/custom-sources.md)
* [Batch sources](/concepts/resources/nilus/batch/batch-sources.md)
* [Understanding Batch Pipeline Config](/concepts/resources/nilus/batch/pipeline-config.md)
* [Batch sample configs](/concepts/resources/nilus/batch/sample-configs.md)


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://v2.dataos.info/concepts/resources/nilus/batch/custom-sources/custom-query-sql-sources.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
