> For the complete documentation index, see [llms.txt](https://v2.dataos.info/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://v2.dataos.info/concepts/resources/nilus/batch/batch-sources/clickhouse.md).

# ClickHouse

[ClickHouse](https://clickhouse.com/docs) is a column-oriented SQL database designed for analytical workloads. Nilus reads from ClickHouse as a batch source through the standard SQL extraction path: it speaks the native ClickHouse protocol and either reads a single table or executes a custom SQL statement.

Nilus rewrites the user-facing `clickhouse://` URI internally before connecting. It defaults `secure=1` (TLS) when the parameter is omitted, injects a CA bundle (`ca_certs=certifi.where()`) when none is supplied, and strips destination-only parameters such as `http_port`. The actual connection runs over `clickhouse+native`.

## Requirements

Connectivity and credentials must both be in place before the pipeline can run.

### Connectivity

* The Nilus runtime must reach the ClickHouse server on the native TCP port (typically `9440` for TLS, `9000` for plaintext).
* The ClickHouse user needs `SELECT` privileges on the target database and on every object referenced in a custom query.
* Pass credentials through Nilus secrets/projections rather than embedding them in the YAML manifest.

### Required parameters

| Parameter  | Required | Default           | Description                                                                                             |
| ---------- | -------- | ----------------- | ------------------------------------------------------------------------------------------------------- |
| `username` | Yes      | -                 | ClickHouse user.                                                                                        |
| `password` | Yes      | -                 | Password for the ClickHouse user.                                                                       |
| `host`     | Yes      | -                 | ClickHouse server hostname.                                                                             |
| `port`     | Yes      | -                 | Native ClickHouse TCP port. Use `9440` for TLS, `9000` for plaintext.                                   |
| `secure`   | No       | `1`               | `1` enables TLS, `0` disables it. Nilus injects `secure=1` automatically when the parameter is omitted. |
| `ca_certs` | No       | `certifi.where()` | Path to a CA bundle. Nilus injects the `certifi` bundle automatically when the parameter is omitted.    |

> **Note** `http_port` is a destination-only option. Nilus removes it from the source URI before connecting; do not rely on it on the source side.

### URI format

```
clickhouse://<username>:<password>@<host>:<port>?secure=<0|1>
```

## Source options

| Option            | Required | Description                                                                                                                     |
| ----------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `source_table`    | Yes      | `<database>.<table>` for a standard table read, or `query:<SQL>` for a custom SQL extraction.                                   |
| `incremental_key` | No       | Column used as the incremental cursor. Set this only when the column is monotonically increasing.                               |
| `interval_start`  | No       | Lower bound passed to the incremental cursor on the first run. Subsequent runs ignore this and resume from the last seen value. |
| `interval_end`    | No       | Upper bound passed to the incremental cursor.                                                                                   |

## Sink options

| Option                 | Required | Description                                                                                                                                                                                                                                             |
| ---------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `dest_table`           | Yes      | Destination table for the extracted dataset. Use one stable target per source dataset so re-runs merge cleanly.                                                                                                                                         |
| `incremental_strategy` | Yes      | Destination write behavior. Common choices are `replace` for full snapshots and `append` for monotonic incremental loads. The exact set of supported strategies is destination-specific, see [Destinations](/concepts/resources/nilus/destinations.md). |

## Sample Nilus config

Each example below is self-contained and uses the current Nilus pipeline shape.

### Standard table read

```yaml
name: clickhouse-orders-batch
version: v1alpha
type: nilus
spec:
  type: batch
  compute: universe-compute
  source:
    address: clickhouse://{CLICKHOUSE_USERNAME}:{CLICKHOUSE_PASSWORD}@clickhouse.example.com:9440?secure=1
    options:
      source_table: analytics.orders
  sink:
    address: dataos://warehouse?purpose=rw
    options:
      dest_table: analytics.clickhouse_orders
      incremental_strategy: replace
```

### Custom SQL extraction

```yaml
spec:
  type: batch
  compute: universe-compute
  source:
    address: clickhouse://{CLICKHOUSE_USERNAME}:{CLICKHOUSE_PASSWORD}@clickhouse.example.com:9440?secure=1
    options:
      source_table: "query:SELECT order_id, customer_id, total, created_at FROM analytics.orders WHERE created_at >= today() - INTERVAL 7 DAY"
  sink:
    address: dataos://warehouse?purpose=rw
    options:
      dest_table: analytics.recent_orders
      incremental_strategy: replace
```

## Behavior and capabilities

* **Pipeline mode**: ClickHouse runs as a `batch` source.
* **Connection model**: Nilus connects through the native ClickHouse protocol after normalizing the user-facing `clickhouse://` URI. TLS is enabled by default unless `secure=0` is set explicitly.
* **Object model**: standard reads use fully qualified `<database>.<table>` names. Custom SQL reads use `source_table: "query:<SQL>"`.
* **Incremental extraction**: set `incremental_key` only when the source column is monotonic and safe to use as a cursor. `interval_start` can seed the first run; later runs resume from the stored cursor.
* **Destination strategy**: use `replace` for full snapshots and `append` or `merge` only when the destination table has a clear incremental or deduplication strategy.

## Troubleshooting

| Symptom                                                               | Likely cause                                                                               | Resolution                                                                                                                                                            |
| --------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Connection refused` on port 9440                                     | Server is listening on the plaintext port (`9000`) only.                                   | Either use port `9000` with `secure=0`, or enable TLS on the server and reconnect on `9440` with `secure=1`.                                                          |
| TLS handshake / certificate verification error                        | Server certificate is signed by an internal CA that is not in the bundled `certifi` store. | Pass `ca_certs=<absolute-path-to-ca-bundle>` in the URI query so the rewriter does not fall back to `certifi.where()`.                                                |
| `Code: 60. Table … doesn't exist`                                     | `source_table` was supplied as a bare name without the database prefix.                    | Use the fully qualified `<database>.<table>` form.                                                                                                                    |
| Custom-query extraction returns 0 rows                                | The `query:` prefix is missing or the SQL is invalid.                                      | Prefix the statement with `query:` exactly, then validate the SQL directly against the cluster (for example with `clickhouse-client`) before re-running the pipeline. |
| Pipeline silently switches to TLS even though the server is plaintext | Nilus defaults `secure=1` when the parameter is missing.                                   | Set `secure=0` explicitly in the URI for plaintext connections.                                                                                                       |

## Related docs

* [Custom query (SQL sources)](/concepts/resources/nilus/batch/custom-sources/custom-query-sql-sources.md): patterns for `query:<SQL>` extraction across all SQL sources.
* [Understanding Batch Pipeline Config](/concepts/resources/nilus/batch/pipeline-config.md)
* [Batch sample configs](/concepts/resources/nilus/batch/sample-configs.md)


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://v2.dataos.info/concepts/resources/nilus/batch/batch-sources/clickhouse.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
