# Nilus

Nilus is DataOS's unified data movement framework. It connects sources to analytical destinations through a single, consistent pipeline model. Whether you are copying a relational table on a schedule, streaming low-latency database changes, consuming a Kafka topic, or cataloging a data warehouse's schema, every Nilus pipeline is authored as a `type: nilus` resource with the same basic shape.

## How it works

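A Nilus pipeline has two required pieces: a **source address** (where data comes from) and a **sink address** (where it goes). The runtime handles credentials, schema inference, state, retries, and loading. The example below is a batch pipeline that incrementally reads `public.customers` from a Postgres depot and merges the rows into `raw.customers` on `id`.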

```yaml
version: v1alpha
name: customers-batch
type: nilus
spec:
  type: batch
  compute: universe-compute
  source:
    address: dataos://postgres-source?purpose=ro
    options:
      source_table: public.customers
      incremental_key: updated_at
      primary_key: id
  sink:
    address: dataos://warehouse?purpose=rw
    options:
      dest_table: raw.customers
      incremental_strategy: merge
```

Each source or sink address takes one of two forms: a `dataos://` depot reference, whose credentials are resolved automatically, or a direct connector URI, whose secrets are projected via `spec.use.projection`. Which form applies depends on the connector, and some connectors accept only direct URIs (for example, Databricks sources and the IBM DB2 CDC source). See [Secrets and Projections](/concepts/resources/nilus/concepts/secrets-and-projections.md) for the full pattern.
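To make the structure of a depot address concrete, here is a minimal Python sketch that splits one into its parts with the standard library. The helper `describe_address` is hypothetical and is not part of Nilus; it only illustrates that the depot name sits in the authority position and that `purpose` is an ordinary query parameter.

```python
from urllib.parse import urlsplit, parse_qs


def describe_address(address: str) -> dict:
    """Split a pipeline address into its parts.

    Illustrative helper only (an assumption, not a Nilus API).
    """
    parts = urlsplit(address)
    # parse_qs returns lists of values; keep the first value of each key.
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    is_depot = parts.scheme == "dataos"
    return {
        "is_depot": is_depot,
        # Only report a name for depot references; a direct URI may embed
        # host and credential details that should not be echoed back.
        "name": parts.netloc if is_depot else None,
        "purpose": query.get("purpose"),
    }


print(describe_address("dataos://postgres-source?purpose=ro"))
# {'is_depot': True, 'name': 'postgres-source', 'purpose': 'ro'}
```

For any address that does not use the `dataos://` scheme, the sketch reports `is_depot` as `False`, which corresponds to the direct-URI form described above.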

## Pipeline modes

Nilus supports four pipeline modes. Choose the one that matches how the source exposes data.

| Mode         | When to use                                                                                                                                          | `spec.type` |
| ------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| **Batch**    | Periodic table, view, file, SaaS, or warehouse loads, for sources where scheduled or incremental cursor extraction is sufficient                     | `batch`     |
| **CDC**      | Near-real-time row-level change capture from a database log, for sources that expose a durable change log                                            | `cdc`       |
| **Stream**   | Continuous consumption from an event-streaming system (Kafka or NATS JetStream) as a long-running consumer service                                   | `stream`    |
| **Metadata** | Catalog extraction from a data source (schemas, lineage, column profiles, and usage) published into the DataOS metadata catalog without copying rows | `metadata`  |
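The table can be read as a short decision procedure. The Python sketch below encodes one plausible reading of it; the function `choose_spec_type` and its parameter names are hypothetical, and the order of the checks is an assumption rather than documented Nilus behavior.

```python
def choose_spec_type(*, catalog_only: bool = False,
                     event_stream: bool = False,
                     change_log_capture: bool = False) -> str:
    """Suggest a `spec.type` value from coarse source characteristics.

    Hypothetical helper: the flags and their precedence are assumptions
    derived from the mode table, not a Nilus API.
    """
    if catalog_only:
        # Schemas, lineage, and profiles only; no rows are copied.
        return "metadata"
    if event_stream:
        # Kafka or NATS, consumed by a long-running service.
        return "stream"
    if change_log_capture:
        # Near-real-time row-level changes from a durable database log.
        return "cdc"
    # Scheduled or incremental cursor extraction is sufficient.
    return "batch"


print(choose_spec_type(change_log_capture=True))  # cdc
print(choose_spec_type())                         # batch
```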

## Connectors

Nilus ships with connectors for 25+ sources and 10 destinations.

**Sources include:** PostgreSQL, MySQL, MS SQL Server, MongoDB, IBM DB2, Snowflake, Databricks, AWS Redshift, Google BigQuery, Delta Lake, ClickHouse, Salesforce, HubSpot, Stripe, Google Analytics, Google Sheets, Kafka, NATS, and more.

**Destinations include:** DataOS Lakehouse (AWS, Azure, GCP), Snowflake, BigQuery, Databricks, Redshift, PostgreSQL, MS SQL Server, and MongoDB. All destinations support batch, CDC, and stream writes.

→ [Browse all sources](/concepts/resources/nilus/sources.md) · [Browse all destinations](/concepts/resources/nilus/destinations.md)

## Write strategies

Every sink uses one of three `incremental_strategy` values, which control how repeated runs interact with the destination table.

| Strategy  | Behavior                                                       | Best for                                       |
| --------- | -------------------------------------------------------------- | ---------------------------------------------- |
| `replace` | Drop and re-create the destination table on every run          | Small dimension tables, snapshots, SaaS data   |
| `append`  | Add new rows; never update or delete                           | Immutable events, logs, telemetry              |
| `merge`   | Upsert on `primary_key`: update existing rows, insert new ones | Mutable entity tables, CDC current-state views |
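The difference between the three strategies is easiest to see on a tiny example. The following Python sketch models a destination table as a list of dictionaries and applies each strategy to the same incoming batch. It is a conceptual illustration only: `apply_strategy` is a hypothetical function, and the real behavior depends on the destination engine.

```python
def apply_strategy(existing, incoming, strategy, primary_key=None):
    """Return the destination rows after one run (conceptual model only)."""
    if strategy == "replace":
        # The old table is discarded; only the new batch remains.
        return list(incoming)
    if strategy == "append":
        # New rows are added; existing rows are never touched.
        return list(existing) + list(incoming)
    if strategy == "merge":
        if primary_key is None:
            raise ValueError("merge needs a primary_key")
        # Upsert: index existing rows by key, then overwrite or insert.
        merged = {row[primary_key]: row for row in existing}
        for row in incoming:
            merged[row[primary_key]] = row
        return list(merged.values())
    raise ValueError(f"unknown strategy: {strategy}")


existing = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
incoming = [{"id": 2, "name": "Robert"}, {"id": 3, "name": "Cy"}]

for strategy in ("replace", "append", "merge"):
    rows = apply_strategy(existing, incoming, strategy, primary_key="id")
    print(strategy, len(rows))
# replace 2
# append 4
# merge 3
```

With `append`, the row for `id` 2 appears twice, which is acceptable for immutable events but not for entity tables. With `merge`, the same row is updated in place and only `id` 3 is inserted.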

## Incremental extraction

For sources that grow over time, set `incremental_key` to a monotonically increasing column (typically `updated_at` or a sequence). Nilus records the last seen value after each successful run and uses it as the lower bound for the next extraction, so only new or changed rows are read.

When using the `merge` strategy, set `primary_key` alongside `incremental_key` to tell Nilus which column uniquely identifies a row.
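The cursor mechanism described above can be sketched in a few lines of Python. This is a simplified model, not the Nilus implementation: `extract_incremental` is a hypothetical function, the source is an in-memory list, and the strict `>` comparison against the stored value is an assumption (the documentation only states that the last seen value is used as the lower bound).

```python
def extract_incremental(rows, incremental_key, last_seen=None):
    """Return (rows newer than the cursor, updated cursor).

    Conceptual model only; the strict '>' comparison is an assumption.
    """
    new_rows = [
        row for row in rows
        if last_seen is None or row[incremental_key] > last_seen
    ]
    # Advance the cursor to the highest value read; keep the old value
    # when nothing new was found.
    cursor = max((row[incremental_key] for row in new_rows), default=last_seen)
    return new_rows, cursor


# First run: no stored cursor, so every row is read.
# ISO-8601 date strings are used because they sort chronologically.
source = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-02"},
]
batch, cursor = extract_incremental(source, "updated_at")
print(len(batch), cursor)  # 2 2024-01-02

# Second run: row 1 was modified and row 3 was added; row 2 is unchanged.
source = [
    {"id": 1, "updated_at": "2024-01-03"},
    {"id": 2, "updated_at": "2024-01-02"},
    {"id": 3, "updated_at": "2024-01-04"},
]
batch, cursor = extract_incremental(source, "updated_at", last_seen=cursor)
print([row["id"] for row in batch], cursor)  # [1, 3] 2024-01-04
```

The second batch contains a changed row (`id` 1) as well as a new one, which is why `merge` needs `primary_key`: without it, the destination cannot tell that the row for `id` 1 should replace an existing row rather than be added as a duplicate.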

***

## Concepts

* [Architecture and Mechanism](/concepts/resources/nilus/concepts/architecture-and-mechanism.md)
* [Understanding Batch Data Movement](/concepts/resources/nilus/batch.md)
* [Understanding Change Data Capture](/concepts/resources/nilus/cdc.md)
* [Understanding Stream Data Movement](/concepts/resources/nilus/stream.md)
* [Understanding Data Masking](/concepts/resources/nilus/concepts/understanding-data-masking.md)
* [Schema Evolution](/concepts/resources/nilus/concepts/schema-evolution.md)

## Building Pipelines

* [Quick Start Guides](/concepts/resources/nilus/quick-start-guides.md)
* [Secrets and Projections](/concepts/resources/nilus/concepts/secrets-and-projections.md)
* [Understanding Batch Pipeline Config](/concepts/resources/nilus/batch/pipeline-config.md)
* [Batch Sample Configs](/concepts/resources/nilus/batch/sample-configs.md)
* [Understanding CDC Pipeline Config](/concepts/resources/nilus/cdc/service-config.md)
* [CDC Sample Configs](/concepts/resources/nilus/cdc/sample-configs.md)
* [Understanding Stream Pipeline Config](https://github.com/moderndatacompany/dataos/blob/main/documentation/concepts/resources/nilus/stream/pipeline-config.md)
* [Stream Sample Configs](/concepts/resources/nilus/stream/sample-configs.md)

## Metadata Pipelines

* [Metadata Pipelines Overview](/concepts/resources/nilus/metadata-pipelines.md)
* [Metadata Pipeline Config](/concepts/resources/nilus/metadata-pipelines/pipeline-config.md)
* [Metadata Sample Configs](/concepts/resources/nilus/metadata-pipelines/sample-configs.md)
* [Metadata Sources](/concepts/resources/nilus/metadata-pipelines/metadata-sources.md)

## Connectors

* [Sources](/concepts/resources/nilus/sources.md)
* [Batch Sources](/concepts/resources/nilus/batch/batch-sources.md)
* [CDC Sources](/concepts/resources/nilus/cdc/cdc-sources.md)
* [Stream Sources](/concepts/resources/nilus/stream/stream-sources.md)
* [Custom Sources](/concepts/resources/nilus/batch/custom-sources.md)
* [Destinations](/concepts/resources/nilus/destinations.md)

## Operate & Observe

* [Pipeline Optimization](/concepts/resources/nilus/pipeline-optimization.md)
* [Optimizing for Time](/concepts/resources/nilus/pipeline-optimization/optimizing-for-time.md)
* [Optimizing for Resource](/concepts/resources/nilus/pipeline-optimization/optimizing-for-resource.md)
* [Optimize Sink Datasets](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets.md)
  * [Correctness Knobs](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets/optimize-correctness-knobs.md)
  * [Shape Knobs](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets/optimize-shape-knobs.md)
  * [Throughput Knobs](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets/optimize-throughput-knobs.md)
  * [Sampling Knobs](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets/optimize-sampling-knobs.md)
  * [Destination Gotchas & Troubleshooting](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets/optimize-destination-gotchas-and-troubleshooting.md)
* [Observability](/concepts/resources/nilus/observability.md)
* [Observability API Endpoints](/concepts/resources/nilus/observability/api-endpoints.md)
* [Exposed Prometheus Metrics](/concepts/resources/nilus/observability/prometheus-metrics.md)
* [Grafana Dashboards](/concepts/resources/nilus/observability/grafana-dashboard.md)
* [Troubleshooting](/concepts/resources/nilus/troubleshooting.md)
* [Checking Logs](/concepts/resources/nilus/troubleshooting/checking-logs.md)
* [Common Errors](/concepts/resources/nilus/troubleshooting/common-errors.md)
* [Snowflake Key-Pair Authentication](/concepts/resources/nilus/troubleshooting/snowflake-key-pair-authentication.md)
* [Working with PostgreSQL Partitioned Tables for CDC](/concepts/resources/nilus/troubleshooting/postgresql-cdc-partitioned-tables.md)

