
# Change data capture

Use this page when you need to move inserts, updates, and deletes from an operational database to the underlying engine without full reloads. Change data capture (CDC) reads the source database's transaction log and delivers each change to the destination in near real time.

***

## When to use CDC

Use CDC when:

* The use case requires near real-time data freshness.
* Full reloads are too slow or too costly for the source system.
* You need to preserve the history of every change (insert, update, or delete).
* The source is a transactional database that exposes a transaction log or change stream.

If the source does not support CDC (no WAL, binlog, oplog, or CDC tables), use [batch movement](/build/stage-1-discover/bring-data-in/batch-ingestion.md) instead.

***

## Before you start

CDC requires source-side setup before the pipeline can run.

<table><thead><tr><th width="180.1395263671875">Source</th><th>Required setup</th></tr></thead><tbody><tr><td><strong>PostgreSQL</strong></td><td><code>wal_level = logical</code>, a dedicated replication slot, and a user with the <code>REPLICATION</code> role attribute.</td></tr><tr><td><strong>MySQL</strong></td><td>Binary logging enabled (<code>binlog_format = ROW</code>) and a user with the <code>REPLICATION SLAVE</code> privilege.</td></tr><tr><td><strong>MongoDB</strong></td><td>Replica set enabled; a user with <code>read</code> on the target database and the <code>local</code> database.</td></tr><tr><td><strong>Microsoft SQL Server (MSSQL)</strong></td><td>CDC feature enabled on the database and target tables; SQL Server Agent running.</td></tr><tr><td><strong>IBM Db2</strong></td><td>Db2 change data capture tables enabled for the target schema.</td></tr></tbody></table>

Also confirm:

* A DataOS Depot exists for the source and destination.
* A compute profile is available.

See the [CDC sources reference](/concepts/resources/nilus/cdc/cdc-sources.md) for full per-source prerequisites.
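
As an illustration of the PostgreSQL prerequisite, the following is a minimal sketch of a pre-flight check for `wal_level`. The helper name `check_wal_level` and the `PG_URL` variable are hypothetical and not part of DataOS or Nilus; the sketch assumes the `psql` client is installed and can reach the source database.

```bash
# Sketch: verify that a PostgreSQL source reports wal_level = logical.
# check_wal_level is a hypothetical helper, not a DataOS or Nilus command.

check_wal_level() {
  # $1: the value returned by `SHOW wal_level;`
  local level="$1"
  if [ "$level" = "logical" ]; then
    echo "ok: wal_level is logical"
    return 0
  fi
  echo "error: wal_level is '${level}', expected 'logical'" >&2
  return 1
}

# Usage against a live database (PG_URL is an assumed connection string):
#   check_wal_level "$(psql -Atc 'SHOW wal_level;' "$PG_URL")"
```

Keeping the comparison in a function that takes the value as an argument lets you run the same check in a CI job before the manifest is applied.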

***

{% stepper %}
{% step %}

## Choose your source system for CDC

CDC is supported for the following five sources.

<table data-view="cards"><thead><tr><th align="center"></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td align="center">PostgreSQL</td><td></td></tr><tr><td align="center">MySQL</td><td></td></tr><tr><td align="center">MongoDB</td><td></td></tr><tr><td align="center">IBM Db2</td><td></td></tr><tr><td align="center">Microsoft SQL Server</td><td></td></tr></tbody></table>
{% endstep %}

{% step %}

## Write the manifest

Create a `nilus` resource manifest with `spec.type: cdc`.

```yaml
name: ${{pipeline-name}}
version: v1alpha
type: nilus
tags:
  - nilus-cdc
description: ${{description}}

spec:
  type: cdc
  compute: ${{compute-profile}}
  logLevel: INFO

  resources:
    requests:
      cpu: "200m"
      memory: "256Mi"

  source:
    address: dataos://${{source-depot}}?purpose=rw
    options:
      strategy: flatten            # flatten nested events into rows
      max_table_nesting: "0"
    cdc:
      table.include.list: "${{schema.table}}"    # for SQL sources
      topic.prefix: "${{prefix}}"                # required; keep stable in production
      slot.name: "${{slot-name}}"                # PostgreSQL only

  sink:
    address: dataos://${{sink-depot}}?purpose=rw
    options:
      dest_table: ${{schema.table}}
      incremental_strategy: append  # append | merge
```

{% hint style="warning" %}
`topic.prefix` is used as a connector identity and is appended to the destination table name. Do not change it after the pipeline is in production.

For PostgreSQL, each CDC pipeline must use a **unique** `slot.name`. Reusing a slot across pipelines causes replication conflicts.
{% endhint %}
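
If you keep several PostgreSQL CDC manifests in one directory, a quick scan can catch a reused slot before you apply anything. The following is a rough sketch, not a DataOS feature: `find_duplicate_slots` is a hypothetical helper, and it assumes each manifest is a `*.yaml` file that declares `slot.name` on a single line, as in the template above.

```bash
# Sketch: list slot.name values that appear in more than one manifest.
# find_duplicate_slots is a hypothetical helper, not a DataOS or Nilus command.

find_duplicate_slots() {
  # $1: directory containing manifest *.yaml files
  local dir="$1"
  # 1. keep only the slot.name lines
  # 2. strip the key, any trailing comment, trailing spaces, and quotes
  # 3. print each value that occurs more than once
  grep -h -E '^[[:space:]]*slot\.name:' "$dir"/*.yaml 2>/dev/null \
    | sed -E 's/^[[:space:]]*slot\.name:[[:space:]]*//; s/[[:space:]]*#.*$//; s/[[:space:]]*$//; s/^"//; s/"$//' \
    | sort \
    | uniq -d
}

# Usage (the directory name is an assumption):
#   find_duplicate_slots ./manifests
```

An empty result means every manifest in the directory uses a distinct slot name. The check only compares local files; it does not inspect slots that already exist on the database.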

**Sink strategies for CDC**

| Strategy | When to use                                                                 |
| -------- | --------------------------------------------------------------------------- |
| `append` | Write every change event as a new row. Preserves full change history.       |
| `merge`  | Upsert using the primary key. Destination table reflects the current state. |

For the full attribute reference, see [CDC configuration](/concepts/resources/nilus/cdc/service-config.md).
{% endstep %}

{% step %}

## Apply the pipeline

```bash
dataos-ctl resource apply -f ${{path-to-manifest.yaml}}
```

Confirm the resource is active:

```bash
dataos-ctl resource get -t nilus -a
# or
dataos-ctl resource get -t nilus -n ${{pipeline-name}}
```

{% endstep %}
{% endstepper %}

***

**Related reference**

* [Supported CDC sources](/concepts/resources/nilus/cdc/cdc-sources.md): per-source prerequisites, snapshot modes, and troubleshooting.
* [Supported batch sources](/concepts/resources/nilus/batch/batch-sources.md): per-connector prerequisites, options, and YAML examples.
* [Supported destinations](/concepts/resources/nilus/destinations.md): configure the sink for your target system.
* [CDC configuration](/concepts/resources/nilus/cdc/service-config.md): full CDC attribute table.
* [Schema evolution](/concepts/resources/nilus/concepts/schema-evolution.md): how Nilus handles column additions and type changes.
* [Data masking](/concepts/resources/nilus/concepts/understanding-data-masking.md): mask or redact columns during movement.

