> For the complete documentation index, see [llms.txt](https://v2.dataos.info/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://v2.dataos.info/concepts/resources/nilus/stream.md).

# Stream Pipelines

Not every source can be read on a schedule. Event-streaming systems such as Kafka and NATS produce records continuously, and a batch snapshot approach would mean delayed or partial data. Nilus stream mode keeps a consumer connected to the source and processes new records in micro-batches as they arrive, writing them to the configured destination without waiting for a scheduled trigger.

## Overview

Nilus models stream ingestion as `type: nilus` with `spec.type: stream`. The manifest declares a source topic or stream, a sink, and the consumer settings, while Nilus handles the runtime service, offset tracking, and destination loading. Stream is the right fit when the source publishes events continuously and the destination needs to reflect those events with low lag. For the full list of `source.options` and `sink.options` keys, tables, and validation notes, see [Understanding Stream Pipeline Config](https://github.com/moderndatacompany/dataos/blob/main/documentation/concepts/resources/nilus/stream/pipeline-config.md). That page is the single reference for stream manifest authoring; this page explains the conceptual model only.

## Core Capabilities

Stream mode is built around the following capabilities:

### Continuous Consumption

* Stream pipelines stay connected after startup and pull new records indefinitely when `run_in_loop: true` is set.
* Nilus tracks consumer positions (Kafka consumer-group offsets, NATS JetStream sequence numbers) so a restart resumes from the last committed position without reprocessing old records.
* Consumer identity (`group_id` for Kafka, `durable=` for NATS) must stay stable across deployments for offset continuity to hold.

### Micro-Batch Model

* Records are pulled from the source in configurable batches before being written to the sink.
* `batch_size` and batch timeout control how many records are accumulated per write and how long Nilus waits to fill a batch on a quiet topic.
* This model balances throughput against write latency and makes it straightforward to tune for different traffic profiles.

### Flexible Destination Writes

* Stream sinks follow the same destination model as batch and CDC pipelines.
* `incremental_strategy: append` is the default for event-shaped data.
* `merge` is supported when the destination has a stable primary key and deduplication is required.

## Flow

1. Nilus connects to the source broker or cluster using the configured URI and projected credentials.
2. A consumer pulls a micro-batch of records from the topic or stream.
3. Nilus applies any configured masking or shaping to the batch.
4. The sink writes the batch to the destination and Nilus commits the consumer offset.
5. Steps 2–4 repeat continuously while `run_in_loop: true` is set.

## Constraints

* Stream pipelines are long-running services, not cron jobs. Do not add a `spec.schedule` block.
* `run_in_loop: true` must be set explicitly under `source.options`; without it, Nilus exits after the first pull.
* Source offset retention governs how far back a restarted pipeline can resume. If the consumer falls behind beyond the retention window, a re-snapshot may be required.
* Destination behavior still depends on the chosen sink; the destination page is the source of truth for target-specific options.

## Related Docs

* [Understanding Stream Pipeline Config](https://github.com/moderndatacompany/dataos/blob/main/documentation/concepts/resources/nilus/stream/pipeline-config.md)
* [Stream Sample Configs](/concepts/resources/nilus/stream/sample-configs.md)
* [Sources](/concepts/resources/nilus/sources.md)
* [Destinations](/concepts/resources/nilus/destinations.md)


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://v2.dataos.info/concepts/resources/nilus/stream.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
