# Observability

Nilus observability is the set of signals that helps operators answer four questions: did the pipeline run, did it move the expected data, is it getting slower, and what should be inspected next?

## Observability Surfaces

| Surface            | What it answers                                                                | Reference                                                                                   |
| ------------------ | ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------- |
| Pipeline logs      | Runtime errors, connector messages, stack traces, and final push status.       | [Checking Logs](/concepts/resources/nilus/troubleshooting/checking-logs.md)                 |
| Main API           | Run history, tenant-scoped pipeline listings, CDC offsets, and schema history. | [Observability API Endpoints](/concepts/resources/nilus/observability/api-endpoints.md)     |
| Metrics server     | Prometheus scrape target, health, and readiness for metrics.                   | [Exposed Prometheus Metrics](/concepts/resources/nilus/observability/prometheus-metrics.md) |
| Grafana dashboards | Triage, baseline drift, throughput, resource pressure, and freshness views.    | [Grafana Dashboards](/concepts/resources/nilus/observability/grafana-dashboard.md)          |
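
Because the metrics server is a Prometheus scrape target, its output should follow the Prometheus text exposition format. The following is a minimal sketch of reading such a payload. The sample text, the `resource_id` label, and the helper name `parse_samples` are made up for illustration, and real Nilus series names and labels may differ. The parser is deliberately simplified and does not handle escaped quotes inside label values.

```python
import re

# Made-up sample payload in the Prometheus text exposition format.
# The label name and values are illustrative assumptions, not real Nilus output.
SAMPLE = """\
# HELP pipeline_status Latest run status.
# TYPE pipeline_status gauge
pipeline_status{resource_id="demo-pipeline"} 1
duration_sec{resource_id="demo-pipeline"} 42.5
records_processed{resource_id="demo-pipeline"} 1000
"""

# metric_name, optional {labels}, then the sample value.
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples, skipping comments and blank lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_samples(SAMPLE):
        print(name, labels, value)
```

In practice a Prometheus server or an existing client library would do this parsing; the sketch only shows the shape of the data behind the dashboards.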

## Core Signals

| Signal                        | Healthy shape                                   | Investigate when                                               |
| ----------------------------- | ----------------------------------------------- | -------------------------------------------------------------- |
| `pipeline_status`             | Latest run reports success.                     | It is `0`, absent, or stale.                                   |
| `duration_sec`                | Stable for similar input volumes.               | It grows faster than row count or exceeds the schedule window. |
| `records_processed`           | Matches the expected source volume.             | It drops unexpectedly or is much larger than expected.         |
| `cpu_percent` and `memory_mb` | Resource use correlates with useful throughput. | Resource use rises while throughput falls.                     |
| Push age                      | Recent for active resources.                    | The series has not refreshed after a run or service heartbeat. |
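
The "Investigate when" column can be read as a set of checks against a baseline run. The sketch below is one possible encoding, assuming the signals have already been collected into plain values. The `RunSignals` container, the `push_age_sec` field, and every threshold (`volume_tolerance`, `growth_margin`, the schedule window, and the maximum push age) are hypothetical choices for illustration, not values defined by Nilus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunSignals:
    """Hypothetical snapshot of one run; field names mirror the signal table."""
    pipeline_status: Optional[int]  # None models an absent series
    duration_sec: float
    records_processed: int
    cpu_percent: float
    memory_mb: float
    push_age_sec: float  # assumed: seconds since the series last refreshed


def throughput(s: RunSignals) -> float:
    """Records per second; 0.0 when the duration is not positive."""
    return s.records_processed / s.duration_sec if s.duration_sec > 0 else 0.0


def signals_to_investigate(
    current: RunSignals,
    baseline: RunSignals,
    schedule_window_sec: float,
    max_push_age_sec: float,
    volume_tolerance: float = 0.5,  # assumed: flag volume changes beyond +/-50%
    growth_margin: float = 1.2,     # assumed: slack before flagging a slowdown
) -> list[str]:
    """Return the table rows whose 'Investigate when' condition holds."""
    findings = []
    if current.pipeline_status is None:
        findings.append("pipeline_status is absent")
    elif current.pipeline_status == 0:
        findings.append("pipeline_status is 0")
    if current.push_age_sec > max_push_age_sec:
        findings.append("push age is stale")
    if current.duration_sec > schedule_window_sec:
        findings.append("duration_sec exceeds the schedule window")
    if baseline.duration_sec > 0 and baseline.records_processed > 0:
        duration_growth = current.duration_sec / baseline.duration_sec
        volume_growth = current.records_processed / baseline.records_processed
        if duration_growth > volume_growth * growth_margin:
            findings.append("duration_sec grows faster than row count")
        if abs(volume_growth - 1.0) > volume_tolerance:
            findings.append("records_processed is outside the expected volume")
    resources_up = (current.cpu_percent > baseline.cpu_percent
                    or current.memory_mb > baseline.memory_mb)
    if resources_up and throughput(current) < throughput(baseline):
        findings.append("resource use rises while throughput falls")
    return findings
```

Comparing growth ratios rather than absolute values keeps the duration check meaningful when input volume changes between runs, which matches the "stable for similar input volumes" framing in the table.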

## Triage Flow

1. Start with the latest run status and run age.
2. If the run failed, open the [pipeline logs](/concepts/resources/nilus/troubleshooting/checking-logs.md) before changing any configuration.
3. If the run succeeded but the dataset is wrong, compare source row counts, destination row counts, and write strategy.
4. If the run is slow, compare duration, throughput, extract time, load time, CPU, and memory.
5. If CDC is involved, check source retention, connector lag, offsets, and schema history.
6. Record the resource ID, tenant, owner, pipeline mode, and exact error before escalating.
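
The flow above can be sketched as a small decision function that returns the checks in order. This is an illustrative encoding only: `RunObservation` and `triage_checks` are hypothetical names, and the booleans stand in for whatever the operator has already observed.

```python
from dataclasses import dataclass


@dataclass
class RunObservation:
    """Hypothetical summary of what the operator has observed so far."""
    run_failed: bool
    dataset_wrong: bool
    run_slow: bool
    uses_cdc: bool


def triage_checks(obs: RunObservation) -> list[str]:
    """Return the checks to perform, in the order the triage flow lists them."""
    checks = ["latest run status and run age"]
    if obs.run_failed:
        checks.append("pipeline logs (before changing config)")
    elif obs.dataset_wrong:
        # Only reached when the run succeeded but the data looks wrong.
        checks.append("source row counts, destination row counts, write strategy")
    if obs.run_slow:
        checks.append("duration, throughput, extract time, load time, CPU, memory")
    if obs.uses_cdc:
        checks.append("source retention, connector lag, offsets, schema history")
    checks.append("record resource ID, tenant, owner, pipeline mode, exact error")
    return checks


if __name__ == "__main__":
    for step in triage_checks(RunObservation(False, True, True, False)):
        print("-", step)
```

The `elif` mirrors the wording of step 3: row-count comparison applies to runs that succeeded, while failed runs go to the logs first.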

## Related Docs

* [Observability API Endpoints](/concepts/resources/nilus/observability/api-endpoints.md)
* [Exposed Prometheus Metrics](/concepts/resources/nilus/observability/prometheus-metrics.md)
* [Grafana Dashboards](/concepts/resources/nilus/observability/grafana-dashboard.md)
* [Troubleshooting](/concepts/resources/nilus/troubleshooting.md)

