# Worker

A Worker is a long-running DataOS workload for background execution. Use a Worker when a process must keep running without exposing a Service endpoint, such as a stream processor, queue consumer, scheduler loop, or custom container that continuously performs work.

Workers are different from Workflows. A Workflow executes one or more jobs and then completes. A Worker is reconciled as a running workload and remains active until it is updated, suspended, or deleted.

Workers are also different from Services. A Service exposes one or more network ports for consumers. A Worker runs background logic and does not need to expose a port.

## Prerequisites

To create a Worker, you need a tenant-specific role (**Tenant Admin** or **Data Developer**).

To use a Worker, you need resource-specific permission granted by the Worker owner.

## When to use

Use a Worker when you need to:

* Run a long-lived process without an exposed endpoint.
* Continuously consume or process events.
* Run a containerized background loop.
* Keep one or more replicas of the same background process active.
* Run stream processing or sinking logic using a supported stack.

Common examples include stream processors, queue consumers, data sinkers, schedulers, and monitoring or reconciliation loops.
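As a rough illustration of the kind of process a Worker keeps alive, the following Python sketch shows a background loop that does a unit of work, waits, and exits cleanly when asked to stop. The function names are hypothetical, and the assumption that the runtime signals shutdown with `SIGTERM` (as most container runtimes do) is not confirmed by this page.

```python
import signal
import threading
from typing import Callable


def run_worker_loop(
    do_work: Callable[[], None],
    stop: threading.Event,
    interval_seconds: float = 30.0,
) -> int:
    """Call do_work repeatedly until stop is set; return the iteration count."""
    iterations = 0
    while not stop.is_set():
        do_work()
        iterations += 1
        # Event.wait returns as soon as stop is set, so shutdown is not
        # delayed by the full sleep interval.
        stop.wait(interval_seconds)
    return iterations


def install_signal_handlers(stop: threading.Event) -> None:
    """Set the stop event on SIGTERM or SIGINT (assumed shutdown signals).

    Must be called from the main thread of the process.
    """
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, lambda signum, frame: stop.set())
```

In a container entry point you would create one `threading.Event`, pass it to `install_signal_handlers`, and then call `run_worker_loop` with your own work function. Using an event instead of `time.sleep` lets the loop react to a stop request immediately.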

## Manifest structure

A Worker manifest uses the common Resource metadata fields and a `spec` section that defines how the Worker runs.

{% tabs %}
{% tab title="Syntax" %}

```yaml
version: v2alpha
name: ${{worker-name}}
type: worker
tags:
  - ${{tag}}
description: ${{description}}
spec:
  compute: ${{compute-resource}}
  replicas: ${{replica-count}}
  executionMode: ${{execution-mode}}
  resources:
    requests:
      cpu: ${{cpu-request}}
      memory: ${{memory-request}}
    limits:
      cpu: ${{cpu-limit}}
      memory: ${{memory-limit}}
  stack: ${{stack-name}}
  stackSpec:
    ${{stack-specific-configuration}}
```

{% endtab %}

{% tab title="Example" %}

```yaml
version: v2alpha
name: sleep-worker
type: worker
tags:
  - test
  - worker
  - container
description: "Minimal long-running worker"
spec:
  compute: runnable-default
  replicas: 1
  executionMode: default
  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 200m
      memory: 256Mi
  stack: container
  stackSpec:
    image: docker.io/library/alpine:3.20
    command:
      - sh
    arguments:
      - -c
      - "while true; do echo \"$(date -u +%FT%TZ) sleep-worker heartbeat\"; sleep 30; done"
```

{% endtab %}
{% endtabs %}
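The `${{...}}` entries in the Syntax tab are placeholders for values you supply. If you keep manifests as templates, a small local helper can fill them in and report any that remain. This is a minimal sketch of such a helper, not a description of how DataOS itself resolves placeholders; `render_placeholders` and `unresolved_placeholders` are hypothetical names.

```python
import re

# Matches ${{name}} placeholders, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\$\{\{\s*([A-Za-z0-9_.-]+)\s*\}\}")


def render_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace each ${{key}} with values[key]; raise KeyError for unknown keys."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        return values[key]

    return PLACEHOLDER.sub(substitute, template)


def unresolved_placeholders(text: str) -> list[str]:
    """Return the sorted, de-duplicated placeholder names still present in text."""
    return sorted(set(PLACEHOLDER.findall(text)))
```

Calling `unresolved_placeholders` on the rendered text before applying a manifest is a cheap way to catch a template value that was never filled in.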

## Core concepts

### Replicas

`replicas` controls how many Worker instances DataOS maintains. Increase the replica count when the Worker logic can safely run in parallel.

```yaml
spec:
  replicas: 1
```

### Execution mode

`executionMode` defines how the Worker is executed by the runtime. Use `default` unless the Worker stack or platform guidance requires another mode.

```yaml
spec:
  executionMode: default
```

### Resources

Use `resources.requests` and `resources.limits` to control CPU and memory allocation for the Worker runtime.

```yaml
spec:
  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 200m
      memory: 256Mi
```
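The values above look like Kubernetes-style quantities (`m` for millicores, `Mi` for mebibytes). Assuming that notation applies, and assuming requests should not exceed limits as in Kubernetes, the following sketch converts the strings to numbers and checks the relationship before you apply a manifest. The helper names are hypothetical, and only a subset of suffixes is handled.

```python
# Suffix multipliers from Kubernetes quantity notation (assumed to apply here).
_MEMORY_UNITS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def cpu_millicores(value) -> int:
    """Convert a CPU quantity such as '200m', '0.5', or 2 to millicores."""
    text = str(value).strip()
    if text.endswith("m"):
        return int(text[:-1])
    return round(float(text) * 1000)


def memory_bytes(value) -> int:
    """Convert a memory quantity such as '64Mi' or '1G' to bytes."""
    text = str(value).strip()
    # Try two-letter suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * _MEMORY_UNITS[suffix])
    return int(text)


def requests_within_limits(resources: dict) -> bool:
    """Return True when each request that has a matching limit does not exceed it."""
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    checks = []
    if "cpu" in requests and "cpu" in limits:
        checks.append(cpu_millicores(requests["cpu"]) <= cpu_millicores(limits["cpu"]))
    if "memory" in requests and "memory" in limits:
        checks.append(memory_bytes(requests["memory"]) <= memory_bytes(limits["memory"]))
    return all(checks)
```

With the values from the example, `50m` and `200m` become 50 and 200 millicores, and `64Mi` and `256Mi` become 67108864 and 268435456 bytes, so the check passes.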

### Stack and stackSpec

`stack` selects the runtime stack. `stackSpec` contains stack-specific configuration. For the `container` stack, provide the container image, command, and arguments.

```yaml
spec:
  stack: container
  stackSpec:
    image: docker.io/library/alpine:3.20
    command:
      - sh
    arguments:
      - -c
      - "while true; do echo heartbeat; sleep 30; done"
```

Older Worker examples commonly used the Bento or Fast Fun stacks. The following table summarizes the purpose of each stack, including the `container` stack:

| Stack     | Worker purpose                                                                                                           |
| --------- | ------------------------------------------------------------------------------------------------------------------------ |
| Bento     | Runs long-lived stream processing that can read from a source, transform events, and write to a sink.                    |
| Fast Fun  | Sinks data from Pulsar-type depots such as Fastbase or system streams to Lakehouse storage or Iceberg-compatible depots. |
| Container | Runs a custom container image as a long-running background process.                                                      |

The fields under `stackSpec` are determined by the selected stack.

### Projections

Use `use.projection` when the Worker needs environment variables, Secrets, or files at runtime. A verification Workflow can also use projections to inject scripts and runtime configuration.

```yaml
spec:
  use:
    projection:
      projections:
        envVarsTemplate: |
          WORKER_NAME: ${{worker-name}}
          FQDN: https://${{dataos-fqdn}}
        files:
          - name: verify.py
            directory: /etc/app
            template: |
              print("verify worker")
```

### Storage

Workers can use inline disk storage or mount existing Volume Resources when a background process needs local or persistent files.

* Use `disk` when storage belongs to the Worker lifecycle.
* Use `use.volumes` when data must persist independently of the Worker.

See also: [Provision inline storage](/concepts/resources/worker.md) and [Persist storage across workloads](/concepts/resources/worker.md).

## Example: Sleep Worker

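This example runs a minimal Alpine container as a long-running Worker. The command prints a heartbeat every 30 seconds. `${WORKER_NAME}` and `${COMPUTE_NAME}` are placeholders for the Worker name and the Compute Resource.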

```yaml
version: v2alpha
name: ${WORKER_NAME}
type: worker
tags:
  - test
  - worker
  - container
description: "Minimal long-running worker (alpine sleep loop) for poros worker-reconciler smoke test"
spec:
  compute: ${COMPUTE_NAME}
  replicas: 1
  executionMode: default
  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 200m
      memory: 256Mi
  stack: container
  stackSpec:
    image: docker.io/library/alpine:3.20
    command:
      - sh
    arguments:
      - -c
      - "while true; do echo \"$(date -u +%FT%TZ) sleep-worker heartbeat\"; sleep 30; done"
```

## Deploy with a Bundle

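For Workers that need verification or dependencies, package the resources in a Bundle so that they are applied in dependency order. In the following example, the verification Workflow depends on the Worker, and its `dependencyConditions` require the Worker's status to be `active` and its runtime to contain `running`.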

```yaml
version: v2alpha
name: test-worker-sleep
type: bundle
description: "Integration test: deploy a minimal long-running worker (container stack) and verify pod runtime is running"
tags:
  - test
  - worker
  - container
bundle:
  resources:
    - id: sleep-worker
      file: tests/worker/sleep-worker/worker.yaml

    - id: verify-sleep-worker
      file: tests/worker/sleep-worker/verify-workflow.yaml
      dependencies:
        - sleep-worker
      dependencyConditions:
        - resourceId: sleep-worker
          status:
            is:
              - active
          runtime:
            contains:
              - "running"
```
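To make the ordering concrete, the sketch below derives an apply order from the `id` and `dependencies` entries of the Bundle above using a topological sort. It illustrates the idea only and is not the platform's implementation; `apply_order` is a hypothetical name.

```python
from graphlib import TopologicalSorter


def apply_order(resources: list[dict]) -> list[str]:
    """Return resource ids ordered so that each dependency precedes its dependents."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {r["id"]: set(r.get("dependencies", [])) for r in resources}
    return list(TopologicalSorter(graph).static_order())


# The two resources from the Bundle, deliberately listed out of order.
bundle_resources = [
    {"id": "verify-sleep-worker", "dependencies": ["sleep-worker"]},
    {"id": "sleep-worker"},
]
```

Here `apply_order(bundle_resources)` places `sleep-worker` before `verify-sleep-worker`. A circular dependency raises `graphlib.CycleError`, which is a useful early check on a Bundle definition.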

## Verify with a Workflow

A verification Workflow can fetch Worker metadata and assert that the Worker is active and running. The following fragment shows the key runtime checks from the verification script.

```python
status_obj = data.get("status") or {}
top_status = status_obj.get("aggregateStatus")
runtime_state = status_obj.get("runtimeState") or {}
runtime = runtime_state.get("status") or ""

if top_status != "active":
    errors.append(f"expected status=active, got {top_status!r}")
if "running" not in runtime:
    errors.append(f"expected 'running' in runtime, got {runtime!r}")
```
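The fragment relies on `data` and `errors` being defined elsewhere in the script. For local experimentation, the same checks can be wrapped in a self-contained function, as sketched here. The function name and the sample payload are hypothetical; only the keys read by the fragment (`status`, `aggregateStatus`, `runtimeState`) come from the source.

```python
def check_worker(data: dict) -> list[str]:
    """Return error messages for a Worker metadata payload; empty means healthy."""
    errors: list[str] = []
    # `or {}` guards against keys that are present but set to null.
    status_obj = data.get("status") or {}
    top_status = status_obj.get("aggregateStatus")
    runtime_state = status_obj.get("runtimeState") or {}
    runtime = runtime_state.get("status") or ""

    if top_status != "active":
        errors.append(f"expected status=active, got {top_status!r}")
    # Substring match, mirroring the Bundle's `runtime.contains` condition.
    if "running" not in runtime:
        errors.append(f"expected 'running' in runtime, got {runtime!r}")
    return errors


# Hypothetical payload: the runtime string is illustrative, not a documented format.
sample = {
    "status": {
        "aggregateStatus": "active",
        "runtimeState": {"status": "running:1"},
    }
}
```

A verification script could exit with a non-zero status when `check_worker` returns a non-empty list, so that the Workflow run fails visibly.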

## Apply and manage

Apply the Worker:

```bash
dataos-ctl resource apply -f worker.yaml
```

Apply a Bundle:

```bash
dataos-ctl resource apply -f bundle.yaml
```

Check the Worker:

```bash
dataos-ctl resource get -t worker -n ${{worker-name}} -d
```

List Workers (add `-a` to include Workers owned by other users):

```bash
dataos-ctl resource get -t worker
dataos-ctl resource get -t worker -a
```

Get Worker logs:

```bash
dataos-ctl resource log -t worker -n ${{worker-name}}
```

Delete the Worker:

```bash
dataos-ctl resource delete -t worker -n ${{worker-name}}
```

Delete by identifier:

```bash
dataos-ctl resource delete -i "${{worker-name}}|v2alpha|worker"
```
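The identifier in the command above joins the Resource name, version, and type with `|`. If you generate such commands from a script, a small helper keeps the format in one place. This sketch assumes the three-part layout shown in the command; the function names are hypothetical.

```python
def resource_identifier(
    name: str, version: str = "v2alpha", resource_type: str = "worker"
) -> str:
    """Build a 'name|version|type' identifier, rejecting empty or ambiguous parts."""
    for part in (name, version, resource_type):
        if not part or "|" in part:
            raise ValueError(f"invalid identifier part: {part!r}")
    return "|".join((name, version, resource_type))


def parse_resource_identifier(identifier: str) -> tuple[str, str, str]:
    """Split a 'name|version|type' identifier back into its three parts."""
    parts = identifier.split("|")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'name|version|type', got {identifier!r}")
    return parts[0], parts[1], parts[2]
```

Remember to quote the identifier on the command line, as the example does, because `|` is a pipe character in most shells.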

Delete by manifest:

```bash
dataos-ctl resource delete -f worker.yaml
```

## Field reference

| Field                     | Description                                                                   | Required |
| ------------------------- | ----------------------------------------------------------------------------- | -------- |
| `version`                 | Manifest version. Use `v2alpha`.                                              | Yes      |
| `name`                    | Worker Resource name.                                                         | Yes      |
| `type`                    | Resource type. Use `worker`.                                                  | Yes      |
| `tags`                    | Labels for grouping and discovery.                                            | No       |
| `description`             | Human-readable description of the Worker.                                     | No       |
| `spec`                    | Worker runtime specification.                                                 | Yes      |
| `spec.compute`            | Compute Resource used to run the Worker.                                      | Yes      |
| `spec.replicas`           | Number of Worker replicas to maintain.                                        | No       |
| `spec.executionMode`      | Runtime execution mode. Use `default` for standard Workers.                   | No       |
| `spec.resources.requests` | CPU and memory requested for the Worker runtime.                              | No       |
| `spec.resources.limits`   | CPU and memory limits for the Worker runtime.                                 | No       |
| `spec.disk`               | Inline disk mounted into the Worker.                                          | No       |
| `spec.use.volumes`        | Existing Volume mounts consumed by the Worker.                                | No       |
| `spec.use.projection`     | Projects Secrets, environment variables, and files into the runtime.          | No       |
| `spec.stack`              | Runtime stack used by the Worker, such as `container`, `bento`, or `fastfun`. | Yes      |
| `spec.stackSpec`          | Stack-specific runtime configuration.                                         | Yes      |

