
# Command reference

## Lakehouse resource management commands

Reference for the commands you use to manage Lakehouses in DataOS.

{% hint style="info" %}
If you do not have the required permissions, contact your DataOS Operator.
{% endhint %}

### Applying a Lakehouse

Applying the Lakehouse Resource manifest creates a Lakehouse Resource instance in the DataOS environment.

**Command**

```bash
dataos-ctl resource apply -f ${manifest-file-path}
```

**Flags**

| Flag                      | Description                                                |
| ------------------------- | ---------------------------------------------------------- |
| `-f`, `--manifestFile`    | Manifest file location (**required**)                      |
| `-l`, `--lint`            | Lint the files without applying                            |
| `-d`, `--de-ref`          | De-reference the files without applying                    |
| `-R`, `--recursive`       | Get manifest files recursively from the provided directory |
| `--disable-interpolation` | Disable interpolation of `$ENV` / `${ENV}` variables       |
| `-r`, `--re-run`          | Re-run resource after apply                                |

**Example**

```bash
dataos-ctl resource apply -f ./lakehouse/s3-lakehouse.yaml
```
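
The `-l` flag can act as a pre-flight check. The following is a minimal sketch, not part of the CLI: the function name `apply_lakehouse` and the `DATAOS_CTL` variable are illustrative assumptions. It lints the manifest and applies it only if linting succeeds.

```bash
# Path to the CLI binary; overridable for dry runs (an illustrative convention, not a CLI feature).
DATAOS_CTL="${DATAOS_CTL:-dataos-ctl}"

# Lint the manifest first (-l), then apply it only if linting succeeds.
apply_lakehouse() {
  local manifest="$1"
  "$DATAOS_CTL" resource apply -f "$manifest" -l &&
    "$DATAOS_CTL" resource apply -f "$manifest"
}
```

Call it as `apply_lakehouse ./lakehouse/s3-lakehouse.yaml`. Setting `DATAOS_CTL=echo` beforehand prints the commands instead of running them.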

***

### Get Lakehouse status

Retrieve the status of Lakehouses to see their current operational state.

**Command**

```bash
dataos-ctl resource get -t lakehouse
```

**Flags**

| Flag                           | Description                                                               |
| ------------------------------ | ------------------------------------------------------------------------- |
| `-t`, `--type`                 | Type of resource to query (use `lakehouse`)                               |
| `-n`, `--name`                 | Name of a specific Lakehouse to query                                     |
| `-a`, `--all`                  | Get resources for all owners                                              |
| `-d`, `--details`              | Include detailed spec, runtime state, and active properties in the result |
| `-b`, `--buildResourceDetails` | Include build resource details (Kubernetes resources) in the result       |
| `-o`, `--owner`                | Get resources for a specific owner ID                                     |
| `--tags`                       | Filter resources with specific tags (comma separated)                     |
| `-r`, `--refresh`              | Auto refresh the results                                                  |
| `--refreshRate`                | Refresh rate in seconds (default: 5)                                      |

**List your Lakehouses:**

```bash
dataos-ctl resource get -t lakehouse
```

**List all Lakehouses (all owners):**

```bash
dataos-ctl resource get -t lakehouse -a
```

**Get a specific Lakehouse by name:**

```bash
dataos-ctl resource get -t lakehouse -n ${lakehouse-name}
```
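
The `-r` and `--refreshRate` flags from the table above can be combined to watch a Lakehouse while it starts up. This is a sketch under assumptions: the helper name `watch_lakehouse`, the `DATAOS_CTL` variable, and the 10-second default are illustrative, and `--refreshRate` is assumed to take its value as a separate argument.

```bash
# Path to the CLI binary; overridable for dry runs (illustrative convention).
DATAOS_CTL="${DATAOS_CTL:-dataos-ctl}"

# Watch one Lakehouse, refreshing every <rate> seconds (10 if not given).
watch_lakehouse() {
  local name="$1" rate="${2:-10}"
  "$DATAOS_CTL" resource get -t lakehouse -n "$name" -r --refreshRate "$rate"
}
```

For example, `watch_lakehouse s3pglh 30` refreshes the status of `s3pglh` every 30 seconds.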

***

### Inspect Lakehouse details

Two flags of `resource get` provide detailed output for debugging and operational visibility.

**Get detailed spec and runtime information (`-d`):**

```bash
dataos-ctl resource get -t lakehouse -n ${lakehouse-name} -d
```

The `-d` flag shows:

* All submitted spec details
* Runtime state and overall Lakehouse status
* Build/bid status
* **Active properties**: including internal URLs for each service (REST Catalog, Spark Cluster, Sherpa)

**Get build-level Kubernetes resource details (`-b`):**

```bash
dataos-ctl resource get -t lakehouse -n ${lakehouse-name} -b
```

The `-b` flag shows:

* Individual Kubernetes resources created during the build (secrets, services, StatefulSets, deployments, ingresses)
* Status of each Kubernetes resource

Use this view to debug a failed creation by identifying which component failed.
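
When troubleshooting, it can help to keep both views on disk for comparison or for sharing with an operator. The sketch below is one possible approach, not a CLI feature: the function name `capture_lakehouse_debug`, the `DATAOS_CTL` variable, and the output file names are all illustrative assumptions.

```bash
# Path to the CLI binary; overridable for dry runs (illustrative convention).
DATAOS_CTL="${DATAOS_CTL:-dataos-ctl}"

# Save the -d and -b views of one Lakehouse into a directory for later inspection.
capture_lakehouse_debug() {
  local name="$1" out_dir="${2:-./lakehouse-debug}"
  mkdir -p "$out_dir" || return 1
  "$DATAOS_CTL" resource get -t lakehouse -n "$name" -d > "$out_dir/$name-details.txt"
  "$DATAOS_CTL" resource get -t lakehouse -n "$name" -b > "$out_dir/$name-build.txt"
}
```

Running `capture_lakehouse_debug s3pglh` writes `s3pglh-details.txt` and `s3pglh-build.txt` under `./lakehouse-debug`.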

***

### Generate Lakehouse JSON schema

Generate the JSON schema of the Lakehouse Resource for a given version.

**Command**

```bash
dataos-ctl develop schema generate -t lakehouse -v ${version}
```

**Example**

```bash
dataos-ctl develop schema generate -t lakehouse -v v2alpha
```

***

### Get Lakehouse manifest schema

Retrieve the manifest schema of the Lakehouse Resource for a given version.

**Command**

```bash
dataos-ctl develop schema get manifest -t lakehouse -v ${version}
```

**Example**

```bash
dataos-ctl develop schema get manifest -t lakehouse -v v1alpha
```

***

### Deleting a Lakehouse

To delete a specific Lakehouse, use any of the following methods:

{% tabs %}
{% tab title="Method 1" %}
Delete by identifier string.

```bash
dataos-ctl resource delete -i "${lakehouse-name} | v2alpha | lakehouse"
```

{% endtab %}

{% tab title="Method 2" %}
Delete using the manifest file.

```bash
dataos-ctl resource delete -f ${manifest-file-path}
```

{% endtab %}

{% tab title="Method 3" %}
Delete by resource type and name.

```bash
dataos-ctl resource delete -t lakehouse -n ${lakehouse-name}
```

{% endtab %}
{% endtabs %}

**Flags**

| Flag                   | Description                                                           |
| ---------------------- | --------------------------------------------------------------------- |
| `-i`, `--identifier`   | Identifier of resource (`NAME\|VERSION\|TYPE` or `TYPE:VERSION:NAME`) |
| `-f`, `--manifestFile` | Manifest file location                                                |
| `-t`, `--type`         | Type of resource                                                      |
| `-n`, `--name`         | Name(s) of resource (supports multiple: `name1 name2 name3`)          |
| `--cascade`            | Cascade delete inbound dependencies                                   |
| `--force`              | Force delete even if dependencies exist                               |

**Example**

```bash
dataos-ctl resource delete -t lakehouse -n s3pglh
```

***

### TCP stream (port forwarding)

Open a TCP stream to forward requests from a local system to a service running inside the cluster. Use it to port-forward to the Spark Cluster for direct query submission.

**Command**

```bash
dataos-ctl resource tcp-stream -t lakehouse -n ${lakehouse-name}
```

***

## Lakehouse operations

Lakehouse operations perform maintenance and management tasks on Iceberg tables within a Lakehouse. Submit, track, and manage them through the `lakehouse operation` CLI command group.

**Tip**: use built-in help at any time:

```bash
dataos-ctl lakehouse --help
dataos-ctl lakehouse operation --help
```

### Full command tree

```
dataos-ctl lakehouse  (aliases: lh, lakehouses, lake-house)
├── namespace  (aliases: ns, namespaces)
│   └── list
├── table  (aliases: tb, tables)
│   ├── list
│   ├── get
│   ├── create
│   ├── schema
│   │   ├── get
│   │   ├── add-field
│   │   ├── drop-field
│   │   ├── rename-field
│   │   ├── update-field
│   │   └── set-nullable
│   ├── partition
│   │   ├── get
│   │   └── update
│   ├── snapshot
│   │   ├── list
│   │   ├── set
│   │   ├── rollback
│   │   └── cherrypick
│   ├── branch
│   │   ├── list
│   │   ├── create
│   │   ├── delete
│   │   ├── rename
│   │   ├── replace
│   │   └── fastforward
│   ├── metadata
│   │   ├── get
│   │   └── set
│   └── properties  (alias: props)
│       ├── get
│       ├── add
│       └── remove
└── operation  (aliases: op, operations)
    ├── apply
    ├── get
    └── delete
```

### Supported operation types

You can perform the following operations on Iceberg tables within a Lakehouse:

| Operation              | Description                                                             |
| ---------------------- | ----------------------------------------------------------------------- |
| `COMPACT`              | Compact small data files into larger ones for improved read performance |
| `COMPACT_DELETE_FILES` | Compact delete files to optimize storage                                |
| `EXPIRE_SNAPSHOTS`     | Clean up old snapshots to reclaim storage                               |
| `REWRITE_MANIFESTS`    | Rewrite manifest files for optimization                                 |
| `DROP_TABLE`           | Remove an Iceberg table from the Lakehouse                              |
| `REMOVE_ORPHAN_FILES`  | Remove orphan files that are no longer referenced by any snapshot       |

### Operation manifest format

Submit operations through a YAML manifest specifying a namespace, table, operation type, and operation-specific options:

```yaml
namespace: "${namespace}"
table: "${table-name}"
operation: "${operation-type}"
options:
  # key-value options specific to the operation
```

On submission, an **operation ID** is returned. Use it to track status.

***

### Apply an operation

Submit a new Lakehouse operation from a manifest file.

**Command**

```bash
dataos-ctl lakehouse operation apply [flags]
```

**Flags**

| Flag                    | Description                                        |
| ----------------------- | -------------------------------------------------- |
| `-n`, `--lakehouseName` | Name of the Lakehouse resource (**required**)      |
| `-f`, `--manifestFile`  | Path to the operation manifest file (**required**) |

**Example**

```bash
dataos-ctl lakehouse operation apply \
  -n s3pglh \
  -f ./manifests/compact-events.yaml
```
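
Writing the manifest and submitting it can be scripted in one step. The following is a minimal sketch, assuming a `COMPACT` operation with only the documented `rewrite_all` option. The function name `submit_compact`, the `DATAOS_CTL` variable, and the default manifest path are illustrative, not part of the CLI.

```bash
# Path to the CLI binary; overridable for dry runs (illustrative convention).
DATAOS_CTL="${DATAOS_CTL:-dataos-ctl}"

# Write a COMPACT manifest for one table and submit it to the given Lakehouse.
submit_compact() {
  local lakehouse="$1" namespace="$2" table="$3"
  local manifest="${4:-./compact-$table.yaml}"
  cat > "$manifest" <<EOF
namespace: $namespace
table: $table
operation: COMPACT
options:
  rewrite_all: true
EOF
  "$DATAOS_CTL" lakehouse operation apply -n "$lakehouse" -f "$manifest"
}
```

For example, `submit_compact s3pglh dataos city_nov_24` writes `./compact-city_nov_24.yaml` and submits it to the `s3pglh` Lakehouse.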

***

### Get operation(s)

Retrieve operation details. Get a single operation by ID or list multiple operations with optional filters.

**Command**

```bash
dataos-ctl lakehouse operation get [flags]
```

**Flags**

| Flag                    | Description                                                                                                                                |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| `-n`, `--lakehouseName` | Lakehouse resource name (**required**)                                                                                                     |
| `--id`                  | ID of a specific Lakehouse operation                                                                                                       |
| `--namespace`           | Filter by namespace                                                                                                                        |
| `--table`               | Filter by table                                                                                                                            |
| `--operation`           | Filter by operation type (`COMPACT`, `COMPACT_DELETE_FILES`, `EXPIRE_SNAPSHOTS`, `REWRITE_MANIFESTS`, `DROP_TABLE`, `REMOVE_ORPHAN_FILES`) |
| `--status`              | Filter by operation status (`QUEUED`, `RUNNING`, `COMPLETED`, `FAILED`)                                                                    |
| `--page`                | Page number (default: 1)                                                                                                                   |
| `--size`                | Page size (default: 10)                                                                                                                    |

**List all operations for a Lakehouse:**

```bash
dataos-ctl lakehouse operation get -n ${lakehouse-name}
```

**Get a specific operation by ID:**

```bash
dataos-ctl lakehouse operation get \
  -n ${lakehouse-name} \
  --id ${operation-id}
```

**Filter operations by namespace, table, type, and status:**

```bash
dataos-ctl lakehouse operation get -n ${lakehouse-name} \
  --namespace dataos \
  --table city_nov_24 \
  --operation COMPACT \
  --status COMPLETED
```

**Paginate results:**

```bash
dataos-ctl lakehouse operation get -n ${lakehouse-name} --page 2 --size 20
```
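
The filters combine naturally into small helpers for routine checks. As a sketch (the function name `list_operations_by_status` and the `DATAOS_CTL` variable are illustrative assumptions), the following lists the operations on one table that ended in a given status, defaulting to `FAILED`:

```bash
# Path to the CLI binary; overridable for dry runs (illustrative convention).
DATAOS_CTL="${DATAOS_CTL:-dataos-ctl}"

# List operations on one table with the given status (FAILED if not given).
list_operations_by_status() {
  local lakehouse="$1" namespace="$2" table="$3" status="${4:-FAILED}"
  "$DATAOS_CTL" lakehouse operation get -n "$lakehouse" \
    --namespace "$namespace" --table "$table" --status "$status" --size 20
}
```

For example, `list_operations_by_status s3pglh dataos city_nov_24 RUNNING` shows the operations currently running on that table.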

***

### Delete an operation

Delete an operation by its ID.

**Command**

```bash
dataos-ctl lakehouse operation delete [flags]
```

**Flags**

| Flag                    | Description                                            |
| ----------------------- | ------------------------------------------------------ |
| `-n`, `--lakehouseName` | Lakehouse resource name (**required**)                 |
| `--id`                  | ID of the Lakehouse operation to delete (**required**) |

**Example**

```bash
dataos-ctl lakehouse operation delete \
  -n ${lakehouse-name} \
  --id ${operation-id}
```

***

### Aliases and shortcuts

Use these shortened equivalents:

| Command Group | Aliases                          |
| ------------- | -------------------------------- |
| `lakehouse`   | `lh`, `lakehouses`, `lake-house` |
| `operation`   | `op`, `operations`               |
| `namespace`   | `ns`, `namespaces`               |
| `table`       | `tb`, `tables`                   |
| `properties`  | `props`                          |

**Examples using aliases:**

```bash
dataos-ctl lh op apply -n s3pglh -f ./op.yaml
dataos-ctl lh op get -n s3pglh
dataos-ctl lh op get -n s3pglh --id abc123
dataos-ctl lh op delete -n s3pglh --id abc123
dataos-ctl lh ns list -n s3pglh
dataos-ctl lh tb list -n s3pglh --namespace sales
```

***

### Operation manifest examples

Sample manifests for each supported operation type. The examples show a subset of the parameters each operation supports. For complete parameter references and advanced usage, see [Procedures - Apache Iceberg™](https://iceberg.apache.org/docs/latest/maintenance/).

#### Compact

Compact small data files into larger ones for improved read performance.

```yaml
namespace: dataos
table: city_nov_24
operation: COMPACT
options:
  delete_file_threshold: 1
  rewrite_all: true
```

**Options:**

| Option                  | Type    | Description                                          |
| ----------------------- | ------- | ---------------------------------------------------- |
| `delete_file_threshold` | integer | Minimum number of delete files to trigger compaction |
| `rewrite_all`           | boolean | Whether to rewrite all data files                    |

#### Compact delete files

Compact delete files to optimize storage and query performance.

```yaml
namespace: dataos
table: city_nov_24
operation: COMPACT_DELETE_FILES
options:
  delete_file_threshold: 1
  compression_factor: 1.5
  shuffle_partitions_per_file: 20
  use_starting_sequence_number: true
  remove_dangling_deletes: true
```

**Options:**

| Option                         | Type    | Description                                                    |
| ------------------------------ | ------- | -------------------------------------------------------------- |
| `delete_file_threshold`        | integer | Minimum number of delete files to trigger compaction           |
| `compression_factor`           | float   | Compression factor for file sizing                             |
| `shuffle_partitions_per_file`  | integer | Number of shuffle partitions per file                          |
| `use_starting_sequence_number` | boolean | Use starting sequence number for rewriting                     |
| `remove_dangling_deletes`      | boolean | Remove delete files that no longer reference active data files |

#### Expire snapshots

Clean up old snapshots to reclaim storage space.

```yaml
namespace: dataos
table: city_nov_24
operation: EXPIRE_SNAPSHOTS
options:
  older_than: "2025-11-24T00:00:00Z"
  retain_last: 1
  max_concurrent_deletes: 10
```

**Options:**

| Option                   | Type              | Description                                |
| ------------------------ | ----------------- | ------------------------------------------ |
| `older_than`             | string (ISO 8601) | Expire snapshots older than this timestamp |
| `retain_last`            | integer           | Number of most recent snapshots to retain  |
| `max_concurrent_deletes` | integer           | Maximum number of concurrent file deletes  |
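
For scheduled maintenance, the `older_than` cutoff is usually relative to the current time rather than a fixed date. The sketch below generates such a manifest. It assumes GNU `date` (for the `-d "<N> days ago"` syntax), and the function name `write_expire_manifest` is illustrative.

```bash
# Build an EXPIRE_SNAPSHOTS manifest whose cutoff is <days> days before now (UTC).
# Requires GNU date for the -d "<N> days ago" syntax.
write_expire_manifest() {
  local namespace="$1" table="$2" days="$3" manifest="$4"
  local cutoff
  cutoff="$(date -u -d "$days days ago" +%Y-%m-%dT%H:%M:%SZ)" || return 1
  cat > "$manifest" <<EOF
namespace: $namespace
table: $table
operation: EXPIRE_SNAPSHOTS
options:
  older_than: "$cutoff"
  retain_last: 1
EOF
}
```

For example, `write_expire_manifest dataos city_nov_24 7 ./expire.yaml` writes a manifest that expires snapshots older than seven days while retaining the most recent one. Submit it with `dataos-ctl lakehouse operation apply`.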

#### Rewrite manifests

Rewrite manifest files to optimize read planning and metadata performance.

```yaml
namespace: dataos
table: city_nov_24
operation: REWRITE_MANIFESTS
options:
  use_caching: true
```

**Options:**

| Option        | Type    | Description                                      |
| ------------- | ------- | ------------------------------------------------ |
| `use_caching` | boolean | Whether to use caching during manifest rewriting |

#### Drop table

Remove an Iceberg table from the Lakehouse.

```yaml
namespace: dataos
table: city_oct_29
operation: DROP_TABLE
options:
  purge: true
  cleanup_store: false
```

**Options:**

| Option          | Type    | Description                                                                 |
| --------------- | ------- | --------------------------------------------------------------------------- |
| `purge`         | boolean | If `true`, permanently deletes the table data files in addition to metadata |
| `cleanup_store` | boolean | If `true`, cleans up the storage location                                   |

#### Remove orphan files

Remove orphan files that are no longer referenced by any table snapshot.

```yaml
namespace: dataos
table: city_nov_24
operation: REMOVE_ORPHAN_FILES
options:
  older_than: "2025-11-24T00:00:00Z"
```

**Options:**

| Option       | Type              | Description                                   |
| ------------ | ----------------- | --------------------------------------------- |
| `older_than` | string (ISO 8601) | Remove orphan files older than this timestamp |

***

## Lakehouse table management

Use `dataos-ctl lakehouse` (alias `lh`) commands to manage namespaces, tables, schemas, partitions, snapshots, branches, metadata, and properties on Iceberg tables inside a Lakehouse.

**Prerequisites**: a configured DataOS CLI context, an applied Lakehouse resource, and a namespace to work with.

### Identifier format

Table-scoped commands (such as `table get`, `schema *`, `snapshot *`, `branch *`, `metadata *`, `properties *`) accept either:

* **`-i` (identifier)**: the full three-part `lakehouse:namespace:table` format (or `lakehouse|namespace|table` with pipe delimiters). When `-i` is set, all split flags (`-n`, `--namespace`, `--table`) are ignored.
* **`-n` + `--namespace` + `--table`**: individual flags used when `-i` is not set.

Commands with narrower scope (`namespace list`, `table list`) use only `-n` and `--namespace`.

```bash
# Using the colon delimiter
dataos-ctl lakehouse table get -i my-lakehouse:sales:orders

# Using the pipe delimiter
dataos-ctl lakehouse table get -i 'my-lakehouse|sales|orders'

# Using individual flags
dataos-ctl lakehouse table get -n my-lakehouse --namespace sales --table orders
```
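
To make the equivalence between the two forms concrete, the sketch below converts an identifier into the matching split flags. The function name `split_table_identifier` is illustrative, and the sketch assumes that none of the three parts contains a `:` or `|` character.

```bash
# Convert "lakehouse:namespace:table" (or the pipe form) into the equivalent split flags.
split_table_identifier() {
  local id="$1" lakehouse namespace table
  # Normalize pipes to colons, then split on ":".
  IFS=: read -r lakehouse namespace table <<EOF
$(printf '%s' "$id" | tr '|' ':')
EOF
  printf '%s %s --namespace %s --table %s\n' '-n' "$lakehouse" "$namespace" "$table"
}
```

Both `split_table_identifier my-lakehouse:sales:orders` and `split_table_identifier 'my-lakehouse|sales|orders'` print `-n my-lakehouse --namespace sales --table orders`.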

***

### Namespaces

#### namespace list

List all namespaces in a Lakehouse.

```bash
dataos-ctl lakehouse namespace list -n ${lakehouse-name}
```

**Example:**

```bash
dataos-ctl lakehouse namespace list -n my-lakehouse
```

***

### Tables

#### table list

List tables in a namespace. Requires `-n` and `--namespace`.

```bash
dataos-ctl lakehouse table list -n ${lakehouse-name} --namespace ${namespace}
```

**Example:**

```bash
dataos-ctl lakehouse table list -n my-lakehouse --namespace sales
```

#### table get

Get details of a specific table. Use `-d` for full JSON output.

```bash
dataos-ctl lakehouse table get -i ${lakehouse}:${namespace}:${table}
dataos-ctl lakehouse table get -i ${lakehouse}:${namespace}:${table} -d
```

**Example:**

```bash
dataos-ctl lakehouse table get -i my-lakehouse:sales:orders
```

#### table create

Create a table from a YAML manifest. Requires `-f` with the manifest path.

```bash
dataos-ctl lakehouse table create \
  -i ${lakehouse}:${namespace}:${table} \
  -f ${manifest-file-path}
```

**Example:**

```bash
dataos-ctl lakehouse table create \
  -i my-lakehouse:sales:orders \
  -f datasets/create-mapping-partitioned.yaml
```

See [Table Create Manifest](#table-create-manifest) for the manifest format.

***

### Schema

All schema commands require a table identifier in `lakehouse:namespace:table` format.

#### schema get

```bash
dataos-ctl lakehouse table schema get -i ${lakehouse}:${namespace}:${table}
```

#### schema add-field

Add a new field to the table schema.

```bash
# Simple field
dataos-ctl lakehouse table schema add-field \
  -i ${lakehouse}:${namespace}:${table} \
  --field-name ${field-name} --type ${type}

# Decimal with precision and scale
dataos-ctl lakehouse table schema add-field \
  -i ${lakehouse}:${namespace}:${table} \
  --field-name ${field-name} --type decimal -p ${precision} -s ${scale}

# Map field
dataos-ctl lakehouse table schema add-field \
  -i ${lakehouse}:${namespace}:${table} \
  --field-name ${field-name} --type map -k ${key-type} -v ${value-type}
```

| Flag                | Required | Description                                                                                                                          |
| ------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `--field-name`      | Yes      | Field name                                                                                                                           |
| `--type`            | Yes      | Field type: `string`, `int`, `long`, `float`, `double`, `boolean`, `date`, `timestamp`, `decimal`, `binary`, `list`, `map`, `struct` |
| `-p`, `--precision` | No       | Precision for `decimal` type                                                                                                         |
| `-s`, `--scale`     | No       | Scale for `decimal` type                                                                                                             |
| `-k`, `--keyType`   | No       | Key type (required when `type` is `map`)                                                                                             |
| `-v`, `--valueType` | No       | Value type (required when `type` is `map`)                                                                                           |

#### schema drop-field

```bash
dataos-ctl lakehouse table schema drop-field \
  -i ${lakehouse}:${namespace}:${table} --field-name ${field-name}
```

#### schema rename-field

```bash
dataos-ctl lakehouse table schema rename-field \
  -i ${lakehouse}:${namespace}:${table} \
  --field-name ${field-name} --new-name ${new-name}
```

#### schema update-field

Update the type of an existing field. Only type promotions are allowed, for example `int` to `long` or `float` to `double`.

```bash
dataos-ctl lakehouse table schema update-field \
  -i ${lakehouse}:${namespace}:${table} --field-name ${field-name} --type ${new-type}
```

#### schema set-nullable

```bash
# Make field nullable (default)
dataos-ctl lakehouse table schema set-nullable \
  -i ${lakehouse}:${namespace}:${table} --field-name ${field-name}

# Make field non-nullable
dataos-ctl lakehouse table schema set-nullable \
  -i ${lakehouse}:${namespace}:${table} --field-name ${field-name} --nullable=false
```

***

### Partitions

#### partition get

Get partition spec and statistics for a table.

```bash
dataos-ctl lakehouse table partition get -i ${lakehouse}:${namespace}:${table}
dataos-ctl lakehouse table partition get -i ${lakehouse}:${namespace}:${table} -d
```

#### partition update

Replace the partition spec from a YAML manifest. Requires `-f`.

```bash
dataos-ctl lakehouse table partition update \
  -i ${lakehouse}:${namespace}:${table} \
  -f ${manifest-file-path}
```

**Example:**

```bash
dataos-ctl lakehouse table partition update \
  -i my-lakehouse:sales:orders \
  -f partitions/update-partition-bucket.yaml
```

See [Partition Update Manifest](#partition-update-manifest) for the manifest format.

***

### Snapshots

#### snapshot list

```bash
# List all snapshots
dataos-ctl lakehouse table snapshot list -i ${lakehouse}:${namespace}:${table}

# List snapshots on a specific branch
dataos-ctl lakehouse table snapshot list -i ${lakehouse}:${namespace}:${table} --branch ${branch-name}

# Detailed output
dataos-ctl lakehouse table snapshot list -i ${lakehouse}:${namespace}:${table} -d
```

#### snapshot set

Set the current snapshot to a specific snapshot ID.

```bash
dataos-ctl lakehouse table snapshot set \
  -i ${lakehouse}:${namespace}:${table} --sid ${snapshot-id}
```

#### snapshot rollback

Roll back the table to a previous snapshot.

```bash
dataos-ctl lakehouse table snapshot rollback \
  -i ${lakehouse}:${namespace}:${table} --sid ${snapshot-id}
```

#### snapshot cherrypick

Cherry-pick the changes from a specific snapshot onto the current table state.

```bash
dataos-ctl lakehouse table snapshot cherrypick \
  -i ${lakehouse}:${namespace}:${table} --sid ${snapshot-id}
```
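
A rollback is easier to verify when the snapshot history is inspected before and after. The following is a minimal sketch of that sequence. The function name `rollback_table` and the `DATAOS_CTL` variable are illustrative assumptions, not part of the CLI.

```bash
# Path to the CLI binary; overridable for dry runs (illustrative convention).
DATAOS_CTL="${DATAOS_CTL:-dataos-ctl}"

# Show the snapshot history, roll back to the chosen snapshot, then show the history again.
# Each step runs only if the previous one succeeded.
rollback_table() {
  local id="$1" sid="$2"
  "$DATAOS_CTL" lakehouse table snapshot list -i "$id" &&
    "$DATAOS_CTL" lakehouse table snapshot rollback -i "$id" --sid "$sid" &&
    "$DATAOS_CTL" lakehouse table snapshot list -i "$id"
}
```

Call it as `rollback_table my-lakehouse:sales:orders ${snapshot-id}`, taking the snapshot ID from a previous `snapshot list`.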

***

### Branches

#### branch list

```bash
dataos-ctl lakehouse table branch list -i ${lakehouse}:${namespace}:${table}
dataos-ctl lakehouse table branch list -i ${lakehouse}:${namespace}:${table} -d
```

#### branch create

```bash
# Create from the latest snapshot
dataos-ctl lakehouse table branch create \
  -i ${lakehouse}:${namespace}:${table} --branch ${branch-name}

# Create from a specific snapshot
dataos-ctl lakehouse table branch create \
  -i ${lakehouse}:${namespace}:${table} --branch ${branch-name} --sid ${snapshot-id}
```

#### branch delete

```bash
dataos-ctl lakehouse table branch delete \
  -i ${lakehouse}:${namespace}:${table} --branch ${branch-name}
```

#### branch rename

```bash
dataos-ctl lakehouse table branch rename \
  -i ${lakehouse}:${namespace}:${table} --branch ${branch-name} --new-name ${new-name}
```

#### branch replace

Replace a branch with another branch or a specific snapshot.

```bash
# Replace with another branch
dataos-ctl lakehouse table branch replace \
  -i ${lakehouse}:${namespace}:${table} --branch ${target-branch} --source ${source-branch}

# Replace with a specific snapshot
dataos-ctl lakehouse table branch replace \
  -i ${lakehouse}:${namespace}:${table} --branch ${branch-name} --sid ${snapshot-id}
```

#### branch fastforward

Fast-forward a target branch to match a source branch.

```bash
dataos-ctl lakehouse table branch fastforward \
  -i ${lakehouse}:${namespace}:${table} --source ${source-branch} --target ${target-branch}
```
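
Taken together, `branch create` and `branch fastforward` support a simple stage-then-publish workflow. The sketch below assumes the table's default branch is named `main` (Iceberg's default) and that writes to the working branch happen outside this CLI. The function names and the `DATAOS_CTL` variable are illustrative.

```bash
# Path to the CLI binary; overridable for dry runs (illustrative convention).
DATAOS_CTL="${DATAOS_CTL:-dataos-ctl}"

# Create a working branch from the table's latest snapshot.
start_branch() {
  local id="$1" branch="$2"
  "$DATAOS_CTL" lakehouse table branch create -i "$id" --branch "$branch"
}

# Fast-forward the target branch (main if not given) to the working branch.
publish_branch() {
  local id="$1" branch="$2" target="${3:-main}"
  "$DATAOS_CTL" lakehouse table branch fastforward -i "$id" \
    --source "$branch" --target "$target"
}
```

For example, run `start_branch my-lakehouse:sales:orders staging`, write and validate data on `staging`, then run `publish_branch my-lakehouse:sales:orders staging`.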

***

### Metadata

#### metadata get

List metadata versions for a table.

```bash
dataos-ctl lakehouse table metadata get -i ${lakehouse}:${namespace}:${table}
dataos-ctl lakehouse table metadata get -i ${lakehouse}:${namespace}:${table} -d
```

#### metadata set

Roll back or forward to a specific metadata version.

```bash
dataos-ctl lakehouse table metadata set \
  -i ${lakehouse}:${namespace}:${table} --version ${metadata-version}
```

**Example:**

```bash
dataos-ctl lakehouse table metadata set \
  -i my-lakehouse:sales:orders --version v3.metadata.json
```

***

### Properties

#### properties get

```bash
dataos-ctl lakehouse table properties get -i ${lakehouse}:${namespace}:${table}
dataos-ctl lakehouse table properties get -i ${lakehouse}:${namespace}:${table} -d
```

#### properties add

Add or update key-value properties. Use `--kv` (repeatable).

```bash
dataos-ctl lakehouse table properties add \
  -i ${lakehouse}:${namespace}:${table} \
  --kv ${key}=${value} \
  --kv ${key2}=${value2}
```

**Example:**

```bash
dataos-ctl lakehouse table properties add \
  -i my-lakehouse:sales:orders \
  --kv write.format.default=parquet \
  --kv write.parquet.compression-codec=zstd
```

#### properties remove

Remove properties by key. Use `--key` (repeatable).

```bash
dataos-ctl lakehouse table properties remove \
  -i ${lakehouse}:${namespace}:${table} \
  --key ${key} \
  --key ${key2}
```

**Example:**

```bash
dataos-ctl lakehouse table properties remove \
  -i my-lakehouse:sales:orders \
  --key write.format.default \
  --key write.parquet.compression-codec
```

***

## YAML manifest reference

Three commands accept YAML manifests via the `-f` flag: `table create`, `partition update`, and `operation apply`.

### Table create manifest

The root object contains `schema` (required) and `iceberg` (optional).

**Schema types**:

* `mapping`: define fields inline with `name` and `type`.
* `avro`: provide a raw Avro JSON schema string.

**Partition spec types**: `identity`, `year`, `month`, `day`, `hour`, `bucket`.

**Mapping schema**:

```yaml
schema:
  type: mapping
  fields:
    - name: order_id
      type: long
    - name: order_date
      type: {"type": "long", "logicalType": "timestamp-micros"}
    - name: region
      type: string
    - name: customer_id
      type: long

iceberg:                 # optional
  specs:                 # partition specs
    - type: day
      column: order_date
      name: order_date_day       # optional; auto-generated if omitted
    - type: identity
      column: region
    - type: bucket
      column: customer_id
      name: customer_id_bucket
      numBuckets: 8              # required when type is bucket
  properties:            # optional Iceberg table properties
    write.format.default: parquet
    commit.retry.num-retries: "4"
```

**Avro schema**:

```yaml
schema:
  type: avro
  avro: |
    {
      "type": "record",
      "name": "my_table",
      "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"}
      ]
    }
```

### Partition update manifest

The root is a YAML array of partition specs. An empty array `[]` clears all partitions.

```yaml
- type: day
  column: order_date
  name: order_date_day

- type: identity
  column: region

- type: bucket
  column: customer_id
  name: customer_id_bucket
  numBuckets: 8
```

### Operation apply manifest

The root object contains `namespace`, `table`, and `operation` (all required) and `options` (optional). A single file can hold several operations as multi-document YAML, with documents separated by `---`.

```yaml
namespace: ${namespace}
table: ${table-name}
operation: ${operation-type}
options:
  # key-value options specific to the operation
```
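
As an illustration of the multi-document form, the sketch below writes two operations for the same table into one file. The function name `write_maintenance_manifest` is illustrative, and the choice of `COMPACT` followed by `REWRITE_MANIFESTS` is just an example pairing. Whether the operations run in file order is not stated here, so do not rely on ordering.

```bash
# Write two maintenance operations for the same table into one multi-document manifest.
write_maintenance_manifest() {
  local namespace="$1" table="$2" manifest="$3"
  cat > "$manifest" <<EOF
namespace: $namespace
table: $table
operation: COMPACT
options:
  rewrite_all: true
---
namespace: $namespace
table: $table
operation: REWRITE_MANIFESTS
options:
  use_caching: true
EOF
}
```

After `write_maintenance_manifest dataos city_nov_24 ./maintenance.yaml`, submit both operations with `dataos-ctl lakehouse operation apply -n ${lakehouse-name} -f ./maintenance.yaml`.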

**Supported operations**:

| Operation              | Description                    | Common Options                                                               |
| ---------------------- | ------------------------------ | ---------------------------------------------------------------------------- |
| `COMPACT`              | Compact small data files       | `delete_file_threshold`, `rewrite_all`                                       |
| `COMPACT_DELETE_FILES` | Compact delete files           | `delete_file_threshold`, `compression_factor`, `shuffle_partitions_per_file` |
| `EXPIRE_SNAPSHOTS`     | Remove old snapshots           | `older_than` (ISO 8601), `retain_last`, `max_concurrent_deletes`             |
| `REWRITE_MANIFESTS`    | Optimize manifest files        | `use_caching`                                                                |
| `DROP_TABLE`           | Drop table from catalog        | `purge`, `cleanup_store`                                                     |
| `REMOVE_ORPHAN_FILES`  | Clean unreferenced files       | `older_than` (ISO 8601)                                                      |

