# GCP-backed

The GCP-backed DataOS Lakehouse destination writes [Apache Iceberg](https://iceberg.apache.org/docs/latest/)-backed datasets into a Lakehouse Depot whose storage layer is Google Cloud Storage (GCS).

Nilus uses the same Lakehouse connector for every cloud: the Lakehouse Depot, not a separate connector type, selects the storage backend. AWS-backed, Azure-backed, and GCP-backed Lakehouse Depots therefore share the same pipeline shape, sink address pattern, and table behavior. This page covers the GCP / GCS variant; the [AWS-backed DataOS Lakehouse](/concepts/resources/nilus/destinations/dataos-lakehouse/aws-backed.md) and [Azure-backed DataOS Lakehouse](/concepts/resources/nilus/destinations/dataos-lakehouse/azure-backed.md) variants are documented separately.

## Requirements

Connectivity and credentials must both be in place before the pipeline can run.

### Connectivity

* The Nilus runtime must reach the configured GCS endpoint and the Iceberg REST catalog identified by `METASTORE_URL`.
* The Lakehouse Depot must be defined in DataOS, with `storageType: gcs` and a `spec.gcs` block, and must be reachable from the runtime cluster.

### Lakehouse Depot shape

For a GCS-backed Lakehouse, the Depot resource carries:

```yaml
spec:
  storageType: gcs
  gcs:
    bucket: my-iceberg-bucket
    relativePath: warehouse/         # optional sub-prefix
```

The runtime resolves the bucket URL as `gs://<bucket>/<relativePath>`; for the values above, that is `gs://my-iceberg-bucket/warehouse/`.

Storage credentials are projected from the Depot's secret. A service-account JSON key file is the only supported credential shape:

| Secret key     | Required | Notes                                                                                                  |
| -------------- | -------- | ------------------------------------------------------------------------------------------------------ |
| `gcp_json_key` | Yes      | Service-account JSON key file. The runtime projects it to a file path and points the GCS client at it. |

The service account must hold the `roles/storage.objectAdmin` role on the configured bucket / prefix, or an equivalent custom role granting `storage.objects.create`, `storage.objects.get`, `storage.objects.delete`, and `storage.objects.list`.
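
If a custom role is preferred over `roles/storage.objectAdmin`, the permission set can be captured in a role definition file. The following is a minimal sketch in the YAML format accepted by `gcloud iam roles create --file`; the title and description are placeholders, and `storage.objects.get` is included on the assumption that the runtime must also read objects, as the Permissions section requires.

```yaml
# Hypothetical custom role definition; title and description are placeholders.
title: Nilus Lakehouse Object Access
description: Object-level access for the Lakehouse Depot service account.
stage: GA
includedPermissions:
  - storage.objects.create   # write data and metadata files
  - storage.objects.get      # assumption: needed to read objects back
  - storage.objects.delete   # remove replaced or expired files
  - storage.objects.list     # enumerate objects under the prefix
```

The role would then be bound to the Depot's service account on the target bucket; the exact binding procedure depends on how IAM is managed in your project.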

### Permissions

* The Depot's service account must be able to read, write, and delete objects under the configured bucket / prefix.
* The runtime must be authorized to register tables in the Iceberg REST catalog. The Lakehouse Depot secret must include an `apikey` (a DataOS API token). Nilus reads it as `LAKEHOUSE_APIKEY` and passes it to the Iceberg REST catalog as the connection token. If the secret has no `apikey`, the pipeline fails at startup with `'apikey' is required in Lakehouse depot secret.`
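
Taken together, the requirements above mean the Depot secret carries two keys. The sketch below is only an illustrative key/value view with placeholder values; it is not the DataOS secret resource schema, which this page does not define.

```yaml
# Illustrative view of the keys expected in the Lakehouse Depot secret.
# Not a DataOS resource manifest; values are placeholders.
gcp_json_key: "<service-account JSON key file>"   # storage credential for GCS
apikey: "<DataOS API token>"                      # read by Nilus as LAKEHOUSE_APIKEY
```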

## Sink address

Reference the Lakehouse Depot by name; the address format is the same for every cloud backend:

```
dataos://<lakehouse-depot>
```

Authoring a `lakehouse://` URI directly in a manifest is not supported. Always go through the Depot.

## Sink options

| Option                 | Required | Description                                                                                                                                                         |
| ---------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `dest_table`           | Yes      | Target Iceberg table in `<schema>.<table>` form. Exactly two dot-separated parts.                                                                                   |
| `incremental_strategy` | Yes      | One of `replace`, `append`, `merge`.                                                                                                                                |
| `partition_by`         | No       | Iceberg partition spec (list of column / transform pairs). See [Optimize Sink Datasets](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets.md). |

> GCP-backed Lakehouses do not take cloud-specific sink options. Bucket name, prefix, and credentials all come from the Depot.
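
As a sketch of how these options combine, the fragment below shows a `sink` block using the `replace` strategy. The Depot name is reused from the samples on this page and the table name is hypothetical; `partition_by` is omitted because its exact shape is described on the linked optimization page.

```yaml
sink:
  address: dataos://gcp-lakehouse-depot   # Depot name, never a bucket or lakehouse:// URI
  options:
    # Exactly two dot-separated parts: <schema>.<table>.
    # A bare `orders_full` or a three-part name would be rejected.
    dest_table: sales.orders_full         # hypothetical table name
    incremental_strategy: replace         # one of replace, append, merge
```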

## Sample Nilus configs

Each example below is self-contained and uses the current Nilus pipeline shape.

### Batch ingestion

```yaml
name: nilus-gcp-lakehouse-batch
version: v1alpha
type: nilus
spec:
  type: batch
  compute: universe-compute
  source:
    address: dataos://postgres-source
    options:
      source_table: public.orders
  sink:
    address: dataos://gcp-lakehouse-depot
    options:
      dest_table: sales.orders_snapshot
      incremental_strategy: merge
```

### CDC ingestion

```yaml
name: nilus-gcp-lakehouse-cdc
version: v1alpha
type: nilus
spec:
  type: cdc
  compute: universe-compute
  source:
    address: dataos://mssql-cdc-depot
    cdc:
      table.include.list: "dbo.orders"
      topic.prefix: "orders_cdc"
  sink:
    address: dataos://gcp-lakehouse-depot
    options:
      dest_table: sales.orders_cdc
      incremental_strategy: append
```

## Behavior and capabilities

* **Table format**: Apache Iceberg. Nilus writes Parquet data files to GCS and registers / updates the resulting tables in the Iceberg REST catalog.
* **Object model**: Iceberg tables addressed as `<schema>.<table>`.
* **Supported pipeline modes**: `batch` and `cdc`.
* **Supported incremental strategies**: `replace`, `append`, `merge`.
* **Storage layer**: Google Cloud Storage (GCS).

## Troubleshooting

| Symptom                                                                    | Likely cause                                                                           | Resolution                                                                                                         |
| -------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| `METASTORE_URL environment variable is required for Lakehouse destination` | The Depot did not project `METASTORE_URL` into the runtime.                            | Re-check the Lakehouse Depot configuration; verify the Iceberg REST catalog endpoint is set on the Depot resource. |
| `Unsupported scheme: <X>. Expected 'lakehouse://'`                         | The sink address is a non-`dataos://` URI that does not resolve to a Lakehouse Depot.  | Always reference the Lakehouse via `dataos://<lakehouse-depot>`.                                                   |
| `Table name must be in the format <schema>.<table>`                        | `dest_table` is missing the schema prefix or has more than two parts.                  | Use exactly `<schema>.<table>`.                                                                                    |
| GCS write fails with `403` / `forbidden`                                   | The Depot's service account lacks object create / delete permission on the bucket.     | Re-check the Depot secret and the IAM bindings on the GCS bucket.                                                  |
| `Could not find credentials file`                                          | The Depot did not project the `gcp_json_key` secret into the runtime correctly.        | Re-check the Depot secret reference and that the projected key contains valid service-account JSON.                |
| Iceberg commit fails with `409 Conflict`                                   | Concurrent writers from another pipeline are competing on the same target table.       | Serialize writes per Iceberg table, or use distinct destination tables per pipeline.                               |
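
For the `409 Conflict` case, one way to apply the "distinct destination tables per pipeline" resolution is sketched below as two `sink` fragments from two separate pipelines. The pipeline roles and the `sales.orders_backfill` table name are hypothetical.

```yaml
# Pipeline A (hypothetical): the only writer to sales.orders_cdc
sink:
  address: dataos://gcp-lakehouse-depot
  options:
    dest_table: sales.orders_cdc
    incremental_strategy: append
---
# Pipeline B (hypothetical): writes to its own table instead of sharing one
sink:
  address: dataos://gcp-lakehouse-depot
  options:
    dest_table: sales.orders_backfill
    incremental_strategy: append
```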

## Related docs

* [AWS-backed DataOS Lakehouse](/concepts/resources/nilus/destinations/dataos-lakehouse/aws-backed.md)
* [Azure-backed DataOS Lakehouse](/concepts/resources/nilus/destinations/dataos-lakehouse/azure-backed.md)
* [Understanding Batch Pipeline Config](/concepts/resources/nilus/batch/pipeline-config.md)
* [Understanding CDC Pipeline Config](/concepts/resources/nilus/cdc/service-config.md)
* [Optimize Sink Datasets](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets.md): guidance on `incremental_strategy` and dataset-shape tuning for Iceberg targets.

