# Lakehouse

A Lakehouse Depot connects DataOS to an Iceberg-formatted data lakehouse stored on cloud object storage. It places a unified Depot interface over the underlying storage backend, which can be Azure Blob File System Secure (ABFSS), Google Cloud Storage (GCS), or Amazon S3, and exposes the tables for scan, query, and read/write operations through a REST catalog and a metastore.

## Pre-requisites specific to DataOS

To create a Depot, you need the Data Admin role. Contact the DataOS Operator or Tenant Admin to get it.

{% hint style="info" %}
**Access through this Depot replaces direct Lakehouse access**

When you create a Lakehouse Depot, the Secrets you embed in it — containing API tokens of authorized [users](/concepts/foundations/access-control-landscape/runasuser-permissions.md) — become the access boundary for anyone querying through this Depot. Users who hold `Can Use Depot` on this Depot and `Can Use Secret` on the linked Secret can read or write to the Lakehouse.
{% endhint %}

Run this command to see your assigned roles.

```bash
dataos-ctl user get
# Expected output:
time="2026-03-30T17:49:18+05:30" level=info msg="😃 user get..."
time="2026-03-30T17:49:19+05:30" level=info msg="😃 user get...complete"

     NAME     │        ID         │  TYPE  │        EMAIL         │                  TAGS
──────────────┼───────────────────┼────────┼──────────────────────┼────────────────────────────────────────
 I Am Groot   │ iamgroottmdcio    │ person │ iam.groot@tmdc.io    │ roles:id:ct-onboarding-data-developer,
              │                   │        │                      │ roles:id:user,
              │                   │        │                      │ users:id:iamgroottmdcio

```
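If the table is hard to scan, the role tags can be pulled out of it. The sketch below is a minimal helper, not part of the DataOS CLI: `list_role_tags` is a hypothetical function name, and it assumes the tags appear in the `roles:id:<role>` form shown in the output above and that the table is written to standard output.

```bash
# Print each role tag (roles:id:<role>) found on standard input, one per line.
# Hypothetical helper; it only pattern-matches the text it is given.
list_role_tags() {
  grep -o 'roles:id:[A-Za-z0-9_-]*' | sort -u
}

# Usage (assumes the table goes to standard output):
#   dataos-ctl user get | list_role_tags
```

For the sample output above, this would list `roles:id:ct-onboarding-data-developer` and `roles:id:user`.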

## Pre-requisites specific to the source system

The credentials you need depend on the storage backend you choose for your Lakehouse Depot.

<details>

<summary>ABFSS</summary>

* **Storage Account Name (`account`)**: The name of the Azure Storage account. Available in the Azure portal under your storage account settings, or from your Azure administrator.
* **Storage Account Key (`az_account_key`)**: The access key that authenticates requests to the Azure Storage account. Retrieve it from the Azure portal under the "Access keys" section of your storage account.
* **Container (`container`)**: The name of the Azure Blob Storage container that holds the Iceberg data.
* **Endpoint Suffix (`endpointSuffix`)**: The DNS suffix for the Azure storage endpoint, for example `dfs.core.windows.net`. Confirm with your Azure administrator.
* **Relative Path (`relativePath`)**: The path within the container that points to the Iceberg data root.
* **Data format (`format`)**: The table format — set to `iceberg` for a Lakehouse Depot.
* **Metastore URL (`metastoreUrl`)**: The URL of the REST catalog or Hive Metastore that manages Iceberg table metadata.
* **Metastore Relative Path (`metastoreRelativePath`)**: The relative path within the metastore used to resolve table locations.

</details>

<details>

<summary>GCS</summary>

* **GCS Bucket (`bucket`)**: The name of the Google Cloud Storage bucket that holds the Iceberg data. Find it in the Google Cloud Console under Storage.
* **GCP Service Account JSON Key (`gcp_json_key`)**: The JSON key file for a GCP service account with read/write access to the bucket. Generate it in IAM & Admin > Service Accounts > Add Key > JSON.
* **Relative Path (`relativePath`)**: The path within the bucket that points to the Iceberg data root.
* **Data format (`format`)**: The table format — set to `iceberg` for a Lakehouse Depot.
* **Metastore URL (`metastoreUrl`)**: The URL of the REST catalog or Hive Metastore that manages Iceberg table metadata.
* **Metastore Relative Path (`metastoreRelativePath`)**: The relative path within the metastore used to resolve table locations.

</details>

<details>

<summary>S3</summary>

* **S3 Bucket (`bucket`)**: The name of the Amazon S3 bucket that holds the Iceberg data. Find it in the AWS S3 Console.
* **AWS Access Key ID (`aws_access_key`)**: The access key used to authenticate AWS API requests. Obtain it from the AWS IAM Console under your user's security credentials.
* **AWS Secret Access Key (`aws_secret_key`)**: The secret key paired with the Access Key ID. Available in the AWS IAM Console under your user's security credentials.
* **Relative Path (`relativePath`)**: The path within the S3 bucket that points to the Iceberg data root.
* **Data format (`format`)**: The table format — set to `iceberg` for a Lakehouse Depot.
* **Metastore URL (`metastoreUrl`)**: The URL of the REST catalog or Hive Metastore that manages Iceberg table metadata.
* **Metastore Relative Path (`metastoreRelativePath`)**: The relative path within the metastore used to resolve table locations.

</details>

## Create a Lakehouse Depot

A Lakehouse Depot stores Iceberg tables on cloud object storage and exposes them through a REST catalog. Each Depot maps to a single storage backend (ABFSS, GCS, or S3). Follow the steps for your chosen backend.

### **Step 1: Create a Secret for securing Lakehouse credentials**

Create a Secret Resource for your storage backend before creating the Depot:

* **ABFSS** — follow the [Azure Blob File System Secure (ABFSS)](/concepts/resources/secret/data-sources/azure-blob-file-system-secure-abfss.md) guide.
* **GCS** — follow the [Google Cloud Storage (GCS)](/concepts/resources/secret/data-sources/google-cloud-storage-gcs.md) guide.
* **S3** — follow the [Simple Storage Service (Amazon S3)](/concepts/resources/secret/data-sources/simple-storage-service-amazon-s3.md) guide.

### **Step 2: Create a Lakehouse Depot manifest file**

Create a manifest file for your Lakehouse Depot. Use the template for the storage backend you chose.

{% tabs %}
{% tab title="ABFSS" %}

```yaml
version: v2alpha
name: ${{lakehouse-abfss-depot-name}}
type: depot
description: "Default Iceberg Data Depot backed by Azure ABFSS"
depot:
  type: lakehouse
  spec:
    storageType: abfss
    catalogType: REST
    metastoreUrl: ${{lakehouse-abfss-metastore-url}}
    metastoreRelativePath: ${{lakehouse-abfss-metastore-relative-path}}
    abfss:
      account: ${{azure-storage-account-name}}
      container: ${{azure-container-name}}
      endpointSuffix: ${{azure-endpoint-suffix}}
      relativePath: ${{abfss-relative-path}}
      format: ${{data-format}}
  secrets:
    - id: "${{tenant-id}}:${{lakehouse-abfss-secret-name}}"
      purpose: rw
```

{% endtab %}

{% tab title="GCS" %}

```yaml
version: v2alpha
name: ${{lakehouse-gcs-depot-name}}
type: depot
description: "Default Iceberg Data Depot backed by GCS"
depot:
  type: lakehouse
  spec:
    storageType: gcs
    catalogType: REST
    metastoreUrl: ${{lakehouse-gcs-metastore-url}}
    metastoreRelativePath: ${{lakehouse-gcs-metastore-relative-path}}
    gcs:
      bucket: ${{gcs-bucket-name}}
      relativePath: ${{gcs-relative-path}}
      format: ${{data-format}}
  secrets:
    - id: "${{tenant-id}}:${{lakehouse-gcs-secret-name}}"
      purpose: rw
```

{% endtab %}

{% tab title="S3" %}

```yaml
version: v2alpha
name: ${{lakehouse-s3-depot-name}}
type: depot
description: "Default Iceberg Data Depot backed by Amazon S3"
depot:
  type: lakehouse
  spec:
    storageType: s3
    catalogType: REST
    metastoreUrl: ${{lakehouse-s3-metastore-url}}
    metastoreRelativePath: ${{lakehouse-s3-metastore-relative-path}}
    s3:
      bucket: ${{s3-bucket-name}}
      relativePath: ${{s3-relative-path}}
      format: ${{data-format}}
  secrets:
    - id: "${{tenant-id}}:${{lakehouse-s3-secret-name}}"
      purpose: rw
```

{% endtab %}
{% endtabs %}
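The `${{...}}` tokens in the templates above are placeholders to replace with your own values. One way to fill them from the command line is sketched below. `render_manifest` is a hypothetical helper, not a DataOS command, and it assumes single-line values; the file names and values in the usage comment are illustrative only.

```bash
# Fill ${{placeholder}} tokens in a Depot manifest template.
# Usage: render_manifest <template-file> key=value [key=value ...]
# Hypothetical helper; prints the rendered manifest to standard output.
render_manifest() {
  local template=$1 content pair key value escaped
  shift
  content=$(cat "$template")
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    # Escape characters that are special in a sed replacement (\, &, and the | delimiter).
    escaped=$(printf '%s' "$value" | sed 's/[\\&|]/\\&/g')
    # Replace every occurrence of ${{key}} with the value.
    content=$(printf '%s\n' "$content" | sed "s|[$]{{${key}}}|${escaped}|g")
  done
  printf '%s\n' "$content"
}

# Usage (file names and values are hypothetical):
#   render_manifest s3-template.yaml \
#     lakehouse-s3-depot-name=lakehouse-depot \
#     s3-bucket-name=example-bucket \
#     data-format=iceberg > depot.yaml
```

Placeholders that are not passed as `key=value` arguments are left untouched, so a partially rendered manifest is easy to spot.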

### **Step 3: Apply the Depot manifest file**

Apply the manifest with the DataOS CLI:

```bash
dataos-ctl resource apply -f ${{depot-manifest-file-path}}
```
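Before applying, it can help to confirm that no `${{...}}` placeholder from the template is left unfilled. The check below is a minimal sketch: `manifest_is_resolved` is a hypothetical helper, not a DataOS command, and it only looks for leftover placeholder tokens; it does not validate the manifest itself.

```bash
# Succeed only if the manifest has no unresolved ${{...}} placeholders.
# Offending lines are printed with their line numbers to standard error.
manifest_is_resolved() {
  if [ ! -r "$1" ]; then
    echo "cannot read manifest: $1" >&2
    return 2
  fi
  if grep -n '[$]{{[^}]*}}' "$1" >&2; then
    return 1
  fi
  return 0
}

# Usage (depot.yaml is a hypothetical file name):
#   manifest_is_resolved depot.yaml && dataos-ctl resource apply -f depot.yaml
```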

## Verify the Depot creation

Verify the Depot in two ways:

* List Depots where you are the owner:

  ```bash
  dataos-ctl resource get -t depot
  ```
* List all Depots in the current Tenant:

  ```bash
  dataos-ctl resource get -t depot -a
  ```
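To check for one specific Depot instead of reading the whole listing, the output can be filtered by name. This is a sketch under assumptions: `depot_is_listed` is a hypothetical helper, and it assumes the listing prints each Depot name as plain text on standard output. It is a text match only, so a different Resource whose name contains the same word would also match.

```bash
# Succeed if the given Depot name appears as a whole word on standard input.
# Hypothetical helper; it matches text and does not query DataOS itself.
depot_is_listed() {
  grep -qwF -- "$1"
}

# Usage (lakehouse-depot is an example name):
#   dataos-ctl resource get -t depot -a | depot_is_listed lakehouse-depot && echo "found"
```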

## Delete a Depot

{% hint style="warning" %}
Best practice: Delete Resources that are no longer in use.
{% endhint %}

To delete a Depot, use the DataOS CLI:

{% tabs %}
{% tab title="Command 1" %}

```bash
dataos-ctl resource delete -t depot -n ${{name}}
```

{% endtab %}

{% tab title="Command 2" %}

```bash
dataos-ctl resource delete -i "${{resource-name}}|v2alpha|depot"
```

{% endtab %}

{% tab title="Command 3" %}

```bash
dataos-ctl resource delete -f ${{manifest-file-path}}
```

{% endtab %}
{% endtabs %}

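Each command identifies the Depot differently: Command 1 takes the Resource type (`-t`) and the Depot name (`-n`), Command 2 takes the Resource identifier in the form `name|version|type` (`-i`), and Command 3 takes the path to the manifest file (`-f`).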

**Example:**

{% tabs %}
{% tab title="Command 1" %}

```bash
dataos-ctl resource delete -t depot -n lakehouse-depot
#output
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 delete..."
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 deleting lakehouse-depot:v2alpha:depot..."
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 deleting lakehouse-depot:v2alpha:depot...deleted"
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 delete...complete"
```

{% endtab %}

{% tab title="Command 2" %}

```bash
dataos-ctl resource delete -i "lakehouse-depot|v2alpha|depot"
#output
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 delete..."
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 deleting lakehouse-depot:v2alpha:depot..."
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 deleting lakehouse-depot:v2alpha:depot...deleted"
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 delete...complete"
```

{% endtab %}

{% tab title="Command 3" %}

```bash
dataos-ctl resource delete -f /path/to/depot.yaml
#output
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 delete..."
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 deleting lakehouse-depot:v2alpha:depot..."
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 deleting lakehouse-depot:v2alpha:depot...deleted"
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 delete...complete"
```

{% endtab %}
{% endtabs %}
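The identifier used by Command 2 follows the `name|version|type` pattern seen in the examples above. A small helper can build it so the quoting and separators stay consistent. `depot_identifier` is a hypothetical function, and the `v2alpha` default simply mirrors the version used in this page's manifests.

```bash
# Build the identifier passed to `dataos-ctl resource delete -i`.
# $1 = Depot name, $2 = manifest version (defaults to v2alpha, as used on this page).
depot_identifier() {
  printf '%s|%s|depot\n' "$1" "${2:-v2alpha}"
}

# Usage (lakehouse-depot is an example name):
#   dataos-ctl resource delete -i "$(depot_identifier lakehouse-depot)"
```

Keep the command substitution inside double quotes, because an unquoted `|` would be interpreted by the shell as a pipe.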

