# Simple Storage Service (Amazon S3)

## Pre-requisites specific to DataOS

To create a Depot, you need the Data Admin role. Contact the DataOS Operator or Tenant Admin to get it.

Run this command to see your assigned roles.

```bash
dataos-ctl user get
#Expected Output: 
time="2026-03-30T17:49:18+05:30" level=info msg="😃 user get..."
time="2026-03-30T17:49:19+05:30" level=info msg="😃 user get...complete"

     NAME     │        ID         │  TYPE  │        EMAIL         │                  TAGS
──────────────┼───────────────────┼────────┼──────────────────────┼────────────────────────────────────────
 I Am Groot   │ iamgroottmdcio    │ person │ iam.groot@tmdc.io    │ roles:id:ct-onboarding-data-developer,
              │                   │        │                      │ roles:id:data-dev,
              │                   │        │                      │ roles:id:system-dev,
              │                   │        │                      │ roles:id:user,
              │                   │        │                      │ users:id:iamgroottmdcio

```
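The role check can also be scripted. The following is a minimal sketch, assuming roles appear in the TAGS column as `roles:id:<role-id>`, as in the output above. The helper name `has_role` and the tag `roles:id:data-admin` are hypothetical placeholders; confirm the actual tag for the Data Admin role with your DataOS Operator.

```bash
# has_role is a hypothetical helper, not part of dataos-ctl.
# It reads `dataos-ctl user get` output on stdin and succeeds if the
# given role tag appears in it.
has_role() {
  local role_tag="$1"
  # -F: match a fixed string; -q: no output; -w: require word boundaries,
  # so roles:id:data-dev does not match roles:id:data-developer.
  grep -qwF -- "$role_tag"
}

# Usage (the tag is an assumed placeholder, not a confirmed role ID):
#   dataos-ctl user get | has_role "roles:id:data-admin" && echo "role present"
```

The function only inspects text, so it can be tried against a saved copy of the command output before wiring it into a script.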

## Pre-requisites specific to the S3 Depot

To create an S3 Depot, you must have the following details:

* **AWS access key ID**: The Access Key ID used to authenticate and authorize API requests to your AWS account. This can be obtained from the AWS IAM (Identity and Access Management) Console under your user’s security credentials or requested from your AWS administrator.
* **AWS bucket name**: The name of the Amazon S3 bucket where the data resides. You can find this in the AWS S3 Console under the list of buckets or request it from the administrator managing the storage.
* **Secret access key**: The Secret Access Key associated with your AWS Access Key ID, required for secure API requests. It is available in the AWS IAM Console under your user’s security credentials. Store it securely and share it only with authorized personnel.
* **Scheme**: The scheme specifies the protocol to be used for the connection, such as `s3` or `https`. This information depends on your system’s configuration and can be confirmed with the team managing the connection setup.
* **Relative Path**: The path within the S3 bucket that points to the specific data or folder you want to access. This path is typically structured according to how your data is organized and can be obtained from the team managing the data or the AWS S3 Console.
* **Data format (`format`)**: The table format used to store the data in the bucket. Common values are `iceberg` and `delta`, depending on how the data is organized.
* **Region**: The AWS region where the S3 bucket is hosted (for example, `us-east-1` or `us-gov-east-1`). If the region is not specified, DataOS uses the default AWS region.
* **Endpoint**: The custom or standard S3 endpoint used to access the storage. Required when working with non-default AWS regions or S3-compatible services. Example: `s3.us-gov-east-1.amazonaws.com`
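Before writing the manifest, it can help to confirm that every required detail from the list above is at hand. The sketch below assumes you keep the values in shell variables. `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` follow the AWS CLI naming convention; the `S3_*` names and the helper `check_s3_prereqs` are hypothetical and are not DataOS settings. Region and endpoint are left out of the check because the list treats them as optional.

```bash
# check_s3_prereqs is a hypothetical helper. It reports which of the
# required values are unset or empty and returns non-zero if any are.
check_s3_prereqs() {
  local var val missing=""
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
             S3_BUCKET S3_SCHEME S3_RELATIVE_PATH S3_FORMAT; do
    # Indirect lookup of the variable named in $var (names come from the
    # fixed list above, so eval is safe here).
    eval "val=\${$var:-}"
    [ -n "$val" ] || missing="$missing $var"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
}
```

Calling `check_s3_prereqs` with everything set returns success silently; otherwise it lists the missing names on standard error.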

## Create an S3 Depot

Amazon S3 is an object storage system: a distributed store for large volumes of unstructured data.

A Depot of type `s3` reads data from Amazon S3 buckets. The Depot exposes the configured bucket and path for scan, query, and read/write operations. To create an S3 Depot, follow these steps:

### **Step 1: Create a Secret for securing S3 credentials**

Create a Secret Resource using the [Simple Storage Service (Amazon S3)](/concepts/resources/secret/data-sources/simple-storage-service-amazon-s3.md) guide.

### **Step 2: Create an S3 Depot manifest file**

Create a manifest file for your S3 Depot.

{% tabs %}
{% tab title="Manifest file" %}

```yaml
name: ${{s3-name}}
version: v2alpha
type: depot
tags:
  - S3
  - depot
  - storage
description: "Amazon S3 cloud storage depot"
spec:
  type: s3
  spec:
    bucket: ${{s3-bucket}}
    relativePath: ${{s3-relative-path}}
    format: ${{s3-format}}
    scheme: s3a
  secrets:
    - id: "${{tenant-id}}:${{aws-secret-name}}"
      purpose: scan
```

{% endtab %}

{% tab title="Example" %}

```yaml
name: ${{s3-name}}
version: v2alpha
type: depot
tags:
  - S3
  - depot
  - storage
description: "Amazon S3 cloud storage depot"
spec:
  type: s3
  spec:
    bucket: ${{s3-bucket}}
    relativePath: ${{s3-relative-path}}
    format: ${{s3-format}}
    scheme: s3a
  secrets:
    - id: "${{tenant-id}}:${{aws-secret-name}}"
      purpose: scan
```

{% endtab %}
{% endtabs %}
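To see what the manifest looks like once the `${{...}}` placeholders are filled in, the template can be rendered from the shell. This is only a sketch: the helper `render_s3_depot_manifest` is hypothetical, and every value in the usage comment (Depot name, bucket, path, tenant, and Secret name) is made up for illustration.

```bash
# render_s3_depot_manifest is a hypothetical helper that prints the
# manifest template above with its placeholders substituted.
# Args: depot name, bucket, relative path, format, tenant ID, secret name
render_s3_depot_manifest() {
  cat <<EOF
name: $1
version: v2alpha
type: depot
tags:
  - S3
  - depot
  - storage
description: "Amazon S3 cloud storage depot"
spec:
  type: s3
  spec:
    bucket: $2
    relativePath: $3
    format: $4
    scheme: s3a
  secrets:
    - id: "$5:$6"
      purpose: scan
EOF
}

# Usage with illustrative values only:
#   render_s3_depot_manifest salesdepot example-bucket raw/sales iceberg \
#     mytenant s3-secret > s3-depot.yaml
```

Rendering to a file keeps the credentials out of the manifest: only the Secret reference (`tenant-id:secret-name`) is written, never the access keys themselves.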

### **Step 3: Apply the Depot manifest file**

Apply the manifest with the DataOS CLI:

```bash
dataos-ctl resource apply -f ${{manifest-file-path}}
```

## Verify the Depot creation

Verify the Depot in two ways:

* List Depots where you are the owner:

  ```bash
  dataos-ctl resource get -t depot
  ```
* List all Depots in the current Tenant:

  ```bash
  dataos-ctl resource get -t depot -a
  ```
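The apply and verify steps can be chained so that the listing runs only when the apply succeeds. The sketch below uses only the two commands shown in this guide; the wrapper name `apply_and_list_depots` is hypothetical, and the file-existence check is an added convenience rather than DataOS behavior.

```bash
# apply_and_list_depots is a hypothetical wrapper around the commands
# from Step 3 and the verification section.
apply_and_list_depots() {
  local manifest="$1"
  # Fail early with a clear message if the manifest path is wrong.
  if [ ! -f "$manifest" ]; then
    echo "manifest not found: $manifest" >&2
    return 1
  fi
  # List the Depots you own only if the apply command exits successfully.
  dataos-ctl resource apply -f "$manifest" && dataos-ctl resource get -t depot
}
```

For example, `apply_and_list_depots ./s3-depot.yaml` (an illustrative path) applies the manifest and then prints your Depots, where the new one should appear.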

## Delete a Depot

{% hint style="warning" %}
Best practice: Delete Resources that are no longer in use to save time and reduce costs.
{% endhint %}

To delete a Depot, use the DataOS CLI:

{% tabs %}
{% tab title="Command 1" %}

```bash
dataos-ctl resource delete -t depot -n ${{name}}
```

{% endtab %}

{% tab title="Command 2" %}

```bash
dataos-ctl resource delete -i "${{resource-name}}|v2alpha|depot"
```

{% endtab %}

{% tab title="Command 3" %}

```bash
dataos-ctl resource delete -f ${{manifest-file-path}}
```

{% endtab %}
{% endtabs %}

Specify the Resource type and Depot name in the `delete` command.

**Example:**

{% tabs %}
{% tab title="Command 1" %}

```bash
dataos-ctl resource delete -t depot -n testdepot
#output
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 delete..."
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 deleting testdepot:v2alpha:depot..."
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 deleting testdepot:v2alpha:depot...deleted"
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 delete...complete"
```

{% endtab %}

{% tab title="Command 2" %}

```bash
dataos-ctl resource delete -i "testdepot|v2alpha|depot"
#output
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 delete..."
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 deleting testdepot:v2alpha:depot..."
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 deleting testdepot:v2alpha:depot...deleted"
time="2026-03-25T15:55:37+05:30" level=info msg="🗑 delete...complete"
```

{% endtab %}

{% tab title="Command 3" %}

```bash
dataos-ctl resource delete -f /path/to/depot.yaml
#output
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 delete..."
time="2026-03-25T15:53:55+05:30" level=info msg="🗑 deleting testdepot:v2alpha:depot..."
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 deleting testdepot:v2alpha:depot...deleted"
time="2026-03-25T15:53:56+05:30" level=info msg="🗑 delete...complete"
```

{% endtab %}
{% endtabs %}

## Limit the data source's file format

A Depot can also limit the file types it reads and writes. Set the `format` field in the `spec` section to the file format you want to allow.

```yaml
spec:
  type: s3
  description: ${{description}}
  spec:
    scheme: ${{scheme}}  # for example, s3a
    bucket: ${{bucket-name}}
    relativePath: "raw"
    format: ${{format}}  # the file format, such as delta, iceberg, or parquet
```

For file-based systems, when the format is `iceberg`, set the metastore catalog type to either Hadoop or Hive:

```yaml
spec:
  type: s3
  description: "S3 Iceberg Depot for sanity"
  spec:
    bucket: ${{bucket-name}}
    relativePath: ${{relative-path}}
    format: iceberg
    icebergCatalogType: Hive
```

If you do not set the catalog type to Hive, DataOS uses Hadoop as the default catalog for the Iceberg format. The Hive catalog keeps the pointer to the latest metadata version up to date automatically. With the Hadoop catalog, you must run the set metadata command manually, as described in the Set Metadata guide.
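The catalog choice can be captured in a small generator that adds `icebergCatalogType` only when Hive is requested and otherwise omits the field, relying on the Hadoop default described above. This is a sketch under that assumption; the helper name `iceberg_depot_spec` and the sample values in the usage comment are hypothetical.

```bash
# iceberg_depot_spec is a hypothetical helper that prints the `spec`
# fragment for an Iceberg-format S3 Depot.
# Args: bucket, relative path, optional catalog ("Hive" to opt in)
iceberg_depot_spec() {
  local bucket="$1" rel_path="$2" catalog="${3:-}"
  cat <<EOF
spec:
  type: s3
  spec:
    bucket: $bucket
    relativePath: $rel_path
    format: iceberg
EOF
  # Emit icebergCatalogType only for Hive; leaving the field out keeps
  # the documented default (Hadoop).
  if [ "$catalog" = "Hive" ]; then
    printf '    icebergCatalogType: Hive\n'
  fi
}

# Usage with illustrative values:
#   iceberg_depot_spec example-bucket raw Hive   # Hive catalog
#   iceberg_depot_spec example-bucket raw        # default (Hadoop) catalog
```

Omitting the field for the default case avoids guessing how the Hadoop value is spelled in the manifest, since this page only shows the `Hive` literal.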

