# Data Connection

Before DataOS can read or transform your data, it needs a registered, authenticated connection to the source system. This page covers how to check whether one already exists and how to create one if it doesn't.

A data connection is two Resources working together:

* **Secret** - Stores credentials (username, password, API keys, certificates) in an encrypted vault. Credentials never sit in manifest files or code. See [Secret](/concepts/resources/secret.md) in Concepts.
* **Depot** - Connects DataOS to the external system and references the Secret for authentication. See [Depot](/concepts/resources/depot.md) in Concepts.

{% hint style="info" %}
**Data connection vs. engine**

A **data connection** (Secret + Depot) answers: *Where does the source data live, and how do we authenticate to it?* It makes an external system reachable inside DataOS without moving data.

An **engine** (Snowflake, Databricks, BigQuery, Postgres, Trino, Spark) answers: *Where do transformations run, and where do product outputs land?* You configure an engine in your project settings file after the data connection is in place.

The same system can play both roles. A Snowflake account, for example, can be a Depot (read source tables) and the engine (run transforms, write outputs). Each role is configured separately.
{% endhint %}

## Before you begin

You'll need:

* **DataOS CLI installed and initialized.** If you haven't done that yet, run [CLI setup](/build/readme/cli-setup.md) first.
* **Permission to create Secrets and Depots.** Confirm with your DataOS administrator, or check yourself:

  ```bash
  dataos-ctl user get
  ```

{% hint style="warning" %}
Without the right role, applying a Secret or Depot fails with a permission error. Role names vary by tenant. Ask your administrator before retrying.
{% endhint %}

{% stepper %}
{% step %}

## Check whether a connection already exists

Many teams share Depots, so a connection to your source may already be live. Don't create a duplicate.

### Check from the CLI

List every Depot in your tenant:

```bash
dataos-ctl resource get -t depot -a
```

Example output:

```bash
INFO[0000] 🔍 resource get...
INFO[0000] 🔍 resource get...complete

      NAME           | VERSION |  TYPE  | STATUS | RUNTIME |       OWNER
---------------------+---------+--------+--------+---------+-------------------
 snowflake-sales     | v2alpha | depot  | active |         | johndoetmdcio
 s3-raw-events       | v2alpha | depot  | active |         | johndoetmdcio
 postgres-crm        | v2alpha | depot  | active |         | johndoetmdcio
```
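If the listing is long, you can filter it instead of reading every row. The sketch below is not part of the DataOS CLI: `active_depots` is a hypothetical helper that assumes the pipe-separated table layout shown above and prints the names of `active` rows whose name contains a keyword.

```shell
# Hypothetical helper: read a `resource get` listing on stdin and print the
# NAME of every row whose STATUS is "active" and whose NAME contains $1.
# Assumes the column order shown above: NAME | VERSION | TYPE | STATUS | ...
active_depots() {
  awk -F'|' -v pat="$1" '
    NF >= 4 {
      name = $1; status = $4
      gsub(/^[ \t]+|[ \t]+$/, "", name)     # trim padding around the name
      gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim padding around the status
      if (status == "active" && (pat == "" || index(name, pat) > 0)) print name
    }'
}

# Usage sketch (not run here); 2>&1 keeps the table even if the CLI logs to stderr:
#   dataos-ctl resource get -t depot -a 2>&1 | active_depots snowflake
```

Log lines and the separator row contain fewer than four `|`-separated fields, so the filter skips them.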

If a Depot for your source is already `active`, note its name and use it directly via UDL:

```
dataos://[depot-name]/[source-path]
```

Skip the rest of this page and go straight to [Choose your engine](/build/choose-your-engine/overview.md).

### Check from the UI

Open the Datasets section in the left navigation and look for your source in the source tree. If your source appears with datasets listed, a Depot is already active.

For a walkthrough, see [Explore metadata](/build/stage-1-discover/inspect-metadata/explore-metadata.md).

{% hint style="info" %}
Not sure if a Depot exists or what it's called? Ask your DataOS administrator or the team that owns your tenant's data connections.
{% endhint %}

If no Depot covers your source, continue.
{% endstep %}

{% step %}

## Create a Secret

A Secret holds the credentials DataOS uses to authenticate to your source.

Gather the credentials before you start. The fields you need depend on the source. For source-specific examples, see [Secrets data sources](/concepts/resources/secret/data-sources.md) in Concepts.

### 1. Write the Secret manifest

Create a YAML file. The example uses a username and password:

```yaml
name: ${{secret-name}}
version: v2alpha
type: secret
tags:
  - ${{tag-1}}
description: ${{description of what this secret is for}}
owner: ${{your-dataos-user-id}}
secret:
  type: key-value
  data:
    username: ${{source-username}}
    password: ${{source-password}}
```

For file-based credentials (BigQuery JSON key, Snowflake RSA private key), use `secret.files` instead of `secret.data`:

```yaml
secret:
  type: key-value
  files:
    json_keyfile: /path/to/keyfile.json
```

{% hint style="warning" %}
Don't commit Secret manifests to version control. The `secret.data` block is plaintext. After the Secret is applied, delete the local file or move it outside your repo.
{% endhint %}
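One way to keep plaintext credentials out of your repo is to render the manifest into a private temporary file only when you apply it. The sketch below is an illustration, not a DataOS feature: `yaml_quote` and `render_secret_manifest` are hypothetical helpers, the manifest mirrors the username/password template above without `tags`, and `SRC_USER` and `SRC_PASSWORD` are assumed environment variables.

```shell
# Hypothetical helper: single-quote a value for YAML.
# Embedded single quotes are doubled, as YAML single-quoted scalars require.
yaml_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Hypothetical helper: print a Secret manifest to stdout.
# Arguments: secret name, DataOS user ID (owner), source username, source password.
render_secret_manifest() {
  cat <<EOF
name: $1
version: v2alpha
type: secret
description: credentials rendered at apply time
owner: $2
secret:
  type: key-value
  data:
    username: $(yaml_quote "$3")
    password: $(yaml_quote "$4")
EOF
}

# Usage sketch (not run here): write to a private temp file, apply, then delete.
#   umask 077
#   manifest=$(mktemp)
#   trap 'rm -f "$manifest"' EXIT
#   render_secret_manifest snowflake-cred johndoetmdcio "$SRC_USER" "$SRC_PASSWORD" > "$manifest"
#   dataos-ctl resource apply -f "$manifest"
```

Whether the CLI requires a `.yaml` file extension is not confirmed here. If it does, rename the temporary file before applying.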

### 2. Apply the Secret

```bash
dataos-ctl resource apply -f ${{path-to-secret.yaml}}
```

Example output:

```bash
INFO[0000] 🛠 apply...
INFO[0000] 🔧 applying snowflake-cred:v2alpha:secret...
INFO[0004] 🔧 applying snowflake-cred:v2alpha:secret...created
INFO[0004] 🛠 apply...complete
```

### 3. Confirm the Secret was created

```bash
dataos-ctl resource get -t secret
```

Example output:

```bash
INFO[0000] 🔍 get...
INFO[0000] 🔍 get...complete

      NAME        | VERSION |  TYPE  | STATUS | RUNTIME |      OWNER
------------------+---------+--------+--------+---------+-------------------
 snowflake-cred   | v2alpha | secret | active |         | johndoetmdcio
```

Note the Secret's name. The Depot manifest references it next.
{% endstep %}

{% step %}

## Create a Depot

A Depot registers the source inside DataOS and points at the Secret. Once active, DataOS gives it a **Uniform Data Link (UDL)**:

```
dataos://[depot-name]/[source-path]
```

Every tool in DataOS uses this address, so no credentials are needed at the point of use.

### 1. Write the Depot manifest

This shape works for most relational and object-store sources. The `spec.type` and `spec.spec` blocks change per source:

```yaml
name: ${{depot-name}}
version: v2alpha
type: depot
tags:
  - ${{tag-1}}
description: ${{description of what this depot connects to}}
owner: ${{your-dataos-user-id}}
layer: user
tenant: ${{your-tenant-name}}
spec:
  type: ${{SOURCE_TYPE}}          # e.g. SNOWFLAKE, S3, POSTGRES, BIGQUERY
  spec:
    # source-specific connection fields go here
  secrets:
    - id: "${{tenant-name}}:${{secret-name}}"
      purpose: rw
    - id: "${{tenant-name}}:${{secret-name}}"
      purpose: scan
    - id: "${{tenant-name}}:${{secret-name}}"
      purpose: query
```

The `secrets` block ties the Depot to the Secret you just created. Each entry declares a purpose:

| Purpose | What it covers                                         |
| ------- | ------------------------------------------------------ |
| `rw`    | Read and write operations (ingestion, transformations) |
| `scan`  | Metadata scanning (catalog, lineage)                   |
| `query` | Interactive querying via Workbench or BI tools         |

You can reuse the same Secret for all three purposes, or split them if your source needs different credentials per operation.
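If you split credentials by purpose, the `secrets` block lists a different Secret for each entry. As a sketch, the hypothetical helper below prints that block for pasting into a Depot manifest. The helper name and the Secret names in the usage line are illustrative assumptions.

```shell
# Hypothetical helper: print a Depot `secrets` block, indented to sit under `spec:`.
# Arguments: tenant name, then the Secret names for rw, scan, and query.
render_depot_secrets() {
  tenant=$1
  printf '  secrets:\n'
  printf '    - id: "%s:%s"\n      purpose: rw\n' "$tenant" "$2"
  printf '    - id: "%s:%s"\n      purpose: scan\n' "$tenant" "$3"
  printf '    - id: "%s:%s"\n      purpose: query\n' "$tenant" "$4"
}

# Usage sketch with made-up Secret names:
#   render_depot_secrets engineering snowflake-writer snowflake-scanner snowflake-reader
```

Pass the same Secret name three times to reproduce the shared-credential layout from the template.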

**Snowflake example:**

```yaml
name: snowflake-sales
version: v2alpha
type: depot
tags:
  - snowflake
  - sales-data
description: Snowflake depot for the sales schema
owner: johndoetmdcio
layer: user
tenant: engineering
spec:
  type: snowflake
  spec:
    url: ${{account-url}}          # e.g. https://abc123.snowflakecomputing.com
    database: ${{database-name}}   # e.g. SALES_DB
    warehouse: ${{warehouse-name}} # e.g. COMPUTE_WH
    role: ${{role-name}}           # optional; defaults to PUBLIC
  secrets:
    - id: "engineering:snowflake-cred"
      purpose: rw
    - id: "engineering:snowflake-cred"
      purpose: scan
    - id: "engineering:snowflake-cred"
      purpose: query
```

For manifests for other sources (S3, BigQuery, PostgreSQL, Kafka, and more), see [supported Depots](/concepts/resources/depot/supported-sources.md) in Concepts.

### 2. Apply the Depot manifest

```bash
dataos-ctl resource apply -f ${{path-to-depot.yaml}}
```

Example output:

```bash
INFO[0000] 🛠 apply...
INFO[0000] 🔧 applying snowflake-sales:v2alpha:depot...
INFO[0004] 🔧 applying snowflake-sales:v2alpha:depot...created
INFO[0004] 🛠 apply...complete
```

{% endstep %}

{% step %}

## Verify the connection

Confirm the Depot is created and active. You already confirmed the Secret when you created it.

### Check the Depot

List only your Depots:

```bash
dataos-ctl resource get -t depot
```

List every Depot in the tenant:

```bash
dataos-ctl resource get -t depot -a
```

Example output:

```bash
INFO[0000] 🔍 resource get...
INFO[0000] 🔍 resource get...complete

      NAME         | VERSION |  TYPE  | STATUS | RUNTIME |       OWNER
-------------------+---------+--------+--------+---------+-------------------
 snowflake-sales   | v2alpha | depot  | active |         | johndoetmdcio
```

A `STATUS` of `active` means the Depot Service has registered it.
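If you script these steps, you may want to wait for the status instead of re-running the command by hand. The sketch below assumes the table layout shown above. `resource_status` and `wait_until_active` are hypothetical helpers, and `LIST_CMD` and `POLL_SECONDS` are assumed override variables, not CLI settings.

```shell
# Hypothetical helper: print the STATUS column for the named resource,
# reading a `resource get` listing on stdin.
resource_status() {
  awk -F'|' -v name="$1" '
    NF >= 4 {
      n = $1; s = $4
      gsub(/^[ \t]+|[ \t]+$/, "", n)   # trim padding around the name
      gsub(/^[ \t]+|[ \t]+$/, "", s)   # trim padding around the status
      if (n == name) { print s; exit }
    }'
}

# Hypothetical helper: poll until the named Depot is "active".
# Arguments: depot name, number of attempts (default 10).
# Returns 0 once active, 1 if the attempts run out.
wait_until_active() {
  name=$1
  tries=${2:-10}
  while :; do
    # LIST_CMD is split into words on purpose so it can hold a full command line.
    status=$(${LIST_CMD:-dataos-ctl resource get -t depot} 2>&1 | resource_status "$name")
    [ "$status" = "active" ] && return 0
    tries=$((tries - 1))
    [ "$tries" -le 0 ] && return 1
    sleep "${POLL_SECONDS:-5}"
  done
}

# Usage sketch (not run here):
#   wait_until_active snowflake-sales 12 || echo "Depot is not active yet"
```

If the loop gives up, inspect the Depot details as described under Troubleshooting.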

### Access data via UDL

Once active, reference your data through the UDL. The Depot absorbs top-level connection details (account, database), and the path that follows mirrors the source's native hierarchy.

For the Snowflake example above (configured for `SALES_DB`), the `orders` table in the `public` schema is:

```
dataos://snowflake-sales/public/orders
```

Use this address in ingestion jobs, transformation pipelines, and the Workbench query tool.
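If you build these addresses in scripts, small helpers keep the format in one place. The functions below are a hypothetical sketch based only on the `dataos://[depot-name]/[source-path]` pattern shown on this page. They do not validate that the Depot or path exists.

```shell
# Hypothetical helpers for working with UDL strings.

# Build a UDL from a Depot name and a source path (one leading "/" is dropped).
make_udl() {
  printf 'dataos://%s/%s\n' "$1" "${2#/}"
}

# Print the Depot name from a UDL.
udl_depot() {
  rest=${1#dataos://}
  printf '%s\n' "${rest%%/*}"
}

# Print the source path from a UDL (everything after the Depot name).
udl_path() {
  rest=${1#dataos://}
  printf '%s\n' "${rest#*/}"
}
```

For example, `make_udl snowflake-sales public/orders` prints `dataos://snowflake-sales/public/orders`, and `udl_depot` on that address prints `snowflake-sales`.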
{% endstep %}
{% endstepper %}

## Troubleshooting

<details>

<summary>Permission error when applying a Secret or Depot</summary>

```
ERROR: forbidden: user does not have required role data-dev
```

Your user lacks the role named in the error (`data-dev` in this example). Ask your DataOS administrator to assign it, for example `roles:id:data-dev`, then retry.

</details>

<details>

<summary>Depot status is not <code>active</code> after applying</summary>

The Depot probably failed to validate the connection. Check the details:

```bash
dataos-ctl resource get -t depot -n ${{depot-name}} --details
```

Common causes:

* Wrong credentials in the Secret. Update the manifest and re-apply.
* Incorrect host, port, or database name in `spec.spec`.
* Network or firewall restrictions between DataOS and the source. Talk to your infrastructure team.

</details>

<details>

<summary>Secret deletion blocked by a dependent Depot</summary>

```
ERROR: cannot delete resource, it is a dependency of 'depot:v2alpha:snowflake-sales'
```

Delete the Depot first, then the Secret:

```bash
dataos-ctl resource delete -t depot -n ${{depot-name}}
dataos-ctl resource delete -t secret -n ${{secret-name}}
```

</details>

