# Databricks

Databricks is a platform for large-scale data engineering, machine learning, and collaborative analytics. Vulcan connects to Databricks through a gateway and manages transformations on Unity Catalog and Delta Lake.

***

### Engine adapter type

For Databricks, you configure a gateway with the following adapter type:

```yaml
type: databricks
```

The Databricks adapter supports Vulcan's local (built-in) scheduling.

***

### Before you start

Make sure you have:

* A Databricks workspace with SQL warehouse or cluster access
* A personal access token or service principal credentials
* The HTTP path to your SQL warehouse or cluster

> The HTTP path can be found in Databricks under **SQL Warehouses -> \[Your Warehouse] -> Connection Details**.
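
Because the same `httpPath` field accepts either a SQL warehouse or a cluster, it can help to check which one a given value points at before pasting it into the gateway. The sketch below is a hedged illustration: the two path patterns are assumptions based on the shapes Databricks commonly shows in its connection details, and `classify_http_path` is a hypothetical helper, not part of Vulcan.

```python
import re

# Assumed path shapes; confirm against the value shown in your own workspace.
WAREHOUSE_PATH = re.compile(r"^/?sql/1\.0/(warehouses|endpoints)/[0-9a-zA-Z]+$")
CLUSTER_PATH = re.compile(r"^/?sql/protocolv1/o/\d+/[0-9a-zA-Z-]+$")


def classify_http_path(http_path: str) -> str:
    """Return 'warehouse', 'cluster', or 'unknown' for a Databricks HTTP path."""
    path = http_path.strip()
    if WAREHOUSE_PATH.match(path):
        return "warehouse"
    if CLUSTER_PATH.match(path):
        return "cluster"
    return "unknown"


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    print(classify_http_path("/sql/1.0/warehouses/abc123def456"))
    print(classify_http_path("sql/protocolv1/o/1234567890/0123-456789-abcdefgh"))
```

An `unknown` result does not necessarily mean the path is wrong, only that it does not match the shapes assumed here.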

***

### Required permissions

Vulcan requires the following Databricks permissions:

| Permission                                                                | Required for                               |
| ------------------------------------------------------------------------- | ------------------------------------------ |
| `USE CATALOG` on the target catalog                                       | Use the target catalog                     |
| `USE SCHEMA` on the target schemas; `CREATE SCHEMA` on the target catalog | Use and create schemas                     |
| `CREATE TABLE` and `CREATE VIEW` on schemas                               | Create tables and views                    |
| `SELECT`, `INSERT`, `UPDATE`, `DELETE` on tables                          | Read, write, update, and delete table data |

***

### Required connection options

Use these fields when setting up a Databricks gateway:

| Option           | Description                                                                     |
| ---------------- | ------------------------------------------------------------------------------- |
| `type`           | Engine type name - must be `databricks`                                         |
| `serverHostname` | The Databricks workspace hostname, for example, `adb-xxxxx.azuredatabricks.net` |
| `httpPath`       | The HTTP path to the SQL warehouse or cluster                                   |
| `accessToken`    | Personal access token or service principal token for authentication             |
| `catalog`        | The Unity Catalog name to use as the default catalog                            |
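
All five options in the table are required, so a small pre-flight check can catch a missing field before Vulcan tries to connect. This is a minimal sketch that assumes the gateway has already been loaded into a plain dictionary; `missing_options` and `validate_gateway` are hypothetical helper names, not Vulcan APIs.

```python
from typing import List

# Option names taken from the table above, in the same order.
REQUIRED_OPTIONS = ("type", "serverHostname", "httpPath", "accessToken", "catalog")


def missing_options(gateway: dict) -> List[str]:
    """Return the required option names that are absent or empty."""
    return [key for key in REQUIRED_OPTIONS if not gateway.get(key)]


def validate_gateway(gateway: dict) -> None:
    """Raise ValueError if an option is missing or the type is not 'databricks'."""
    missing = missing_options(gateway)
    if missing:
        raise ValueError("Missing required options: " + ", ".join(missing))
    if gateway["type"] != "databricks":
        raise ValueError("type must be 'databricks'")
```

For example, `missing_options({"type": "databricks"})` reports the four remaining fields, which makes an incomplete gateway easy to spot in a CI step.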

***

### Authentication methods

The Databricks gateway supports the following authentication methods. Both pass the token through `accessToken`:

| Method                                 | How to configure                                 |
| -------------------------------------- | ------------------------------------------------ |
| Personal access token authentication   | Use `accessToken`                                |
| Service principal token authentication | Use `accessToken` with a service principal token |

***

### Example configuration

Add a Databricks gateway to your Vulcan project configuration.

```yaml
gateways:
  databricks:
    type: databricks
    serverHostname: <databricks-workspace-hostname>
    httpPath: <sql-warehouse-or-cluster-http-path>
    accessToken: "{{ env_var('DATABRICKS_TOKEN') }}"
    catalog: <catalog-name>
```

> Never commit your access token to version control. Use environment variables:
>
> ```yaml
> accessToken: "{{ env_var('DATABRICKS_TOKEN') }}"
> ```
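
To confirm that the hostname, HTTP path, and token work together before running Vulcan, a connection smoke test outside Vulcan can be useful. The sketch below assumes the `databricks-sql-connector` package is installed; this page does not say which driver Vulcan itself uses, so treat the script as an independent check. `connector_kwargs` and `smoke_test` are hypothetical helper names.

```python
import os


def connector_kwargs(server_hostname: str, http_path: str, env=os.environ) -> dict:
    """Build keyword arguments for databricks.sql.connect, reading the token from DATABRICKS_TOKEN."""
    token = env.get("DATABRICKS_TOKEN")
    if not token:
        raise RuntimeError("DATABRICKS_TOKEN is not set")
    return {
        "server_hostname": server_hostname,
        "http_path": http_path,
        "access_token": token,
    }


def smoke_test(server_hostname: str, http_path: str) -> int:
    """Run SELECT 1 against the warehouse or cluster and return the value."""
    # Imported lazily so the helper above works without the connector installed.
    from databricks import sql  # pip install databricks-sql-connector

    with sql.connect(**connector_kwargs(server_hostname, http_path)) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchone()[0]
```

Call `smoke_test` with the same `serverHostname` and `httpPath` values used in the gateway. The token is read from the environment, so it never appears in the script.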

***

### Materialization behavior

Vulcan applies the following materialization strategy on Databricks for each model kind:

| Model kind                  | Strategy                                  |
| --------------------------- | ----------------------------------------- |
| `INCREMENTAL_BY_TIME_RANGE` | INSERT OVERWRITE by time column partition |
| `INCREMENTAL_BY_UNIQUE_KEY` | MERGE ON unique key                       |
| `INCREMENTAL_BY_PARTITION`  | REPLACE WHERE by partitioning key         |
| `FULL`                      | INSERT OVERWRITE                          |
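
To make the strategies in the table more concrete, the sketch below renders the general shape of a Databricks SQL statement for each model kind. These templates are illustrative assumptions only: the exact SQL Vulcan emits is not documented on this page and may differ, and `sketch_statement` plus the `<start>`, `<end>`, and `<partition-values>` placeholders are hypothetical.

```python
# Illustrative Databricks SQL shapes only; not the statements Vulcan actually emits.
TEMPLATES = {
    "FULL": "INSERT OVERWRITE {table} SELECT * FROM {source}",
    "INCREMENTAL_BY_TIME_RANGE": (
        "INSERT OVERWRITE {table} PARTITION ({column}) "
        "SELECT * FROM {source} WHERE {column} BETWEEN '<start>' AND '<end>'"
    ),
    "INCREMENTAL_BY_UNIQUE_KEY": (
        "MERGE INTO {table} AS t USING {source} AS s ON t.{column} = s.{column} "
        "WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *"
    ),
    "INCREMENTAL_BY_PARTITION": (
        "INSERT INTO {table} REPLACE WHERE {column} IN (<partition-values>) "
        "SELECT * FROM {source}"
    ),
}


def sketch_statement(kind: str, table: str, source: str, column: str) -> str:
    """Fill the template for a model kind; `column` is the time, key, or partition column."""
    try:
        template = TEMPLATES[kind]
    except KeyError:
        raise ValueError(f"Unknown model kind: {kind}") from None
    return template.format(table=table, source=source, column=column)


if __name__ == "__main__":
    print(sketch_statement("INCREMENTAL_BY_UNIQUE_KEY", "sales.orders", "staged_orders", "order_id"))
```

The practical difference is scope: `FULL` replaces the whole table, the time-range and partition kinds replace only the affected slice, and the unique-key kind updates or inserts row by row.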

***

### Next steps

After configuring Databricks, continue with:

```
Connect to Engine -> Define models and logic -> Validate and test locally
```

