# Engine capabilities

Your engine choice affects data access, transformation behavior, SQL dialect, and operational ownership. Choose carefully: the wrong engine adds friction, while the right one makes the rest of the work simpler.

The engine sets:

* **Data flow** - Where source data is read and where outputs land.
* **Syntax** - Which SQL dialect or runtime (Python, Spark) you write in.
* **Workload fit** - SQL-first, federated, or Spark-heavy.
* **Materialization** - Which strategies and table formats are available: full refresh, incremental, Iceberg.
* **Security** - The permissions, roles, and credentials needed at runtime.
* **Operational ownership** - Who watches cost, performance, and failures.

## Engines at a glance

### Snowflake

Use this for governed, SQL-first analytics on data that already lives in Snowflake.

* **Skip it when:** You need open lakehouse storage or Spark-heavy engineering.
* **Ops effort:** Medium.
* **Verify first:** Warehouse, database, role, and permissions.

See the [Snowflake engine technical manual](/concepts/resources/vulcan/technical-manuals/snowflake-engine.md) for configuration, limits, and troubleshooting.

***

### Databricks

Use this when your workflows already run on Delta Lake, Unity Catalog, and Spark.

* **Skip it when:** The workload is simple warehouse SQL.
* **Ops effort:** Medium to high.
* **Verify first:** Workspace, HTTP path, catalog, and token.

See the [Databricks engine technical manual](/concepts/resources/vulcan/technical-manuals/databricks-engine.md) for configuration, limits, and troubleshooting.

***

### Postgres

Use this for smaller products, prototypes, or local-first workflows.

* **Skip it when:** You need large-scale analytical processing.
* **Ops effort:** Low to medium.
* **Verify first:** Host, database, user, and permissions.

See the [Postgres engine technical manual](/concepts/resources/vulcan/technical-manuals/postgres-engine.md) for configuration, limits, and troubleshooting.

***

### Trino

Use this when you need federated SQL across multiple catalogs and systems.

* **Skip it when:** You need a storage layer. Trino queries data where it lives and does not store it.
* **Ops effort:** Medium to high.
* **Verify first:** Catalogs, connectors, source permissions, and cluster.

See the [Trino engine technical manual](/concepts/resources/vulcan/technical-manuals/trino-engine.md) for configuration, limits, and troubleshooting.

***

### Spark

Use this for Iceberg, object storage, PySpark, or large distributed processing.

* **Skip it when:** The workload is small and mostly SQL.
* **Ops effort:** High.
* **Verify first:** Spark compute, Lakehouse resource, Depot, and sizing.

See the [Spark engine technical manual](/concepts/resources/vulcan/technical-manuals/spark-engine.md) for configuration, limits, and troubleshooting.

***
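The "Verify first" lines above can be turned into a small pre-flight check. The following is a minimal sketch and not part of any platform tooling: the dictionary keys are plain labels copied from this page, and `VERIFY_FIRST` and `missing_prerequisites` are hypothetical names chosen for illustration.

```python
# Prerequisites copied from the "Verify first" line of each engine above.
# The keys are illustrative labels, not real configuration fields.
VERIFY_FIRST = {
    "snowflake": ["warehouse", "database", "role", "permissions"],
    "databricks": ["workspace", "http_path", "catalog", "token"],
    "postgres": ["host", "database", "user", "permissions"],
    "trino": ["catalogs", "connectors", "source_permissions", "cluster"],
    "spark": ["spark_compute", "lakehouse_resource", "depot", "sizing"],
}


def missing_prerequisites(engine: str, confirmed: set[str]) -> list[str]:
    """Return the "Verify first" items for `engine` that are not yet confirmed."""
    required = VERIFY_FIRST[engine.lower()]
    return [item for item in required if item not in confirmed]


if __name__ == "__main__":
    # Example: workspace and token are confirmed, the rest still need checking.
    print(missing_prerequisites("Databricks", {"workspace", "token"}))
```

An empty list means every item on that engine's "Verify first" line has been confirmed.
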

## Before you start

Run through this list once before you choose an engine:

1. **Data location:** Confirm where the source data lives. Reuse the existing platform to avoid moving data unnecessarily.
2. **Permissions:** Make sure you can read sources, create schemas and tables, write output, and run queries.
3. **Materialization:** Confirm the engine supports the strategy you need: full refresh; incremental by time, key, or partition; or CDC.
4. **Operational ownership:** Decide who watches cost, performance, and failures. Spark and Trino need more active tuning than Snowflake or Postgres.
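To compare the ops effort ratings on this page side by side, one option is to map them to numbers. This sketch assumes a scale of low = 1, medium = 2, and high = 3, which is an illustrative choice and not something the platform defines. `OPS_EFFORT` and `rank_by_ops_effort` are hypothetical names.

```python
# Ops effort ratings from this page as (lowest, highest) on an assumed scale:
# 1 = low, 2 = medium, 3 = high. The scale itself is illustrative only.
OPS_EFFORT = {
    "snowflake": (2, 2),   # Medium
    "databricks": (2, 3),  # Medium to high
    "postgres": (1, 2),    # Low to medium
    "trino": (2, 3),       # Medium to high
    "spark": (3, 3),       # High
}


def rank_by_ops_effort(engines: dict[str, tuple[int, int]]) -> list[str]:
    """Order engines from least to most effort, using the midpoint of each range."""
    # sorted() is stable, so engines with equal midpoints keep their listed order.
    return sorted(engines, key=lambda name: sum(engines[name]) / 2)


if __name__ == "__main__":
    print(rank_by_ops_effort(OPS_EFFORT))
```

The ordering is only as good as the assumed scale; treat it as a conversation starter when deciding who owns cost, performance, and failures.
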

