> For the complete documentation index, see [llms.txt](https://v2.dataos.info/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://v2.dataos.info/concepts/resources/nilus/batch/batch-sources/ms-sql-server.md).

# MS SQL Server

[Microsoft SQL Server](https://learn.microsoft.com/sql/sql-server/) is a relational database engine. Nilus reads from SQL Server as a **batch source** through the ODBC-backed SQL path. The connector supports both standard SQL Authentication (username + password) and Azure AD token-based authentication.

For row-level change capture from SQL Server, use the [MS SQL Server (CDC)](/concepts/resources/nilus/cdc/cdc-sources/ms-sql-server.md) instead.

## Requirements

Connectivity and credentials must both be in place before the pipeline can run.

### Connectivity

* The Nilus runtime must reach the SQL Server endpoint on the configured port (default `1433`).
* The runtime image must include a working ODBC driver. Nilus expects `ODBC Driver 18 for SQL Server` by default; older drivers can be selected via the `driver` URI parameter.
* For Azure SQL Database / Managed Instance, allowlist the runtime egress IP on the server-level firewall and confirm the database is online (paused serverless tiers reject connections).

### Required parameters

| Parameter                | Required    | Default                         | Description                                                                                                                                         |
| ------------------------ | ----------- | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
| `host`                   | Yes         | -                               | SQL Server host name.                                                                                                                               |
| `port`                   | No          | `1433`                          | SQL Server port.                                                                                                                                    |
| `database`               | Yes         | -                               | Source database name.                                                                                                                               |
| `username`               | Conditional | -                               | SQL login for password authentication. Omit when using Azure AD token authentication.                                                               |
| `password`               | Conditional | -                               | Password for password authentication. When using Azure AD, this carries the access token (the driver receives it via `MSSQL_COPT_SS_ACCESS_TOKEN`). |
| `driver`                 | No          | `ODBC Driver 18 for SQL Server` | ODBC driver name. Must match a driver actually installed in the runtime image.                                                                      |
| `Encrypt`                | No          | driver default                  | `yes` to require an encrypted connection (default for Driver 18).                                                                                   |
| `TrustServerCertificate` | No          | driver default                  | `yes` to skip TLS certificate validation. Required when the server uses a self-signed certificate (common in dev environments).                     |
| `Authentication`         | Conditional | -                               | Set to `ActiveDirectoryAccessToken` for Azure AD token-based authentication.                                                                        |
| `connect_timeout`        | No          | `30`                            | Driver connection timeout in seconds.                                                                                                               |

### Database-side permissions

```sql
GRANT CONNECT ON DATABASE :: [<database_name>] TO [<username>];
GRANT SELECT ON SCHEMA :: [<schema_name>] TO [<username>];
-- or, for table-scoped access:
GRANT SELECT ON [<schema_name>].[<table_name>] TO [<username>];
```

### URI format

Password authentication:

```
mssql://<username>:<password>@<host>:<port>/<database>?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes
```

Azure AD token authentication:

```
mssql://:<access-token>@<host>:<port>/<database>?driver=ODBC+Driver+18+for+SQL+Server&Authentication=ActiveDirectoryAccessToken
```

Depot-backed equivalent (recommended for production):

```
dataos://<mssql-depot>
```

## Source options

| Option                 | Required | Description                                                                                                                                                                                             |
| ---------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `source_table`         | Yes      | Fully qualified table name in `<schema>.<table>` form (typically `dbo.<table>` in default SQL Server installations).                                                                                    |
| `incremental_key`      | No       | Timestamp or numeric column used to identify newly visible rows for each run. SQL Server's `LastModifiedDate`, `rowversion` (`SYSDATETIME()` style triggers), or sequence-backed `id` are common picks. |
| `interval_start`       | No       | Optional ISO-8601 lower bound for the extraction window.                                                                                                                                                |
| `interval_end`         | No       | Optional ISO-8601 upper bound for the extraction window.                                                                                                                                                |
| `page_size`            | No       | Rows per extraction batch (default `50000`).                                                                                                                                                            |
| `sql_reflection_level` | No       | `full` (default) or reduced, controls how thoroughly Nilus reflects the source schema before extraction.                                                                                                |
| `sql_limit`            | No       | Caps total rows extracted per run. Use for sampling and validation.                                                                                                                                     |
| `sql_exclude_columns`  | No       | Comma-separated column names to skip during extraction.                                                                                                                                                 |
| `type_hints`           | No       | Object map of `column_name: <type>` to override inferred types. Supported types: `text`, `bigint`, `bool`, `timestamp`, `date`, `decimal`, `double`, `binary`, `json`, `time`.                          |
| `max_table_nesting`    | No       | String. `"0"` for fully flattened analytics-friendly output (default for SQL sources).                                                                                                                  |

> **Note** There is **no** separate `schema` option. The schema is inferred from the `<schema>.<table>` form of `source_table`.

## Sample Nilus configs

Each example below is self-contained and uses the current Nilus pipeline shape.

### Batch, incremental ingestion to Lakehouse

```yaml
name: nilus-mssql-batch
version: v1alpha
type: nilus
description: SQL Server → DataOS Lakehouse incremental snapshot
spec:
  type: batch
  compute: runnable-default
  source:
    address: dataos://mssql-depot
    options:
      source_table: dbo.orders
      incremental_key: LastModifiedDate
  sink:
    address: dataos://analytics-lakehouse
    options:
      dest_table: sales.orders_raw
      incremental_strategy: merge
      aws_region: us-west-2
```

### Batch, Azure AD token authentication

```yaml
spec:
  type: batch
  source:
    address: mssql://:{AAD_ACCESS_TOKEN}@sql.example.com:1433/sales?driver=ODBC+Driver+18+for+SQL+Server&Authentication=ActiveDirectoryAccessToken
    options:
      source_table: dbo.orders
      incremental_key: LastModifiedDate
  sink:
    address: dataos://analytics-lakehouse
    options:
      dest_table: sales.orders_raw
      incremental_strategy: merge
```

## Behavior and capabilities

* **Compute model**: Nilus connects through ODBC + the `pyodbc` driver and submits parameterized SQL reads.
* **Object model**: standard SQL Server tables and views addressed as `<schema>.<table>`. The default schema in most installations is `dbo`.
* **Pipeline mode**: `batch` only on this page; for row-level change capture see the [MS SQL Server (CDC)](/concepts/resources/nilus/cdc/cdc-sources/ms-sql-server.md).
* **Authentication modes**: SQL Authentication (default) and Azure AD token-based authentication. For AAD, the source uses ODBC's `MSSQL_COPT_SS_ACCESS_TOKEN` connection attribute and skips the `UID` / `PWD` / `AUTHENTICATION` keywords from the credentials block.
* **TLS posture**: Driver 18 enforces encrypted connections by default. Set `TrustServerCertificate=yes` only for development scenarios; in production, install the server certificate's issuing CA into the runtime's trust store and leave the parameter unset (or `no`).
* **Custom queries**: supply `source_table: "query:SELECT ... FROM ..."` to extract from a hand-authored SQL query instead of a single table.

## Troubleshooting

| Symptom                                                                            | Likely cause                                                                                                        | Resolution                                                                                                                  |
| ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- |
| `IM002 Data source name not found and no default driver specified`                 | The ODBC driver named in the URI is not installed in the runtime.                                                   | Install `ODBC Driver 18 for SQL Server` in the runtime image, or set `driver=` to a driver that is installed.               |
| `SSL Provider: error code 0xfff…` / certificate-trust failure                      | Driver 18 enforces TLS but cannot validate the server certificate.                                                  | Add `TrustServerCertificate=yes` to the URI for dev, or install the issuing CA into the runtime trust store for production. |
| `Login failed for user`                                                            | Wrong credentials, expired AAD token, or the user lacks `CONNECT` on the database.                                  | Re-check credentials; for AAD, refresh the token; grant `CONNECT` if missing.                                               |
| `Cannot find the object … because it does not exist or you do not have permission` | `source_table` is missing the schema prefix and the user's default schema is not `dbo`.                             | Use the explicit `<schema>.<table>` form (typically `dbo.<table>`).                                                         |
| AAD token authentication unexpectedly silently fails                               | `Authentication=ActiveDirectoryAccessToken` was not set, so the driver tried `UID`/`PWD` against an empty username. | Add `Authentication=ActiveDirectoryAccessToken` to the URI query and pass the token as the URI password.                    |
| Run extracts the full table every time despite `incremental_key`                   | The column is not monotonically increasing or the pipeline state was reset between runs.                            | Pick a column that is genuinely monotonic; `rowversion` is a strong choice on SQL Server.                                   |
| Connection drops on long extracts                                                  | `connect_timeout` and the driver's command timeout differ; long queries may exceed the latter.                      | Lower `page_size` so each batch query is shorter, or have your DBA raise the relevant server-side timeouts.                 |

## Related docs

* [MS SQL Server (CDC)](/concepts/resources/nilus/cdc/cdc-sources/ms-sql-server.md): companion CDC connector for row-level change capture.
* [MS SQL Server](/concepts/resources/nilus/destinations/databases/ms-sql-server.md): companion destination connector.
* [Optimize Sink Datasets](/concepts/resources/nilus/pipeline-optimization/optimize-sink-datasets.md): guidance on `incremental_strategy`, `partition_by`, and other shape settings.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://v2.dataos.info/concepts/resources/nilus/batch/batch-sources/ms-sql-server.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
