# Azure

Deploy a **DataOS Data Plane** onto an **existing Azure Kubernetes Service (AKS) cluster** in your Azure subscription. Your SRE team provisions and operates the Azure infrastructure using Modern-published Terraform modules. DataOS installs the platform components on top through a dataplane YAML applied with `dataos-ctl`. The Data Plane connects to the DataOS **Control Plane (Instance)** in **Modern Data Cloud** over an outbound **tunnel**.

***

### Deploy the Data Plane

Work through the steps in order. Each step prepares one input the final `dataos-ctl domain apply` needs: a CLI context pointed at your Instance, a base64 kubeconfig for the AKS cluster, the environment variables the YAML resolves, and the dataplane YAML itself.

{% stepper %}
{% step %}

#### Initialize the CLI context

Point `dataos-ctl` at your DataOS Instance. Run:

```bash
dataos-ctl init
```

The CLI walks you through a short interactive prompt. When it asks for the **Tenant Identifier**, enter the default tenant `system`, not your application tenant, because provisioning runs against `system`:

```bash
INFO[0000] The DataOS® is not initialized, do you want to proceed with initialization? (Y,n)
->Y

INFO[0005] Please enter a name for the current DataOS® Context?
-><instance-name>

INFO[0012] Please enter the Fully Qualified Domain Name of the DataOS® instance?
->your-org.dataos.app

INFO[0037] Please enter the Tenant Identifier to use for the DataOS® instance?
->system

INFO[0041] entered DataOS®: <instance-name> : your-org.dataos.app : system
INFO[0041] 🚀 initialization...complete
```

Then log in:

```bash
dataos-ctl login
```

{% hint style="info" %}
Enter the FQDN without a protocol prefix: use `your-org.dataos.app`, not `https://your-org.dataos.app`. The Instance domain comes from your Modern Data representative.
{% endhint %}
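
If the domain arrives as a full URL, a small shell helper can strip it down to the bare FQDN before you paste it into the prompt. This is a minimal sketch: `normalize_fqdn` is a hypothetical helper name, not part of `dataos-ctl`.

```bash
# normalize_fqdn URL_OR_FQDN: hypothetical helper, not a dataos-ctl command.
# Prints the bare host name without a scheme prefix or trailing path.
normalize_fqdn() {
  fqdn=${1#*://}      # drop everything up to and including "://", if present
  fqdn=${fqdn%%/*}    # drop the first "/" and everything after it
  printf '%s\n' "$fqdn"
}
```

For example, `normalize_fqdn https://your-org.dataos.app/` prints `your-org.dataos.app`.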
{% endstep %}

{% step %}

#### Base64-encode the kubeconfig

Obtain the **kubeconfig file** for your AKS cluster from the Cluster handover, confirm `kubectl` reaches the cluster, then base64-encode it for the YAML and your shell:

```bash
kubectl --kubeconfig kubeconfig.yaml get nodes             # confirm access first

# Run the one line that matches your OS:
export KUBECONFIG_BASE64=$(base64 -w 0 < kubeconfig.yaml)            # Linux
export KUBECONFIG_BASE64=$(base64 -i kubeconfig.yaml | tr -d '\n')   # macOS
```

This value gives the Control Plane access to the AKS API. Treat it as a secret.
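
Before moving on, it can help to confirm that the encoded value decodes back to the original file, since a truncated or line-wrapped value only shows up later as an auth failure. The sketch below assumes a `base64` that accepts `-d` (GNU coreutils and recent macOS; older macOS uses `-D`). Both function names are hypothetical.

```bash
# encode_kubeconfig FILE: print FILE as single-line base64.
# Reading from stdin and stripping newlines avoids GNU/BSD flag differences.
encode_kubeconfig() {
  base64 < "$1" | tr -d '\n'
}

# kubeconfig_roundtrip_ok FILE ENCODED: succeed only if ENCODED decodes
# back to exactly the bytes in FILE.
kubeconfig_roundtrip_ok() {
  printf '%s' "$2" | base64 -d | cmp -s - "$1"
}
```

Usage: `kubeconfig_roundtrip_ok kubeconfig.yaml "$KUBECONFIG_BASE64" && echo "encoding verified"`.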
{% endstep %}

{% step %}

#### Export the environment variables

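`dataos-ctl domain apply` resolves `${...}` references in the YAML from your shell environment. The sample YAML in the next step also references `${TODAY_DATE}`, so export that as well if you keep the reference. Export these before applying: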

```bash
export DATAPLANE_ID=<instance-name>-dp-01
export INSTANCE_TENANT_ID=<instance-name>
export KUBECONFIG_BASE64=<base64-encoded-kubeconfig>   # from the previous step
```
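
Because unresolved references are easy to miss, a quick pre-check can list every `${NAME}` in the YAML that is not exported. This is a sketch under the assumption that references use the plain `${NAME}` form shown on this page. `missing_vars` is a hypothetical helper, not a `dataos-ctl` feature.

```bash
# missing_vars FILE: print each ${NAME} referenced in FILE that is not
# exported in the environment, one per line. Returns non-zero if any are missing.
missing_vars() {
  missing=0
  for name in $(grep -o '\${[A-Za-z_][A-Za-z0-9_]*}' "$1" | tr -d '${}' | sort -u); do
    # printenv only sees exported variables, which is what a child process receives
    if ! printenv "$name" >/dev/null; then
      printf '%s\n' "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

Run `missing_vars azure-byoc-dataplane.yaml` before applying. With only the three variables above exported, it would list `TODAY_DATE`, which the sample YAML references in `v1alpha.dataplane.name`.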

{% endstep %}

{% step %}

#### Prepare the dataplane YAML

Author the dataplane YAML using the sample below. A few attributes decide how the Data Plane installs and connects, so get these right first. The Dataplane configuration section later on this page gives the field-by-field breakdown:

{% code title="azure-byoc-dataplane.yaml" %}

```yaml
name: ${DATAPLANE_ID}
version: v1alpha
entity: domain
type: dataplane
description: the ${DATAPLANE_ID} dataplane for ${INSTANCE_TENANT_ID} instance
v1alpha:
  dataplane:
    name: ${DATAPLANE_ID}${TODAY_DATE}
    tenantDomainNames:
      - system-entities
    networkType: tunnel
    workpiece:
      name: ${DATAPLANE_ID}-${INSTANCE_TENANT_ID}.dataplane
      zone: dataos.cloud
      kernelRequest:
        template: dataplane-kernel-system-existing-compute-tunnel-v1
        cloud: azure
        inputs:
          dataOsManagerConfigs:
            imageTag: latest
            logLevel: info
            replicas: 1
            nodeSelector:
              dataos.io/purpose: core-kernel
          kubeConfigBase64: ${KUBECONFIG_BASE64}
        target: engineering
    dataPlaneKernelSystemInstall:
      installFromGit:
        installFile: install.yaml
        applicationsFile: install.applications.yaml
        valuesFile: install.values.yaml
        installFileRootDir: installs/dataplane-kernel-system
        gitRepoUrl: https://bitbucket.org/rubik_/dataos-component-install.git
        gitBranch: <GIT_BRANCH>
        onDemandCloudKernel: true
        gitUsername: <GIT_USERNAME>
        gitPassword: <GIT_PASSWORD>
      azureEndpointSuffix: core.windows.net
      awsEndpointSuffix: amazonaws.com
      coreKernelNodeSelector:
        dataos.io/purpose: core-kernel
```

{% endcode %}

{% hint style="warning" %}
The dataplane YAML is sensitive: it holds the Git credentials and references the kubeconfig. Store it encrypted and limit who can decrypt it.
{% endhint %}
{% endstep %}

{% step %}

#### Apply the Dataplane manifest configuration

With the context set and the variables exported, apply the YAML:

```bash
DATAPLANE_ID=$DATAPLANE_ID INSTANCE_TENANT_ID=$INSTANCE_TENANT_ID KUBECONFIG_BASE64=$KUBECONFIG_BASE64 dataos-ctl domain apply -f azure-byoc-dataplane.yaml
```

The Control Plane validates the request and starts installing the platform onto your cluster.
{% endstep %}
{% endstepper %}

<details>

<summary>What happens after you apply</summary>

When you run the apply, the Control Plane reads the `kubeConfigBase64` you supplied and uses it to reach the AKS API for the first time. It installs `dataos-manager` onto the cluster, which pulls the install manifests defined in `installFromGit` and lays down the DataOS platform components.

Because you set `networkType: tunnel`, the Data Plane does not wait to be reached from the outside. A tunnel client inside the cluster opens an outbound, encrypted connection from a worker node out through the cluster's outbound NAT to Modern Data Cloud. No inbound ports on your cluster are exposed. Once the tunnel is established, the Control Plane registers the Data Plane and manages it over that connection.

The kubeconfig is used only for this Control-Plane-to-cluster management path. It is not part of your data traffic. Every hop on this path is TLS-encrypted.

</details>

***

### Manage the Data Plane

Once the Data Plane is live, run these routine operations against it.

#### Check status

```bash
dataos-ctl domain get
# or
dataos-ctl domain get -a   # use -a if the Data Plane was created by another user
```

When the output shows `workpiece-forge-kernel: success`, the infrastructure setup is complete. Next, access the Data Plane cluster with `kubectl` and follow the `dataos-manager` logs:

```bash
kubectl logs -n <instance-name>-0-dsm dataos-manager-0 -f
```
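
Provisioning takes time, so you may prefer to poll the status instead of re-running the command by hand. The sketch below assumes only that the status output contains the literal text `workpiece-forge-kernel: success` once setup completes. The function names, attempt count, and delay are hypothetical.

```bash
# kernel_ready: succeed when stdin contains "workpiece-forge-kernel: success".
kernel_ready() {
  grep -Eq 'workpiece-forge-kernel:[[:space:]]*success'
}

# wait_for_kernel ATTEMPTS DELAY_SECONDS CMD...: run CMD up to ATTEMPTS times,
# sleeping DELAY_SECONDS between tries, until its output reports success.
wait_for_kernel() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" 2>/dev/null | kernel_ready; then
      return 0
    fi
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

Usage: `wait_for_kernel 30 20 dataos-ctl domain get` checks every 20 seconds for up to 30 attempts and returns non-zero if success never appears.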

#### Describe the applied configuration

```bash
dataos-ctl domain get -t dataplane -n <instance-name>-dp-01 -d
```

***

### Dataplane configuration

A field-by-field reference for the dataplane manifest configuration.

#### Metadata

Top-level fields that identify the Data Plane resource.

```yaml
name: ${DATAPLANE_ID}
version: v1alpha
entity: domain
type: dataplane
description: the ${DATAPLANE_ID} dataplane for ${INSTANCE_TENANT_ID} instance
```

{% hint style="info" %}
Use a naming convention that clearly encodes the Instance and cloud. Example: `<instance-name>-<cloud>-dp-01`.
{% endhint %}
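
One way to keep names consistent is to generate them rather than type them. This is a minimal sketch of the convention in the hint. `dataplane_id` is a hypothetical helper, and the sequence number is passed as a plain integer.

```bash
# dataplane_id INSTANCE CLOUD SEQ: build "<instance-name>-<cloud>-dp-NN".
# SEQ is a plain integer (1, 2, ...); it is zero-padded to two digits.
dataplane_id() {
  printf '%s-%s-dp-%02d\n' "$1" "$2" "$3"
}
```

Usage: `export DATAPLANE_ID=$(dataplane_id <instance-name> azure 1)`.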

#### Dataplane

Core Data Plane identity and Control Plane connectivity.

```yaml
v1alpha:
  dataplane:
    name: ${DATAPLANE_ID}${TODAY_DATE}
    tenantDomainNames:
      - system-entities
    networkType: tunnel
```

#### Workpiece

Represents the execution unit for the provisioning pipeline.

```yaml
workpiece:
  name: ${DATAPLANE_ID}-${INSTANCE_TENANT_ID}.dataplane
  zone: dataos.cloud
```

#### Kernel Request

Selects the kernel provisioning blueprint and target cloud.

```yaml
kernelRequest:
  template: dataplane-kernel-system-existing-compute-tunnel-v1
  cloud: azure
```

#### Inputs (DataOS Manager + Kubeconfig)

```yaml
inputs:
  dataOsManagerConfigs:
    imageTag: latest
    logLevel: info
    replicas: 1
    nodeSelector:
      dataos.io/purpose: core-kernel
  kubeConfigBase64: ${KUBECONFIG_BASE64}
```

#### Config Bundle

```yaml
target: engineering
```

#### Installation (Git-based)

Defines which installation manifests are applied for Data Plane components and the source repository for installation artifacts.

```yaml
installFromGit:
  installFile: install.yaml
  applicationsFile: install.applications.yaml
  valuesFile: install.values.yaml
  installFileRootDir: installs/dataplane-kernel-system
  gitRepoUrl: https://bitbucket.org/rubik_/dataos-component-install.git
  gitBranch: <GIT_BRANCH>
  onDemandCloudKernel: true
  gitUsername: <GIT_USERNAME>
  gitPassword: <GIT_PASSWORD>
```

#### Cloud Endpoints

Used for cloud integrations and endpoint resolution.

```yaml
azureEndpointSuffix: core.windows.net
awsEndpointSuffix: amazonaws.com
```

#### Node Selectors

Ensures platform components schedule onto the intended node pool.

```yaml
coreKernelNodeSelector:
  dataos.io/purpose: core-kernel
```

Do not change this value. The DataOS installer applies the same label to the node pools, so the selector and the label must match for platform components to schedule.

***

### Troubleshooting

<details>

<summary>Apply fails with an auth error</summary>

The kubeconfig is invalid or expired, or its identity has insufficient RBAC. Regenerate the kubeconfig with the required RBAC, re-encode it, and re-apply.

</details>

<details>

<summary>Pods stuck <code>Pending</code></summary>

The `dataos.io/purpose: core-kernel` label is missing, or the node pool lacks capacity. Label the nodes or scale the node pool, depending on which cause applies.
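
To tell the two causes apart, count the nodes that carry the label. The sketch below uses the standard `kubectl get nodes -l` label selector. The `KUBECTL` variable is a hypothetical override, added only so the helper can be exercised without a cluster.

```bash
# count_core_kernel_nodes: print how many nodes carry the core-kernel label.
# KUBECTL is a hypothetical override; it defaults to the real kubectl.
count_core_kernel_nodes() {
  "${KUBECTL:-kubectl}" get nodes -l dataos.io/purpose=core-kernel --no-headers 2>/dev/null \
    | wc -l | tr -d ' '
}
```

A result of `0` points to the missing label. A non-zero count with pods still `Pending` points to node-pool capacity.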

</details>

<details>

<summary>Image pull failures</summary>

Egress to the registry is blocked, or image pull secrets are missing for a private mirror. Allow TCP 443 egress to the registry, or configure image pull secrets for the mirror.

</details>

<details>

<summary>Tunnel does not establish</summary>

Egress to `*.cloudflare.com:443` or `cfargotunnel.com:443` is blocked, or DNS resolution is failing. Confirm outbound NAT egress and DNS resolution from a worker node. The tunnel client retries automatically once egress is restored.

</details>

<details>

<summary>Reconciliation stalls (cluster unreachable)</summary>

The AKS API is unreachable, or the kubeconfig identity lacks Kubernetes RBAC. Confirm the AKS API endpoint is reachable, then verify the kubeconfig's identity is bound to a Kubernetes RBAC role on the cluster.

</details>

<details>

<summary>Install step fails</summary>

Git credentials or `installFromGit` paths are invalid. Validate Git access from a worker node.
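
A quick way to check the repository URL and branch is `git ls-remote`, run from any machine with `git` installed. This is a sketch, assuming Git can already authenticate non-interactively (for example through a credential helper). `git_branch_reachable` is a hypothetical helper name.

```bash
# git_branch_reachable URL BRANCH: succeed if BRANCH exists on the remote at URL.
# GIT_TERMINAL_PROMPT=0 makes Git fail instead of prompting for credentials.
git_branch_reachable() {
  GIT_TERMINAL_PROMPT=0 git ls-remote --exit-code "$1" "refs/heads/$2" >/dev/null 2>&1
}
```

Usage: `git_branch_reachable <gitRepoUrl> <GIT_BRANCH> || echo "repository or branch not reachable"`.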

</details>

***

### Best practices

* Dedicate the `core-kernel` node pool to platform components. Do not co-schedule user workloads on it.
* Monitor Azure quotas and AKS node-pool capacity ahead of production workload growth.
* Capture provisioning logs from `dataos-ctl` and Azure Monitor during apply and upgrade operations for verification and troubleshooting.
* Encrypt the dataplane YAML before pushing it to a Git repository.
* Decrypt the YAML only at apply time.
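
The last two practices can be sketched with `openssl enc`. This is an illustration only, assuming OpenSSL 1.1.1 or later (for `-pbkdf2`) and a passphrase exported in a hypothetical `YAML_PASSPHRASE` variable. Your team may standardize on a dedicated secrets tool instead.

```bash
# encrypt_yaml IN OUT: encrypt IN to OUT using the passphrase in $YAML_PASSPHRASE.
encrypt_yaml() {
  openssl enc -aes-256-cbc -pbkdf2 -salt -in "$1" -out "$2" -pass env:YAML_PASSPHRASE
}

# decrypt_yaml IN: write the decrypted YAML to stdout.
decrypt_yaml() {
  openssl enc -d -aes-256-cbc -pbkdf2 -in "$1" -pass env:YAML_PASSPHRASE
}
```

Commit only the encrypted file. At apply time, decrypt to a temporary file, apply it, and delete the plaintext afterwards.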

