# AI activation

AI activation is the DataOS pattern for making governed Data Products usable through AI assistants and agentic applications. It lets you ask questions, inspect Data Products, and speed up Data Product design from the assistants you already use. DataOS continues to enforce the same contracts, authorization, semantic definitions, quality rules, lineage, and ownership metadata.

<figure><img src="/files/OwDFVkCsqvzf8x62gudB" alt=""><figcaption></figcaption></figure>

AI activation is delivered through Data Product MCP. Data Product MCP is the MCP server where Data Product tools are hosted so AI assistants can build, discover, inspect, and consume governed Data Products. From a Data Product, select **Activate → MCP** to open the **Connect with MCP** page, which lists the supported AI clients and agentic frameworks.

<figure><img src="/files/PdTZ9VlsJePl7H8ffUDn" alt=""><figcaption><p>The Connect with MCP page lists supported AI clients with install guides.</p></figcaption></figure>

The consume section explains the hands-on path: how to connect a client, configure an MCP endpoint, and ask questions. This page explains the concept behind that experience. It covers how Data Product MCP works, how MCP clients connect to MCP servers, and what trust boundaries remain in place when an assistant becomes the user interface.

## What AI activation solves

Many users need answers from governed data but do not work in SQL, dashboards, or platform consoles every day. AI activation gives them a natural-language entry point to Data Products without creating a separate, ungoverned access path.

It also helps builders move faster. A Data Product owner or analytics engineer can use the same AI surface to explore the catalog, reason through a design, scaffold a project, and validate generated artifacts against DataOS conventions.

AI activation supports two primary journeys:

| Journey                         | User                                                 | Outcome                                                                                |
| ------------------------------- | ---------------------------------------------------- | -------------------------------------------------------------------------------------- |
| Consume governed data           | Business users, analysts, managers, operations teams | Ask questions, find Data Products, inspect trust signals, and identify owners.         |
| Build and operate Data Products | Data Product owners, analytics engineers, developers | Design, scaffold, validate, and operate Data Products with assistant-guided workflows. |

## What Data Product MCP provides

Data Product MCP gives an AI assistant a governed interface to Data Products. Instead of granting direct database access, Data Product MCP exposes named MCP tools with defined inputs, outputs, and boundaries.

For example, when a user asks `What was net revenue by customer segment last quarter?`, the assistant can call a governed query tool instead of writing raw SQL. The tool validates the request against the Data Product semantic surface, applies the caller's DataOS access, and returns a structured response the assistant can explain.
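As a rough illustration, the request for that question might be carried as a JSON-RPC `tools/call` message like the one built below. The `tools/call` method and the `name` and `arguments` parameters come from MCP. The argument keys and values (`measures`, `dimensions`, `time_dimension`, `net_revenue`, and so on) are hypothetical placeholders, not the documented `vulcan_query` input schema.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical argument names and values, for illustration only.
request = build_tool_call(
    1,
    "vulcan_query",
    {
        "measures": ["net_revenue"],
        "dimensions": ["customer_segment"],
        "time_dimension": {"name": "order_date", "range": "last quarter"},
    },
)

print(json.dumps(request, indent=2))
```

The point of the sketch is that the assistant names a tool and passes structured arguments. It never sends SQL text.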

Data Product MCP is useful because it keeps three responsibilities separate:

<table><thead><tr><th width="208.342529296875">Responsibility</th><th>Where it belongs</th></tr></thead><tbody><tr><td>User experience</td><td>The AI host, such as Cursor, Claude, Copilot in VS Code, Codex Desktop, or an agentic framework.</td></tr><tr><td>Tool connection and session</td><td>The MCP client inside the host.</td></tr><tr><td>Governed Data Product work</td><td>The Data Product MCP server and the DataOS Data Product layer.</td></tr></tbody></table>

The assistant is not the authority on the data. It is the interaction surface. Data Product MCP is the controlled bridge. The Data Product remains the system of record for semantics, contracts, policies, quality, lineage, ownership, and runtime access.

## MCP components

MCP uses a host-client-server model. In AI activation, these components let a model use Data Product capabilities safely.

<table><thead><tr><th width="135.1107177734375">Component</th><th>Meaning in MCP</th><th>Example in AI activation</th></tr></thead><tbody><tr><td>Host</td><td>The application that provides the user experience and contains the model.</td><td>Cursor, Claude, Copilot in VS Code, Codex Desktop, or a custom agentic application.</td></tr><tr><td>Client</td><td>The protocol connector created by the host for a specific MCP server.</td><td>The MCP connection configured inside the assistant for Data Product MCP.</td></tr><tr><td>Server</td><td>The endpoint that exposes MCP capabilities.</td><td>Data Product MCP hosts tools for building, discovering, querying, and inspecting Data Products.</td></tr><tr><td>Tools</td><td>Callable operations the model can request through the client.</td><td><code>search</code>, <code>vulcan_about</code>, <code>vulcan_query</code>, <code>vulcan_quality</code>, <code>lineage</code>, and <code>table_profile</code>.</td></tr><tr><td>Resources</td><td>Read-only context a server can expose when supported by the implementation.</td><td>Documentation, metadata, examples, schema context, or other reference material.</td></tr><tr><td>Prompts</td><td>Reusable prompt templates a server can expose when supported by the implementation.</td><td>Guided prompts for investigation, metric validation, or builder workflows.</td></tr></tbody></table>

A host needs clients because a single host can connect to many MCP servers. Each client manages one server connection, negotiates capabilities with that server, lists the available tools or resources, and sends tool calls when the model decides external context or action is needed.

For example, a host may have one MCP client connected to Data Product MCP and another MCP client connected to a source-code or ticketing MCP server. The host decides which client to use based on the user's prompt and the model's plan.
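The one-client-per-server relationship can be sketched as a small data structure. This is a simplified model, not how any particular host is implemented: the `Host` and `McpClient` classes, the `ticketing-mcp` server name, and the `create_ticket` tool are all hypothetical. A real host chooses a client based on the model's plan rather than a plain lookup.

```python
from dataclasses import dataclass, field


@dataclass
class McpClient:
    """One client manages exactly one server connection."""
    server_name: str
    tools: list[str]


@dataclass
class Host:
    """A host owns one client per connected MCP server."""
    clients: dict[str, McpClient] = field(default_factory=dict)

    def connect(self, client: McpClient) -> None:
        self.clients[client.server_name] = client

    def client_for_tool(self, tool: str) -> McpClient:
        """Find the client whose server advertises the requested tool."""
        for client in self.clients.values():
            if tool in client.tools:
                return client
        raise LookupError(f"no connected server offers tool {tool!r}")


host = Host()
host.connect(McpClient("data-product-mcp", ["search", "vulcan_query", "lineage"]))
# Hypothetical second server and tool name.
host.connect(McpClient("ticketing-mcp", ["create_ticket"]))

print(host.client_for_tool("vulcan_query").server_name)  # data-product-mcp
```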

## MCP layers

MCP separates the meaning of messages from how those messages are transported.

<table><thead><tr><th width="140.7657470703125">Layer</th><th>What it handles</th><th>Example</th></tr></thead><tbody><tr><td>Data layer</td><td>The JSON-RPC message model, lifecycle, capability negotiation, requests, responses, notifications, errors, tools, resources, and prompts.</td><td>The client sends <code>tools/call</code> with arguments for <code>vulcan_query</code>. The server returns a structured result or error.</td></tr><tr><td>Transport layer</td><td>The communication channel that carries MCP messages between the client and server, such as local <code>stdio</code> or HTTP-based transports.</td><td>A desktop assistant may connect to a local MCP process. A hosted assistant may connect to Data Product MCP over HTTP.</td></tr></tbody></table>

The data layer defines what the client and server are saying. The transport layer defines how the messages move between them.
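A minimal sketch of that separation: the same data-layer message is framed two ways, once as a newline-delimited line for `stdio` and once as the body of an HTTP POST. The endpoint URL is a placeholder, the tool arguments are left empty, and the helper names are hypothetical. Nothing is actually sent.

```python
import json

# Data layer: one JSON-RPC message, independent of how it travels.
message = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "vulcan_query", "arguments": {}},
}


def frame_for_stdio(msg: dict) -> bytes:
    """stdio transport: one JSON message per line, no embedded newlines."""
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")


def frame_for_http(msg: dict, endpoint: str) -> dict:
    """HTTP transport: the same message becomes the body of a POST request."""
    return {
        "method": "POST",
        "url": endpoint,
        "headers": {
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        "body": json.dumps(msg).encode("utf-8"),
    }


stdio_frame = frame_for_stdio(message)
# Placeholder endpoint, not a real Data Product MCP URL.
http_frame = frame_for_http(message, "https://example.invalid/mcp")

# Decoding either frame yields the identical data-layer payload.
assert json.loads(stdio_frame) == json.loads(http_frame["body"]) == message
```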

## Data Product MCP lifecycle

When an AI host adds Data Product MCP, the MCP client and the server follow a three-phase lifecycle. Each phase has a distinct role: negotiation before any work begins, active tool use during the session, and a clean close when the session ends.

```mermaid
%%{init: {"theme":"base","themeVariables":{"fontFamily":"PP Neue Montreal, Inter, Helvetica Neue, Arial, sans-serif","fontSize":"14px","primaryColor":"#EDE9E5","primaryTextColor":"#242422","primaryBorderColor":"#242422","lineColor":"#242422","secondaryColor":"#D6CDC6","tertiaryColor":"#FFFFFF","clusterBkg":"#EDE9E5","clusterBorder":"#54DED1","edgeLabelBackground":"#FFFFFF"},"flowchart":{"curve":"basis","padding":12,"nodeSpacing":40,"rankSpacing":50}}}%%
flowchart LR
    A(["🔌 Host adds\nData Product MCP"])
    B["**Initialization**\nVersion & capability\nnegotiation"]
    C["**Operation**\nTool calls, queries,\ndiscovery, inspection"]
    D(["✅ Session ends\nclean transport close"])

    A --> B
    B -->|"Capabilities agreed"| C
    C -->|"Session complete"| D

    classDef primary-teal fill:#54DED1,color:#202F36,stroke:#009293,stroke-width:1.5px,font-weight:600;
    classDef dark-teal    fill:#009293,color:#FFFFFF,stroke:#242422,stroke-width:1.5px,font-weight:600;
    classDef cream        fill:#EDE9E5,color:#242422,stroke:#242422,stroke-width:1px;
    classDef sandpaper    fill:#D6CDC6,color:#242422,stroke:#242422,stroke-width:1px;

    class A cream;
    class B sandpaper;
    class C dark-teal;
    class D primary-teal;
```

### Initialization

Initialization is the negotiation phase that runs once at the start of every session. The MCP client sends its supported protocol version and capabilities to Data Product MCP. The server responds with its own version, server details, and the capabilities it supports, including whether it offers tools. Once the handshake completes, the client lists the server's tool catalog and makes the available tools visible to the model. No tool calls happen during initialization; both sides are only agreeing on what the session is allowed to do. If the client and server cannot agree on a compatible protocol version, the session does not proceed.
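The version agreement can be sketched as follows, assuming the usual MCP behavior: the client proposes its newest version, the server echoes it when supported or answers with its own newest version, and the client disconnects if it cannot use the answer. The function name is hypothetical and the date-style version strings are illustrative. This page does not state which versions Data Product MCP supports.

```python
def negotiate_version(client_versions: list[str], server_versions: list[str]) -> str:
    """Return the protocol version both sides will use, or raise if none fits.

    Version identifiers are date-formatted strings, so the lexicographic
    maximum is also the newest version.
    """
    proposed = max(client_versions)  # the client offers its newest version
    # The server echoes the proposal if it can, else offers its own newest.
    answer = proposed if proposed in server_versions else max(server_versions)
    if answer not in client_versions:
        # No common version: the session does not proceed.
        raise ConnectionError(f"no compatible protocol version (server offered {answer})")
    return answer


# Illustrative version lists only.
print(negotiate_version(["2025-03-26", "2025-06-18"], ["2024-11-05", "2025-03-26"]))
# 2025-03-26
```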

### Operation

Operation is the active phase where all Data Product MCP work happens. The model receives the user's prompt and the tool descriptions the client obtained from the server. When the model decides that a Data Product tool is needed (for discovery, trust inspection, semantic querying, lineage, quality, or profiling), it requests the tool call through the MCP client. The client sends a structured JSON-RPC request to Data Product MCP, which validates the caller's identity, applies DataOS authorization and policies, calls the appropriate Data Product API, and returns a structured response. The assistant uses the response to explain, answer, or continue the task. A single user question may trigger multiple sequential tool calls within one operation phase, for example calling `search` to locate a Data Product, then `vulcan_quality` to check its health, then `vulcan_query` to retrieve a governed metric.
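The sequential pattern can be sketched with a stand-in client that records requests instead of sending them. `RecordingClient` is hypothetical, as are the argument names (`query`, `data_product`, `measures`) and their values. Only the tool names and the JSON-RPC envelope come from this page and from MCP.

```python
import itertools


class RecordingClient:
    """Stand-in MCP client that records requests instead of sending them."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.sent: list[dict] = []

    def call_tool(self, name: str, arguments: dict) -> dict:
        request = {
            "jsonrpc": "2.0",
            "id": next(self._ids),  # each request in the session gets its own id
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
        self.sent.append(request)
        return request


client = RecordingClient()
# Hypothetical argument names and values, for illustration only.
client.call_tool("search", {"query": "orders"})
client.call_tool("vulcan_quality", {"data_product": "orders360"})
client.call_tool("vulcan_query", {"data_product": "orders360", "measures": ["net_revenue"]})

print([(r["id"], r["params"]["name"]) for r in client.sent])
# [(1, 'search'), (2, 'vulcan_quality'), (3, 'vulcan_query')]
```

Each call is a separate request with its own `id`, so the server can evaluate and authorize every step independently.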

### Shutdown

Shutdown is the close phase that ends the session. MCP does not require a dedicated shutdown message; the connection ends when the underlying transport closes. For HTTP-based connections to Data Product MCP, this happens when the HTTP connection is released, for example when the assistant session ends, the user closes the tool, or the agentic workflow completes and releases its MCP client. When the session closes, no further tool calls can be made. A new session requires a fresh initialization phase before operation can resume.
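Taken together, the three phases behave like a small state machine. The sketch below is a simplified model with hypothetical names (`Phase`, `Session`). It tracks only which phase a session is in and does not perform any real negotiation or transport work.

```python
from enum import Enum


class Phase(Enum):
    NEW = "new"
    OPERATING = "operating"
    CLOSED = "closed"


class Session:
    """Tracks which lifecycle phase an MCP session is in."""

    def __init__(self) -> None:
        self.phase = Phase.NEW

    def initialize(self) -> None:
        if self.phase is not Phase.NEW:
            raise RuntimeError("only a new session can be initialized")
        self.phase = Phase.OPERATING

    def call_tool(self, name: str) -> str:
        if self.phase is not Phase.OPERATING:
            raise RuntimeError(f"cannot call {name!r} while session is {self.phase.value}")
        return f"called {name}"

    def close(self) -> None:
        # No shutdown message is sent: closing the transport ends the session.
        self.phase = Phase.CLOSED


session = Session()
session.initialize()
print(session.call_tool("search"))  # called search
session.close()

# After close, work resumes only through a new session and a new initialization.
fresh = Session()
fresh.initialize()
```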

## Request flow

AI activation starts with the host because the host is the user experience. The user asks a question in the host, the model decides whether it needs external context, and the MCP client calls Data Product MCP when a Data Product tool is needed.

The flow is:

{% stepper %}
{% step %}
The user asks a question in an AI host.
{% endstep %}

{% step %}
The host sends the prompt to the model with the available MCP tool descriptions.
{% endstep %}

{% step %}
The model decides whether a Data Product MCP tool is needed.
{% endstep %}

{% step %}
The host asks the MCP client to call the selected tool with structured arguments.
{% endstep %}

{% step %}
The MCP client sends the request to the Data Product MCP server.
{% endstep %}

{% step %}
Data Product MCP validates the request and calls the appropriate DataOS or Data Product APIs.
{% endstep %}

{% step %}
DataOS applies identity, tenant, policy, masking, semantic, quality, lineage, and contract checks.
{% endstep %}

{% step %}
Data Product MCP returns a structured response with data, metadata, warnings, citations, or errors.
{% endstep %}

{% step %}
The host presents the model's explanation to the user, grounded in the tool response.
{% endstep %}
{% endstepper %}

For example, a user may ask `Is orders360 fresh enough to use for today's revenue review?` The model can call `vulcan_runs` to inspect recent runs, `vulcan_quality` to inspect quality status, and `vulcan_about` to identify the owner if something needs follow-up.

## Authentication and authorization

MCP defines how clients and servers exchange messages. Authentication and authorization are enforced by the MCP server implementation and the transport configuration used to reach it.

For Data Product MCP, every request is evaluated using the caller's DataOS identity. If a user cannot access a Data Product, table, column, or value in DataOS, the assistant cannot access it through Data Product MCP. If a value is masked for the user in DataOS, the assistant receives the masked value.

This means the host can improve the conversation around the answer, but Data Product MCP still controls what the assistant can retrieve. The server authenticates the caller, authorizes the requested operation, applies Data Product policies, and returns only the response the caller is allowed to receive.
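The effect on a single row can be sketched as below. This is only an illustration of the outcome described above: the `apply_access` helper, the `allowed` and `masked` sets, the column names, and the `****` mask token are all hypothetical. They do not represent how DataOS stores or evaluates policies.

```python
def apply_access(row: dict, allowed: set[str], masked: set[str]) -> dict:
    """Return only the columns the caller may see, masking where required."""
    visible = {}
    for column, value in row.items():
        if column not in allowed:
            continue  # inaccessible columns never reach the assistant
        visible[column] = "****" if column in masked else value
    return visible


# Hypothetical row and policy for one caller.
row = {"customer_id": 42, "email": "a@example.com", "credit_limit": 5000}
result = apply_access(row, allowed={"customer_id", "email"}, masked={"email"})

print(result)  # {'customer_id': 42, 'email': '****'}
```

Because filtering happens on the server side, the assistant has no unmasked value to reveal, whatever the prompt says.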

## Data Product MCP tools

Data Product MCP exposes 18 tools that help the assistant discover, trust, query, build, and operate Data Products.

| Group                  | Tool                     | What it does                                                                                                      |
| ---------------------- | ------------------------ | ----------------------------------------------------------------------------------------------------------------- |
| Discovery and metadata | `search`                 | Searches Data Products, tables, columns, owners, metrics, lifecycle stage, PII, and related catalog context.      |
| Discovery and metadata | `vulcan_about`           | Returns product metadata such as description, domain, tags, terms, use cases, limitations, owner, and readme.     |
| Querying and schema    | `vulcan_query`           | Executes a semantic query with measures, dimensions, filters, time dimensions, and pagination.                    |
| Querying and schema    | `vulcan_semantic_schema` | Returns the queryable semantic layer: measures, dimensions, time dimensions, segments, and metrics.               |
| Observability          | `vulcan_quality`         | Assesses data quality with status, check-level drill-down, diagnostics, and run history.                          |
| Observability          | `vulcan_runs`            | Checks run history, last run status, duration, per-model metrics, schedule, overdue flags, and diagnostics.       |
| Observability          | `table_profile`          | Returns table profile statistics such as row count, freshness, null rates, distributions, min/max, and quartiles. |
| Observability          | `lineage`                | Traces lineage at table, column, or Data Product level with upstream, downstream, and impact context.             |
| Design and build       | `design_data_product`    | Guides the design of a new Data Product from a plain-language use case.                                           |
| Design and build       | `build_data_product`     | Guides implementation of an approved design spec into a working Data Product project.                             |
| Design and build       | `advise_design`          | Generates or refines a design spec with entities, grain, measures, metrics, and dimensions.                       |
| Design and build       | `scaffold_generator`     | Generates a project file manifest for seeds, models, semantics, checks, and related files.                        |
| Design and build       | `get_component_template` | Returns Vulcan syntax templates and placement guidance for component types.                                       |
| Design and build       | `retrieve_examples`      | Fetches working Vulcan code examples by file category and SQL engine.                                             |
| Design and build       | `enrich_metadata`        | Adds column descriptions, PII labels, glossary terms, and governance tags into project files.                     |
| Design and build       | `review_code`            | Reviews Vulcan SQL or YAML for errors and returns corrected code.                                                 |
| Design and build       | `suggest_quality_checks` | Generates DQ check rules and `MODEL()` assertions for a specific model.                                           |
| Design and build       | `explain_concept`        | Explains Vulcan or DataOS concepts in plain language with examples.                                               |

These tools keep the assistant on the Data Product surface. If a question cannot be answered by the available semantic surface, Data Product MCP returns a limitation or validation error instead of fabricating a metric.
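A minimal sketch of that behavior, assuming the semantic surface can be represented as lists of known measures and dimensions. The `validate_query` function, the response keys (`status`, `unknown`), and the field names are hypothetical. The actual error format returned by Data Product MCP is not shown on this page.

```python
def validate_query(requested: dict, schema: dict) -> dict:
    """Reject any measure or dimension the semantic surface does not define."""
    unknown = {}
    for kind in ("measures", "dimensions"):
        missing = sorted(set(requested.get(kind, [])) - set(schema.get(kind, [])))
        if missing:
            unknown[kind] = missing
    if unknown:
        # Report the gap instead of guessing at a definition.
        return {"status": "validation_error", "unknown": unknown}
    return {"status": "ok"}


# Hypothetical semantic surface and request.
schema = {"measures": ["net_revenue"], "dimensions": ["customer_segment"]}
response = validate_query(
    {"measures": ["net_revenue", "churn_score"], "dimensions": ["customer_segment"]},
    schema,
)

print(response)
# {'status': 'validation_error', 'unknown': {'measures': ['churn_score']}}
```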

## Consumer journey

For a data consumer, AI activation follows four stages.

<table><thead><tr><th width="110.859619140625">Stage</th><th>Example prompt</th><th>What Data Product MCP returns</th></tr></thead><tbody><tr><td>Find</td><td><code>What Data Products do we have for supplier performance?</code></td><td>Relevant Data Products or tables from the catalog, scoped to what the user can access.</td></tr><tr><td>Trust</td><td><code>Is this data fresh enough to act on?</code></td><td>Semantic fields, quality status, run history, lineage, table profile, documented limitations, and owner information.</td></tr><tr><td>Ask</td><td><code>What was quarterly revenue by customer segment?</code></td><td>A governed answer from the Data Product semantic layer, including the metric or measure used and the source model.</td></tr><tr><td>Act</td><td><code>Why did this query fail, and who owns the Data Product?</code></td><td>A clear failure reason and owner context so the user can fix the question, contact the owner, or escalate.</td></tr></tbody></table>

Manual consumption starts with a destination. The user searches the Hub for a Data Product, opens it to explore assets and quality, and queries models in Workbench or Studio. The user then consumes through a BI tool, API, or data application. Each step assumes the user already knows which product they need.

AI activation starts with intent instead. The user can ask a question before knowing which Data Product, metric, owner, table, or interface is relevant.

## Builder journey

For a builder or owner, AI activation extends beyond consumption. The assistant can guide the user through a structured Data Product lifecycle.

<table><thead><tr><th width="124.41070556640625">Stage</th><th>What the assistant helps with</th></tr></thead><tbody><tr><td>Design</td><td>Clarifies the business problem, consumers, entities, grain, measures, dimensions, freshness expectations, and quality contract.</td></tr><tr><td>Build</td><td>Generates a project scaffold, applies syntax templates, and reviews generated SQL or YAML before handoff.</td></tr><tr><td>Validate</td><td>Checks syntax, quality rules, tests, metadata, and assumptions before the Data Product is promoted.</td></tr><tr><td>Operate</td><td>Uses runtime Data Product MCP tools to monitor runs, inspect quality failures, check lineage impact, profile tables, and answer questions.</td></tr></tbody></table>

The builder workflow is grounded in the approved design spec and retrieved DataOS examples. The assistant should mark assumptions, surface TODOs, and ask for confirmation at workflow checkpoints instead of silently filling gaps.

## Trust model

AI activation has two trust layers.

The first layer is governed by the platform. Data Product MCP tools return structured responses from the Data Product APIs. Authorization, filtering, masking, semantic validation, quality retrieval, lineage retrieval, and owner lookup are enforced before the response reaches the assistant.

The second layer is assistant-shaped. The assistant chooses Data Product MCP tools, passes arguments, and explains results. Data Product MCP reduces risk by giving the assistant tool descriptions, strict parameters, structured response fields, warnings, and citations, but the assistant still controls the final wording.

For this reason, AI activation should be understood as governed assistant access, not autonomous data authority. The Data Product remains the source of truth; the assistant is the interface that helps users reach it.

## Boundaries

AI activation is not a bypass around DataOS controls and is not a general-purpose database connection.

Data Product MCP does not:

* Expose raw SQL access for arbitrary querying.
* Return data the user is not authorized to see.
* Invent missing measures, owners, quality results, or lineage.
* Modify, retrigger, or remediate runtime jobs through the consumption Data Product MCP tools.
* Remove the need for review when AI assists with building a Data Product.

For generated Data Product artifacts, users should review semantic correctness, quality rules, data contracts, and environment-specific configuration before deployment.

## Related topics

* [Consume with AI](/consume/consume-with-ai/overview.md)
* [Connect AI clients](/consume/consume-with-ai/connect-clients.md)
* [Connect agentic frameworks](/consume/consume-with-ai/connect-agentic-frameworks.md)
* [Build journey with AI](/build/get-started/build-journey-with-ai.md)
* [Build with AI assistance](https://app.gitbook.com/s/bCA0p0BwgdQAP5IwnaO4/stage-2-productize/build-with-ai-assistance)
* [Runtime MCP tools](https://app.gitbook.com/s/cL04JUTJPL73kRrjaBSa/consume-with-ai/runtime-mcp-tools)
* [AI cookbook](/consume/consume-with-ai/cookbook.md)

