# Explore metadata

Use this page to explore dataset metadata that is already available in DataOS during the Discovery stage of the Data Product build journey.

***

Inspect metadata when you need confidence in the meaning, origin, and reliability of a dataset.

Before creating a new Data Product, inspect the datasets that already exist in your tenant. Dataset metadata tells you what data is available, where it lives, how it relates to other assets, and whether it suits your use case.

The goal is simple: **understand the dataset before you reuse or productize it**.

Exploring dataset metadata lets you:

* Find available datasets across connected sources.
* Understand dataset ownership, tags, and source location.
* Review columns, descriptions, and data types.
* Check whether a dataset has already been productized.
* Inspect table-level and column-level lineage.
* Review recent query activity.
* Decide whether the dataset can be reused, explored further, or productized into a Data Product.

***

## UI navigation

Follow this navigation path in DataOS:

```
DataOS Home -> Datasets -> Select source -> Search or filter dataset -> Open dataset
```

Once you open a dataset, use the dataset detail tabs:

```
Overview -> Lineage -> Queries
```

{% tabs %}
{% tab title="Overview" %}
Review columns, tags, descriptions, version, update time, and productization status.
{% endtab %}

{% tab title="Lineage" %}
Understand upstream and downstream flow at table or column level.
{% endtab %}

{% tab title="Queries" %}
Review recent query activity for the dataset.
{% endtab %}
{% endtabs %}

***

{% stepper %}
{% step %}

## Open Datasets

From the left navigation, open **Datasets**.

Use the Datasets page to discover, access, and productize raw data from one place. It shows connected sources such as:

* Snowflake
* PostgreSQL
* Iceberg
* Custom databases
* Other connected systems

Use the source tree on the left to browse by platform or service.

<figure><img src="/files/22BqUktNESE08XjCNX7H" alt=""><figcaption></figcaption></figure>
{% endstep %}

{% step %}

## Select the source

In the left source panel, select the system where the dataset lives.

For example:

```
Datasets -> Snowflake
```

After selecting a source, the dataset list updates to show datasets from that source.
{% endstep %}

{% step %}

## Search or filter datasets

Use the search bar to find datasets by name, schema, or related keywords. You can hide or show datasets that are already productized as Data Products.

In the dataset list, review:

* Dataset name
* Source path
* Tags
* Owner
* Productized indicator

Use these signals to decide which dataset is worth opening.

<figure><img src="/files/mOvelqP2eJPouBOKQywR" alt=""><figcaption></figcaption></figure>
{% endstep %}

{% step %}

## Open the dataset

Select a dataset from the list.

The dataset detail page shows the dataset name, source path, and productization status.

Example:

```
CUSTOMER
Productized into 1 data product
```

The productization badge tells you whether the dataset is already used in a Data Product. If it is already productized, inspect the related Data Product before creating a new one.
{% endstep %}

{% step %}

## Explore the Overview tab

Use the **Overview** tab to inspect the dataset structure.

Example checks:

| What to check       | Why it matters                                                   |
| ------------------- | ---------------------------------------------------------------- |
| Column names        | Confirms whether required fields are present.                    |
| Column descriptions | Helps understand business meaning.                               |
| Tags                | Indicates business role, sensitivity, domain, or classification. |
| Last updated        | Helps assess whether metadata is current.                        |
| Version             | Helps track metadata or dataset changes.                         |
| Productized badge   | Shows whether the dataset already powers a Data Product.         |

Ask:

```
Does this dataset contain the fields I need?
```

For example, if you are inspecting a `CUSTOMER` dataset, columns such as `CUSTKEY`, `NAME`, `ADDRESS`, and `ACCTBAL` help you understand whether it can support customer analytics or downstream modeling.
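The field check above can also be written down as a small script. This is a minimal Python sketch, assuming you have copied the column names from the Overview tab by hand. The `missing_fields` helper and the list of required fields are hypothetical illustrations, not part of any DataOS API.

```python
def missing_fields(available, required):
    """Return the required column names that the dataset lacks (case-insensitive)."""
    present = {name.upper() for name in available}
    return [name for name in required if name.upper() not in present]


# Column names copied by hand from the Overview tab of the CUSTOMER dataset.
customer_columns = ["CUSTKEY", "NAME", "ADDRESS", "ACCTBAL"]

# Hypothetical requirements for a customer analytics use case.
required = ["custkey", "acctbal", "mktsegment"]

print(missing_fields(customer_columns, required))
```

An empty result suggests the dataset covers the fields you need. Any name that is returned points to data you would have to source elsewhere.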
{% endstep %}

{% step %}

## Review column-level metadata

Within the Overview tab, inspect individual columns. Use column metadata to answer:

```
Can I safely use this column in a model, metric, or Data Product contract?
```

For example:

| Column signal         | Builder meaning                                         |
| --------------------- | ------------------------------------------------------- |
| Primary key indicator | Can be used as an identifier or join key.               |
| Description           | Explains the field meaning.                             |
| Tags                  | Shows classification, sensitivity, or business context. |
| Data type             | Helps plan transformations and validation logic.        |
| Lineage icon          | Indicates that column-level lineage can be inspected.   |

{% endstep %}

{% step %}

## Explore Lineage

Open the **Lineage** tab to see how the dataset connects to other assets.

The lineage view lets you inspect:

* Source systems
* Base tables
* Productized datasets
* Downstream models
* Data Product outputs
* Semantic assets
* Column-level dependencies

Use lineage when you need to answer:

```
Where does this dataset come from, and what depends on it?
```

Check lineage before you use a dataset as an input: it shows whether changes to the dataset can affect downstream products, summaries, or semantic models.
{% endstep %}

{% step %}

## Toggle column lineage

In the Lineage tab, use the **Columns** toggle to switch between table-level and column-level lineage.

Use table-level lineage when you want a high-level view of data flow.

Use column-level lineage when you need to understand:

* Which input columns feed an output column.
* How a specific field is carried through transformations.
* Whether a candidate field is used downstream.
* Which downstream outputs a column change can affect.

This moves you from general discovery to precise impact analysis.
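The impact question behind column-level lineage can be illustrated as a graph walk. The sketch below assumes you have noted the column dependencies from the Lineage tab as `(input, output)` pairs. The edge list, the column names other than those of `CUSTOMER`, and the `downstream_columns` helper are all hypothetical, and the code does not call DataOS.

```python
from collections import deque


def downstream_columns(edges, start):
    """Breadth-first walk over column lineage edges (input -> output)."""
    children = {}
    for source, target in edges:
        children.setdefault(source, []).append(target)

    seen, queue = set(), deque([start])
    while queue:
        column = queue.popleft()
        for child in children.get(column, []):
            if child not in seen:  # guard against revisiting shared or cyclic paths
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Hypothetical edges noted by hand from the column-level lineage view.
edges = [
    ("CUSTOMER.ACCTBAL", "customer_summary.total_balance"),
    ("CUSTOMER.CUSTKEY", "customer_summary.customer_id"),
    ("customer_summary.total_balance", "customer_360.balance_band"),
]

print(downstream_columns(edges, "CUSTOMER.ACCTBAL"))
```

Every column in the result is a downstream output that a change to the starting column could affect, directly or through an intermediate transformation.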
{% endstep %}

{% step %}

## Review Queries

Open the **Queries** tab to review recent query activity for the dataset.

Use this tab to inspect:

* Query submission time
* User who submitted the query
* Query frequency
* Recent access patterns

Ask:

```
Is this dataset actively used, and by whom?
```

Query activity tells you whether the dataset is commonly used, whether it supports current workflows, and who may have working knowledge of the data.
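If you want to keep a record of this review, the same signals can be summarized in a few lines. This is a rough sketch that assumes you have transcribed the submitting user and submission time of each query from the Queries tab. The record format, the user names, and the `summarize_queries` helper are hypothetical.

```python
from collections import Counter
from datetime import datetime


def summarize_queries(records):
    """Return (users ranked by query count, most recent submission time)."""
    per_user = Counter(user for user, _ in records)
    latest = max((datetime.fromisoformat(ts) for _, ts in records), default=None)
    return per_user.most_common(), latest


# Hypothetical (user, submission time) pairs transcribed from the Queries tab.
records = [
    ("asha", "2024-05-01T09:15:00"),
    ("ben", "2024-05-02T14:30:00"),
    ("asha", "2024-05-03T08:00:00"),
]

ranked_users, latest = summarize_queries(records)
print(ranked_users, latest)
```

The most frequent users are candidates to contact about the data, and the latest submission time is a quick indicator of whether the dataset is still in active use.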
{% endstep %}

{% step %}

## Decide what to do next

After exploring the dataset metadata, decide on your next action.

<table><thead><tr><th width="266.83209228515625">Decision</th><th>When to choose it</th></tr></thead><tbody><tr><td>Productize the dataset</td><td>The dataset is useful and should be turned into a reusable Data Product. It has the fields, quality, and lineage needed for your use case.</td></tr><tr><td>Explore data values</td><td>Metadata looks relevant, but sample records or distributions need validation.</td></tr><tr><td>Inspect related Data Product</td><td>The dataset is already productized and may already solve the need.</td></tr><tr><td>Contact the owner</td><td>Meaning, access, or reliability is unclear.</td></tr><tr><td>Bring data in</td><td>Required data is missing from available datasets.</td></tr><tr><td>Avoid the dataset</td><td>Metadata shows it is incomplete, stale, unowned, or unsuitable.</td></tr></tbody></table>
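The decision table above can be read as a sequence of checks. The sketch below is one possible ordering, not a rule defined by DataOS: the `DatasetReview` fields, the `next_action` helper, and the priority of the checks are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatasetReview:
    """Findings recorded by hand after exploring a dataset's metadata."""
    has_required_fields: bool
    looks_reliable: bool        # not incomplete, stale, or unowned
    meaning_is_clear: bool
    already_productized: bool
    values_validated: bool


def next_action(review):
    """Map review findings to a decision from the table (assumed priority order)."""
    if not review.has_required_fields:
        return "Bring data in"
    if not review.looks_reliable:
        return "Avoid the dataset"
    if not review.meaning_is_clear:
        return "Contact the owner"
    if review.already_productized:
        return "Inspect related Data Product"
    if not review.values_validated:
        return "Explore data values"
    return "Productize the dataset"


review = DatasetReview(
    has_required_fields=True,
    looks_reliable=True,
    meaning_is_clear=True,
    already_productized=False,
    values_validated=False,
)
print(next_action(review))
```

In practice several rows can apply at once, so treat the ordering as a starting point and adjust it to your team's priorities.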
{% endstep %}
{% endstepper %}

At the end of this task, you should be able to answer:

```
Is this dataset the right input for my Data Product?
```

