Feature metadata

feature_metadata produces one short, human-readable description per column, plus a dataset-level summary. The descriptions are generated by Claude using column statistics and sample values as context. They appear in column tooltips in the dataset’s Configuration sidebar, in the view builder’s field picker, and, most importantly, as grounding the chat agent reads when answering questions about your data. The component runs once per dataset version; re-running it regenerates descriptions only if the schema or context has changed.
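One way to picture the skip-when-unchanged behaviour is a fingerprint over the inputs. This is a minimal sketch, not Summand’s implementation: it assumes the schema is a mapping of column name to dtype and the context is a plain string, and the names fingerprint and needs_regeneration are hypothetical.

```python
import hashlib
import json
from typing import Optional


def fingerprint(schema: dict[str, str], context: str) -> str:
    """Hash the column -> dtype mapping and the context text into a stable digest."""
    # sort_keys makes the digest independent of the order columns appear in the dict.
    payload = json.dumps({"schema": schema, "context": context}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_regeneration(previous: Optional[str], schema: dict[str, str], context: str) -> bool:
    """Regenerate on the first run, or when schema or context differs from the last run."""
    return previous is None or previous != fingerprint(schema, context)
```

Storing the digest next to the artifact would let a re-run compare cheaply before making any API call. Whether a pure column reordering should count as a schema change is a design choice; this sketch treats it as unchanged.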

Why it exists

Most dataset columns have names like ord_amt_usd_ttl or mau_d30 — meaningful to whoever set up the warehouse, opaque to anyone else. Without semantic context:

Summand has to guess what a column represents from the name alone.
Chart suggestions and view-builder field labels read like database internals.
New teammates can’t navigate the dataset without a SME walking them through it.

feature_metadata is the bridge. The Anthropic API request includes each column’s name, dtype, sample values, and column stats, and the response provides a short plain-English description of each column plus a dataset-level summary.

Inputs

No user-supplied inputs. The component reads the curated Parquet sample and the column-stats artifact.

Output shape

{
  "context": "E-commerce orders for an enterprise SaaS company. Each row is an order placed by a customer, with payment, geography, and post-checkout fulfilment status.",
  "features": {
    "order_id": "Unique identifier for the order. Always populated.",
    "customer_id": "Foreign key to the customers table. Always populated.",
    "ord_amt_usd_ttl": "Total order amount in USD, including tax and shipping. Range $0–$84k, mean ~$430.",
    "ship_country": "ISO 3166-1 alpha-2 country code where the order shipped. 78% US.",
    "checkout_at": "Timestamp the customer completed checkout, in UTC."
  }
}

context is the dataset-level summary; features is keyed by column name. The artifact is feature_metadata.json.
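If you consume feature_metadata.json outside the app, a small parser can confirm the shape described above before you rely on it. This is a sketch under the assumption that the artifact contains exactly the two keys shown; FeatureMetadata and parse_feature_metadata are hypothetical names, not part of any Summand SDK.

```python
import json
from typing import TypedDict


class FeatureMetadata(TypedDict):
    context: str              # dataset-level summary
    features: dict[str, str]  # column name -> description


def parse_feature_metadata(text: str) -> FeatureMetadata:
    """Parse the JSON artifact and check it has the documented shape."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("context"), str):
        raise ValueError("'context' must be a string")
    features = data.get("features")
    if not isinstance(features, dict):
        raise ValueError("'features' must be an object keyed by column name")
    for column, description in features.items():
        if not isinstance(description, str):
            raise ValueError(f"description for {column!r} must be a string")
    return {"context": data["context"], "features": features}
```

For example, parse_feature_metadata(open("feature_metadata.json").read())["features"]["checkout_at"] would return the description for that column, assuming the file is in the working directory.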

Display

The Feature metadata component ships with a bespoke React viewer (FeatureMetadataView):

Dataset context rendered at the top.
Per-column descriptions in a sortable table alongside the column’s dtype and missingness.
Inline editing — you can override any description; overrides take precedence on re-run.

Where the descriptions show up

Dataset detail → Components tab — viewable like any other component.
Configuration sidebar — column tooltips show the description.
View builder — the field picker shows the description as a hint under each column name.
Summand chat — the agent reads the dataset context and per-column descriptions on every chat turn, grounding answers in business meaning rather than column-name guesses.
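To make the chat grounding concrete, here is one way the artifact could be flattened into text for an agent prompt. How Summand actually formats the context it passes to the agent is not documented here; render_grounding and its layout are purely illustrative.

```python
def render_grounding(metadata: dict) -> str:
    """Flatten the artifact into a plain-text block an agent prompt could include."""
    lines = [f"Dataset: {metadata['context']}", "Columns:"]
    # One line per column, in the order the artifact lists them.
    for column, description in metadata["features"].items():
        lines.append(f"- {column}: {description}")
    return "\n".join(lines)
```

The point of the sketch is that the agent sees business meaning ("Total order amount in USD") next to each raw name (ord_amt_usd_ttl), so it does not have to infer one from the other.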

Filtering from chat

Summand can ask for a specific column’s metadata:

analyze({ component: "feature_metadata", target: ..., params: { column_name: "ord_amt_usd_ttl" } })

The call returns only that column’s description, plus the dataset-level context.
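The effect of the column_name parameter can be sketched as a filter over the full artifact. The exact shape of the filtered response is an assumption (here it mirrors the full artifact with a single entry under features), and filter_to_column is a hypothetical helper.

```python
def filter_to_column(metadata: dict, column_name: str) -> dict:
    """Return the dataset context plus the description for one column only."""
    features = metadata["features"]
    if column_name not in features:
        raise KeyError(f"no description for column {column_name!r}")
    # Keep the dataset-level context so the single description still has its framing.
    return {
        "context": metadata["context"],
        "features": {column_name: features[column_name]},
    }
```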

Compute profile

Profile	Memory	Timeout
Lambda	2 GB	600 s

The timeout is generous because the component calls the Anthropic API to generate descriptions. A run typically finishes in under a minute end to end; the extra headroom absorbs occasional API slowness.

Privacy

Column descriptions are generated by sending the Anthropic API a small payload: column names, dtypes, basic stats from column_stats, and a handful of sample values. The full dataset is not sent. Anthropic is a listed subprocessor — see Compliance for the data-handling agreement. If your organization has policies against sending column samples to external APIs, contact enterprise@summand.com — Enterprise customers can scope LLM-using components per-org.
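The per-column payload described above can be pictured as follows. This is a sketch only: the field names, the cap of five samples, and the choice to deduplicate and skip nulls are assumptions, since the actual payload format and sample count are not documented here.

```python
from typing import Any

MAX_SAMPLES = 5  # hypothetical cap; the real number of sample values is not documented


def build_column_payload(
    name: str, dtype: str, stats: dict[str, Any], values: list[Any]
) -> dict[str, Any]:
    """Assemble per-column context: name, dtype, stats, and a few distinct samples."""
    samples: list[Any] = []
    for value in values:
        # Skip nulls and repeats so the few samples sent are informative.
        if value is not None and value not in samples:
            samples.append(value)
        if len(samples) == MAX_SAMPLES:
            break
    return {"name": name, "dtype": dtype, "stats": stats, "samples": samples}
```

However many rows the column holds, the payload stays bounded by the sample cap, which is the property the privacy note relies on.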

Override and refresh

Dataset owners can override any description from the Components tab — overrides persist across re-runs. To regenerate descriptions (e.g. after major schema changes), re-run the component manually from the Components tab or include it in a scheduled experiment.
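The precedence rule (an override beats the generated text, and survives a re-run) amounts to a merge. A minimal sketch, assuming overrides are stored as a column-to-text mapping; apply_overrides is a hypothetical name, and dropping overrides for columns that no longer exist is a choice made here, not documented behaviour.

```python
def apply_overrides(generated: dict[str, str], overrides: dict[str, str]) -> dict[str, str]:
    """Merge descriptions so that any owner override wins over generated text."""
    merged = dict(generated)  # copy, so the freshly generated output is left untouched
    for column, text in overrides.items():
        if column in generated:  # ignore overrides for columns no longer in the schema
            merged[column] = text
    return merged
```

Running this merge after every regeneration would keep owner edits intact while still refreshing the descriptions nobody has touched.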
