feature_metadata produces one short, human-readable description per column, plus a dataset-level summary. The descriptions are generated by Claude using column statistics and sample values as context. They appear in the dataset’s Configuration sidebar, in column tooltips, and, most importantly, as the grounding the chat agent reads when answering questions about your data.
This component runs once per dataset version. Re-running only re-generates if the schema or context has changed.
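One way to picture the re-run check is a fingerprint comparison. The sketch below is an assumption about how such a check could work, not Summand's implementation: fingerprint and needs_regeneration are hypothetical names, the schema is modeled as a mapping of column name to dtype, and the context is treated as a plain string.

```python
import hashlib
import json
from typing import Dict, Optional


def fingerprint(schema: Dict[str, str], context: str) -> str:
    """Hash the schema (column name -> dtype) together with the context text."""
    # Sort the columns so that column order alone does not change the hash.
    canonical = json.dumps(
        {"schema": sorted(schema.items()), "context": context},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(
    previous: Optional[str], schema: Dict[str, str], context: str
) -> bool:
    """Return True when no earlier fingerprint exists or the inputs have changed."""
    return previous != fingerprint(schema, context)
```

Under this assumption, adding a column, changing a dtype, or editing the context text yields a new fingerprint and triggers regeneration, while re-running with identical inputs does not.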
Why it exists
Most dataset columns have names like ord_amt_usd_ttl or mau_d30, which are meaningful to whoever set up the warehouse but opaque to anyone else. Without semantic context:
- Summand has to guess what a column represents from the name alone.
- Chart suggestions and view-builder field labels read like database internals.
- New teammates can’t navigate the dataset without a SME walking them through it.
feature_metadata is the bridge. The Anthropic API call sees the column name, dtype, sample values, and column stats, and returns a one-sentence plain-English description plus a dataset-level summary.
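As a rough illustration of the per-column context described above, the following sketch assembles a name, dtype, stats, and a handful of sample values into one dictionary. The function name build_column_payload, the key names, and the default of five samples are all assumptions made for illustration; the real payload format is not documented here.

```python
from typing import Any, Dict, List


def build_column_payload(
    name: str,
    dtype: str,
    values: List[Any],
    stats: Dict[str, Any],
    max_samples: int = 5,
) -> Dict[str, Any]:
    """Assemble the per-column context: name, dtype, stats, and a few samples."""
    samples: List[Any] = []
    for value in values:
        # Stop as soon as the sample budget is used up.
        if len(samples) >= max_samples:
            break
        # Skip missing values and duplicates so the samples stay informative.
        if value is None or value in samples:
            continue
        samples.append(value)
    return {"name": name, "dtype": dtype, "stats": stats, "samples": samples}
```

Capping the samples keeps the payload small regardless of how many rows the column has, which matches the intent that only a few values per column are used as context.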
Inputs
None. The component reads the curated Parquet sample and the column-stats artifact.
Output shape
context is the dataset-level summary; features is keyed by column name. The artifact is feature_metadata.json.
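A minimal sketch of reading an artifact with that shape follows. Only the top-level keys context and features come from the description above; the inner description field, the sample text, and the helper name describe are assumptions for illustration, not the documented schema.

```python
import json
from typing import Optional

# Hypothetical artifact content. The per-column "description" field and all
# of the text values are invented for this example.
artifact_text = """
{
  "context": "Order totals and engagement metrics per customer.",
  "features": {
    "ord_amt_usd_ttl": {"description": "Total order amount in US dollars."},
    "mau_d30": {"description": "Active users over the trailing 30 days."}
  }
}
"""

metadata = json.loads(artifact_text)


def describe(metadata: dict, column: str) -> Optional[str]:
    """Return the description for one column, or None if the column is absent."""
    entry = metadata["features"].get(column)
    return entry["description"] if entry else None
```

Because features is keyed by column name, looking up a single column is a plain dictionary access, for example describe(metadata, "mau_d30").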
Display
The Feature metadata component ships with a bespoke React viewer (FeatureMetadataView):
- Dataset context rendered at the top.
- Per-column descriptions in a sortable table alongside the column’s dtype and missingness.
- Inline editing — you can override any description; overrides take precedence on re-run.
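The override precedence can be pictured as a merge in which a manual edit always wins for its column. This is a sketch under assumptions, not the component's actual logic: apply_overrides is a hypothetical name, descriptions are modeled as plain strings, and overrides for columns that no longer exist are assumed to be dropped.

```python
from typing import Dict


def apply_overrides(
    generated: Dict[str, str], overrides: Dict[str, str]
) -> Dict[str, str]:
    """Merge descriptions so that a manual override wins for its column."""
    # Iterate over the freshly generated columns only, so an override for a
    # column that has left the schema is ignored (an assumption of this sketch).
    return {
        column: overrides.get(column, text)
        for column, text in generated.items()
    }
```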
Where the descriptions show up
- Dataset detail → Components tab — viewable like any other component.
- Configuration sidebar — column tooltips show the description.
- View builder — the field picker shows the description as a hint under each column name.
- Summand chat — the agent reads the dataset context and per-column descriptions on every chat turn, grounding answers in business meaning rather than column-name guesses.
Filtering from chat
Summand can ask for a specific column’s metadata.
Compute profile
| Profile | Memory | Timeout |
|---|---|---|
| Lambda | 2 GB | 600 s |
Privacy
Column descriptions are generated by sending the Anthropic API a small payload: column names, dtypes, basic stats from column_stats, and a handful of sample values. The full dataset is not sent. Anthropic is a listed subprocessor; see Compliance for the data-handling agreement.
If your organization has policies against sending column samples to external APIs, contact enterprise@summand.com — Enterprise customers can scope LLM-using components per-org.