The semantic layer is the structured set of artifacts Summand computes for every dataset: column statistics, feature descriptions, a trained interpretable model, feature-effect graphs, a 2D embedding, and dataset-level metadata. Together these are what power dashboards, AI insights, and the MCP server. Each artifact is produced by a semantic component: a self-contained unit that takes the curated dataset and emits a JSON (or binary) artifact, versioned and stored in S3.
Why components
Splitting the semantic layer into discrete components has a few practical consequences:
- Computed once, served many times. Models are trained on a schedule; downstream calls just read artifacts.
- Independent failure. Optional components can fail without breaking the rest of the layer.
- Versioned. Every refresh produces a new immutable version with its own manifest.
- AI-queryable. Each component declares how an AI agent can summarize and filter its data, so MCP tools can return the right slice without dumping the whole artifact into context.
What gets produced
Every successful run yields a manifest plus one artifact per component. The pointer at semantic-layers/{datasetId}/current/version.json always resolves to the latest completed version.
Component reference
Full breakdown of every component, its fields, and how to query it.
Pipeline shape
A run is orchestrated as a DAG. Components declare their dependencies and the orchestrator topologically sorts them. Required components (feature_metadata, ebm_model, ebm_graphs, semantic_metadata) crash the run on failure. Optional components (column_stats, umap_embedding) are best-effort: the run continues without them.
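The ordering and failure rules described here can be illustrated with a minimal Python sketch built on the standard library's graphlib. The component names come from this page, but the dependency edges, the run_pipeline function, and the runners mapping are assumptions made for illustration; this is not Summand's orchestrator.

```python
from graphlib import TopologicalSorter

# Component -> components it depends on.
# Names are from this page; the edges are assumed for illustration only.
DEPENDENCIES = {
    "column_stats": set(),
    "feature_metadata": set(),
    "ebm_model": {"feature_metadata"},
    "ebm_graphs": {"ebm_model"},
    "umap_embedding": set(),
    "semantic_metadata": {"feature_metadata", "ebm_graphs"},
}
OPTIONAL = {"column_stats", "umap_embedding"}


def run_pipeline(runners):
    """Run components in dependency order.

    `runners` (hypothetical) maps a component name to a zero-argument
    callable that returns its artifact. Returns (artifacts, skipped).
    A failing required component re-raises and aborts the run; a failing
    optional component is recorded in `skipped` and the run continues.
    """
    artifacts, skipped = {}, []
    # static_order() yields every node after all of its dependencies.
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        try:
            artifacts[name] = runners[name]()
        except Exception:
            if name not in OPTIONAL:
                raise
            skipped.append(name)
    return artifacts, skipped
```

In this sketch no required component depends on an optional one, so skipping an optional component never leaves a later step without its input. A real orchestrator would need a rule for that case.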
How clients consume it
- Fetch the dataset record and read its semanticLayer.manifest.
- Pick a component by name.
- Request a presigned URL for that component’s artifact.
- Decompress (gzip if the manifest says so) and apply any filtering described in agentConfig.params.
Agents connected through the MCP server can instead call the get_semantic_data tool, which handles fetching, decompression, and parameter filtering automatically.
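For clients that follow the manual steps, the component lookup and the decompress-and-filter step might look like the Python sketch below. The manifest shape is not specified on this page, so the components mapping, the compression field, and the filter_features helper are all hypothetical names chosen for illustration. The presigned-URL download is left out and assumed to have already produced raw bytes.

```python
import gzip
import json


def pick_component(manifest, name):
    """Return the manifest entry for `name`.

    Assumes (hypothetically) that the manifest holds a `components`
    mapping keyed by component name.
    """
    try:
        return manifest["components"][name]
    except KeyError:
        raise KeyError(f"component {name!r} not in manifest") from None


def decode_artifact(raw, entry):
    """Decompress the downloaded bytes if the entry says gzip, then parse JSON."""
    if entry.get("compression") == "gzip":  # hypothetical field name
        raw = gzip.decompress(raw)
    return json.loads(raw)


def filter_features(artifact, features):
    """Illustrative filter: keep only the requested top-level keys."""
    return {key: value for key, value in artifact.items() if key in features}
```

A caller would download the bytes from the presigned URL (for example with urllib.request.urlopen(url).read()), pass them to decode_artifact together with the entry from pick_component, and then apply whatever filtering the component's agentConfig.params actually describes.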

