An experiment is a recurring run of one or more components (predictors, surprise finding, column stats, custom analyses) against a dataset or view. Experiments are how recurring analysis happens in Summand: instead of remembering to re-run a notebook, you wire up an experiment once and it runs on its schedule until you pause or delete it.

Anatomy

| Field | Purpose |
| --- | --- |
| experimentId | Stable identifier. |
| name / description | Human-readable. |
| datasetId | The source. Required. |
| viewId | Optional. If set, the experiment runs against the view rather than the raw dataset. |
| components[] | List of component IDs to run. At least one. |
| inputs | A dict keyed by ComponentInput.name: typed values supplied to each component. |
| schedulePreset / cronExpression | When to run. |
| enabled | Pause without deleting. |
| runCount, lastRunAt, recentRuns[] | Run history (last 3 surfaced inline; full history under Run history). |

Experiments are stored as DynamoDB items keyed under the user. EventBridge Scheduler is what actually fires them on cron; Summand’s runtime is Lambda + Step Functions, not a long-running scheduler service.
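
As a rough illustration of the fields above, here is a minimal Python sketch of an experiment record with the two constraints the field list states (datasetId is required; components needs at least one entry). The class name `Experiment`, the `validate` helper, the default values, and the example IDs are assumptions made for illustration; this is not Summand's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Experiment:
    """Hypothetical in-memory mirror of the documented experiment fields."""
    experimentId: str
    name: str
    datasetId: str                       # the source; required
    components: list[str]                # component IDs; at least one
    viewId: Optional[str] = None         # if set, run against the view
    description: str = ""
    inputs: dict[str, Any] = field(default_factory=dict)  # keyed by ComponentInput.name
    schedulePreset: Optional[str] = None
    cronExpression: Optional[str] = None
    enabled: bool = True                 # False pauses without deleting

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means both constraints hold."""
        problems = []
        if not self.datasetId:
            problems.append("datasetId is required")
        if not self.components:
            problems.append("components must list at least one component ID")
        return problems


exp = Experiment(
    experimentId="exp-123",              # made-up ID
    name="Nightly churn predictor",
    datasetId="ds-456",                  # made-up ID
    components=[],                       # deliberately empty to trigger a problem
    cronExpression="cron(0 3 * * ? *)",
)
print(exp.validate())  # ['components must list at least one component ID']
```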

Schedule options

Six presets cover the common cases:
  • Daily at 3 AM (cron(0 3 * * ? *))
  • Daily at 6 AM
  • Every 6 hours
  • Every 12 hours
  • Weekly Monday 3 AM
  • Weekly Sunday 3 AM
Or write your own AWS EventBridge cron / rate expression:
cron(*/15 * * * ? *)        — every 15 minutes
cron(0 9 ? * MON-FRI *)     — every weekday at 9 AM
cron(0 0 1 * ? *)           — first of every month at midnight
rate(2 hours)               — every two hours
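Note that EventBridge cron uses six fields (minutes, hours, day-of-month, month, day-of-week, year), not the standard five. The day-of-week field has special semantics: it cannot be specified together with day-of-month, so one of the two must be ?, as in the examples above. The in-product editor validates the expression and surfaces a human-readable description (e.g. “At minute 0, every 6 hours”) so you can sanity-check before saving.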
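
To make the six-field shape concrete, the following Python sketch checks only the structure of an expression: the `cron(...)` or `rate(...)` wrapper, the field count, the `?` placement, and singular versus plural rate units. The function name `check_schedule_expression` is hypothetical, and the check is much weaker than the in-product editor, which presumably also validates value ranges.

```python
import re

FIELD_NAMES = ("minutes", "hours", "day-of-month", "month", "day-of-week", "year")


def check_schedule_expression(expr: str) -> list[str]:
    """Return structural problems with an EventBridge-style expression.

    Only the shape is checked; field values and ranges are not validated.
    """
    expr = expr.strip()

    rate = re.fullmatch(r"rate\((\d+) (minutes?|hours?|days?)\)", expr)
    if rate:
        value, unit = int(rate.group(1)), rate.group(2)
        if value < 1:
            return ["rate value must be a positive integer"]
        singular = not unit.endswith("s")
        # EventBridge expects a singular unit for 1 and a plural unit otherwise.
        if (value == 1) != singular:
            return ["use a singular unit for 1 and a plural unit otherwise"]
        return []

    cron = re.fullmatch(r"cron\((.*)\)", expr)
    if not cron:
        return ["expression must look like cron(...) or rate(...)"]

    fields = cron.group(1).split()
    if len(fields) != 6:
        return [f"expected 6 fields ({', '.join(FIELD_NAMES)}), got {len(fields)}"]

    day_of_month, day_of_week = fields[2], fields[4]
    # One of the two day fields carries the value; the other must be '?'.
    if (day_of_month == "?") == (day_of_week == "?"):
        return ["exactly one of day-of-month and day-of-week must be '?'"]
    return []


for candidate in ("cron(0 9 ? * MON-FRI *)", "rate(2 hours)", "cron(0 3 * * *)"):
    print(candidate, "->", check_schedule_expression(candidate) or "ok")
```

The last candidate is a standard five-field cron line, which the sketch rejects because the year field is missing.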

Components

A component is a unit of analysis with declared inputs and outputs. The component catalog defines what’s available; today it includes:
  • Predictors — fit an EBM, score the source, write feature importance and shape functions.
  • Surprise finding — flag rows that disagree with predictions; rank by confidence.
  • Column stats — per-column distributions, missingness, cardinality, correlations.
  • Semantic layer pieces — UMAP, feature metadata, and similar internal artifacts.
Each component has typed inputs that the experiment editor renders as form fields (number inputs for thresholds, dropdowns for column references, etc.). For example, the Predictors component has a “target column” input; in the product today, the target column is set there, not on the dataset. See Components for the full catalog and what each one does.
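
The relationship between a component's declared inputs and the experiment's inputs dict can be sketched as below. Everything here is an assumption for illustration: the `ComponentInput` fields other than `name`, the two input kinds ("number" and "column"), the input names `target_column` and `confidence_threshold`, and the `validate_inputs` helper are not taken from the Summand catalog.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ComponentInput:
    """Hypothetical declaration of one typed input on a component."""
    name: str
    kind: str              # "number" or "column" in this sketch
    required: bool = True


def validate_inputs(
    declared: list[ComponentInput],
    supplied: dict[str, Any],
    columns: set[str],
) -> list[str]:
    """Check a supplied inputs dict (keyed by input name) against declarations."""
    problems = []
    for spec in declared:
        if spec.name not in supplied:
            if spec.required:
                problems.append(f"{spec.name}: missing required input")
            continue
        value = supplied[spec.name]
        if spec.kind == "number":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{spec.name}: expected a number, got {value!r}")
        elif spec.kind == "column":
            if value not in columns:
                problems.append(f"{spec.name}: {value!r} is not a column of the source")
    # Flag keys that no declared input accepts.
    for extra in sorted(supplied.keys() - {spec.name for spec in declared}):
        problems.append(f"{extra}: not a declared input")
    return problems


declared = [
    ComponentInput(name="target_column", kind="column"),
    ComponentInput(name="confidence_threshold", kind="number", required=False),
]
columns = {"churned", "plan", "tenure_months"}

print(validate_inputs(declared, {"target_column": "churned"}, columns))  # []
print(validate_inputs(declared, {"target_column": "churn", "confidence_threshold": "high"}, columns))
```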

Sources: dataset vs. view

You can target either a dataset or a view:
  • Dataset — components run on the curated Parquet for that dataset. The simplest case.
  • View — components run on the SQL view’s output. Use this when you need to filter, join, or aggregate first, or when you want experiments to share a common definition (define the cohort once as a view, point multiple experiments at it).
Views referenced by an experiment must be owned by the experiment’s owner — there’s no shared-view-as-source pattern today.
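
The source rules above (run on the view when viewId is set, otherwise on the dataset, and only accept views owned by the experiment's owner) can be sketched as a small resolver. The function `resolve_source`, the `view_owners` lookup, and the IDs are hypothetical; how Summand actually performs the ownership check is not described here.

```python
from typing import Optional


def resolve_source(
    dataset_id: str,
    view_id: Optional[str],
    experiment_owner: str,
    view_owners: dict[str, str],
) -> tuple[str, str]:
    """Return ("dataset", id) or ("view", id) for an experiment's source.

    view_owners is an assumed lookup from view ID to owning user.
    """
    if not dataset_id:
        raise ValueError("datasetId is required")
    if view_id is None:
        # Simplest case: run on the dataset itself.
        return ("dataset", dataset_id)
    if view_owners.get(view_id) != experiment_owner:
        # No shared-view-as-source pattern: the view must belong to the same owner.
        raise PermissionError(f"view {view_id!r} is not owned by {experiment_owner!r}")
    return ("view", view_id)


owners = {"view-cohort": "alice"}  # made-up view and user
print(resolve_source("ds-456", None, "alice", owners))           # ('dataset', 'ds-456')
print(resolve_source("ds-456", "view-cohort", "alice", owners))  # ('view', 'view-cohort')
```

Pointing several experiments at the same view ID is what lets them share one cohort definition.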

Outputs

Each run writes:
  • A new semantic-layer version for the source dataset.
  • Component outputs to summand-task-outputs keyed by run ID.
  • A row in Run history with status, duration, and any error.
Outputs are queryable from chat (“what did the last predictor run say?”), readable on the global Surprises page (for surprise-finding components), and available to downstream views you build on top.

Run history

The experiment detail page shows the most recent runs with status (SUCCEEDED / FAILED / IN_PROGRESS), duration, and (on failure) the error message. Older runs paginate further back. Failed runs are kept in the history, and Summand doesn’t auto-retry them. When a run fails, you can:
  • Manually re-run from the experiment detail page (a one-shot trigger; doesn’t change the schedule).
  • Edit and re-save to fix the configuration; the next scheduled tick picks up the change.
  • Pause if the underlying source is broken upstream and you don’t want to spam failures.
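
Because failed runs are kept and not retried, a run of consecutive failures is a useful signal for deciding whether to pause. The sketch below works over an exported list of run records; the `Run` class, its field names, and both helpers are assumptions for illustration, and timestamps are assumed to be ISO 8601 strings in a single format so that they sort chronologically as text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """Hypothetical run-history record."""
    run_id: str
    status: str                  # SUCCEEDED / FAILED / IN_PROGRESS
    started_at: str              # ISO 8601 timestamp, e.g. "2024-05-01T03:00:00Z"
    error: Optional[str] = None  # populated on failure


def recent_runs(runs: list[Run], limit: int = 3) -> list[Run]:
    """Newest runs first, mirroring the last three surfaced inline."""
    return sorted(runs, key=lambda r: r.started_at, reverse=True)[:limit]


def consecutive_failures(runs: list[Run]) -> int:
    """Count FAILED runs since the most recent SUCCEEDED run."""
    count = 0
    for run in sorted(runs, key=lambda r: r.started_at, reverse=True):
        if run.status == "IN_PROGRESS":
            continue             # not finished yet, so neither success nor failure
        if run.status != "FAILED":
            break
        count += 1
    return count


history = [
    Run("r1", "SUCCEEDED", "2024-05-01T03:00:00Z"),
    Run("r2", "FAILED", "2024-05-02T03:00:00Z", error="source unavailable"),
    Run("r3", "FAILED", "2024-05-03T03:00:00Z", error="source unavailable"),
    Run("r4", "IN_PROGRESS", "2024-05-04T03:00:00Z"),
]
print([r.run_id for r in recent_runs(history)])  # ['r4', 'r3', 'r2']
print(consecutive_failures(history))             # 2
```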

Manual triggers

Hit the Run now button to fire an experiment outside its schedule. The run shows up in history alongside scheduled runs; the next scheduled tick runs as normal. This is the right way to test a configuration before letting it go on cron.