CSV upload is Summand’s no-setup entry point. It’s available on every tier, including Free.

Limits

| Tier | Maximum file size |
| --- | --- |
| Free | 50 MB |
| Pro / Education | 1 GB |
| Enterprise | 4 GB |

For files larger than 4 GB, point Summand at the source via a database connector or Fivetran instead; uploads aren’t the right tool for very large data.
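
As a rough illustration, the limits above can be checked locally before starting an upload. This is a minimal sketch, not part of Summand: the tier keys and function names are hypothetical, and it assumes decimal units (1 MB = 10^6 bytes), which the table does not specify.

```python
import os

# Upload limits by tier, copied from the table above.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes).
MAX_UPLOAD_BYTES = {
    "free": 50 * 10**6,
    "pro": 10**9,
    "education": 10**9,
    "enterprise": 4 * 10**9,
}


def fits_tier(size_bytes: int, tier: str) -> bool:
    """Return True if a file of size_bytes is within the tier's upload limit."""
    return size_bytes <= MAX_UPLOAD_BYTES[tier.lower()]


def file_fits_tier(path: str, tier: str) -> bool:
    """Same check, reading the size of a file on disk."""
    return fits_tier(os.path.getsize(path), tier)
```

A file that fails this check for the Enterprise tier is a candidate for a database connector or Fivetran rather than an upload.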

File format

  1. UTF-8 encoded text. Other encodings (UTF-16, ISO-8859-1, Windows-1252) often work but aren’t guaranteed. If column names look garbled in the preview, re-export as UTF-8.
  2. One header row. The first non-empty row must contain column names. Names should be unique and human-readable; Summand uses them in the UI verbatim.
  3. Comma, tab, or semicolon delimited. The delimiter is auto-detected from the file extension and the first line. .csv, .tsv, and .txt files are all accepted.
  4. Long format, not wide. One row per observation, one column per variable. Pivot tables and crosstabs produce noisy analyses, so un-pivot first.
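
The format requirements above can be checked, and a wide table un-pivoted, with a few lines of standard-library Python. This is a hedged sketch rather than Summand’s own validation: `preflight` and `unpivot` are hypothetical helper names, and the delimiter heuristic (most frequent of the three accepted delimiters in the header line) is an assumption, not the product’s detection logic. It reads the bytes into memory, so it suits small files or a prefix that ends on a line boundary.

```python
import csv


def preflight(raw: bytes) -> dict:
    """Check encoding, delimiter, and header names of a delimited text file."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    # "utf-8-sig" also drops a leading byte-order mark if one is present.
    text = raw.decode("utf-8-sig")
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("file has no non-empty rows")
    header_line = lines[0]  # first non-empty row is treated as the header
    # Heuristic (assumption): pick the accepted delimiter that occurs most often.
    delimiter = max(",\t;", key=header_line.count)
    header = next(csv.reader([header_line], delimiter=delimiter))
    names = [name.strip() for name in header]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    return {"delimiter": delimiter, "columns": names, "duplicates": duplicates}


def unpivot(rows, id_cols, var_name="variable", value_name="value"):
    """Turn wide rows (one column per category) into long rows."""
    long_rows = []
    for row in rows:
        for col, val in row.items():
            if col in id_cols:
                continue
            long_row = {k: row[k] for k in id_cols}  # repeat the identifier columns
            long_row[var_name] = col                 # former column name becomes a value
            long_row[value_name] = val
            long_rows.append(long_row)
    return long_rows
```

For example, a wide row with columns `region`, `2023`, and `2024` becomes two long rows, one per year, each carrying the `region` value alongside a `year` and a `sales` column when called as `unpivot(rows, ["region"], "year", "sales")`.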

What happens when you upload

  1. The file streams directly to Summand’s encrypted S3 bucket via a presigned URL — it doesn’t pass through any application server.
  2. Summand parses the header, samples the first ~10,000 rows to infer column types, and creates the dataset.
  3. The full file is converted to Parquet on the analysis pipeline’s first run.
  4. The analysis pipeline produces a semantic-layer version and the dataset becomes browseable.
For a 100 MB file, the whole process typically takes under a minute.
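
To make step 2 concrete, here is a minimal sketch of sample-based type inference. It is an illustration only, not Summand’s actual inference rules: the type names (`integer`, `float`, `string`), the widening order, and the `infer_types` function are assumptions. Only the sample size of roughly 10,000 rows comes from the description above.

```python
import csv
from itertools import islice

SAMPLE_ROWS = 10_000  # approximate sample size quoted in step 2


def _kind(value: str) -> str:
    """Classify a single non-empty cell as integer, float, or string."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "string"


def infer_types(lines, delimiter=",", sample_rows=SAMPLE_ROWS) -> dict:
    """Infer one type per column from the first sample_rows data rows."""
    reader = csv.DictReader(lines, delimiter=delimiter)
    seen = {name: set() for name in (reader.fieldnames or [])}
    for row in islice(reader, sample_rows):
        for name, value in row.items():
            # Skip empty or missing cells and any overflow fields (key None).
            if name in seen and value not in (None, ""):
                seen[name].add(_kind(value))
    result = {}
    for name, kinds in seen.items():
        if not kinds or "string" in kinds:
            result[name] = "string"   # any non-numeric cell widens the column
        elif "float" in kinds:
            result[name] = "float"
        else:
            result[name] = "integer"
    return result
```

Because only a sample is inspected, a stray non-numeric value past the sampled rows would not influence the inferred type in this sketch, which is one reason a type override after upload can be useful.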

Re-uploading

To analyze a new version of the same file, click Replace data on the dataset page and upload again. Summand creates a new semantic-layer version pinned to the new file; the previous version remains accessible for comparison. The dataset’s schema overrides, context, sharing grants, and any experiments configured on it are preserved across re-uploads — you don’t redo configuration.

Troubleshooting

Large uploads are resumable for up to 24 hours. If an upload is interrupted, refresh the page; you’ll be prompted to resume rather than restart. If the upload still fails, the file may be corrupted. Open it in a text editor or spreadsheet first to confirm it’s well-formed.
When a column you expected to be numeric is read as text, the most common cause is non-numeric content in the column: currency symbols, units, or commas as thousands separators. Either clean the file, or override the column type from the Configuration sidebar after upload.
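
Cleaning the file can be as simple as stripping the offending characters before re-exporting. The following is a minimal sketch under stated assumptions: `to_number` is a hypothetical helper, the set of currency symbols is illustrative, and commas are treated strictly as thousands separators, so values that use a comma as the decimal mark would be misread.

```python
import re
from typing import Optional

# Currency symbols, thousands-separator commas, and whitespace to drop.
# Assumption: commas are never decimal marks in this file.
_STRIP = re.compile(r"[$€£¥,\s]")
# A trailing unit such as "kg" or "%".
_UNIT = re.compile(r"[A-Za-z%]+$")


def to_number(cell: str) -> Optional[float]:
    """Parse a messy numeric cell, returning None if it is not a number."""
    cleaned = _STRIP.sub("", cell)
    cleaned = _UNIT.sub("", cleaned)
    if cleaned == "":
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None
```

Cells that still return `None` after cleaning are worth inspecting by hand before deciding whether to override the column type.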
Three common causes: (1) the target is highly imbalanced — fewer than ~30 minority-class rows; (2) too many features are high-cardinality strings excluded from the model; (3) the file is too small — EBM needs roughly 500+ rows for stable shape functions.
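
The three causes above can be checked before uploading. This is a rough sketch, not a Summand feature: `diagnose` is a hypothetical helper, the two thresholds are the approximate figures quoted above, and the minority-class check assumes a categorical target. No cardinality cutoff is given above, so the sketch only reports distinct-value counts per column and leaves the judgment to the reader.

```python
from collections import Counter

MIN_MINORITY_ROWS = 30  # approximate figure from cause (1)
MIN_TOTAL_ROWS = 500    # approximate figure from cause (3)


def diagnose(rows, target):
    """Summarize row count, target balance, and per-column cardinality.

    rows is a list of dicts (for example from csv.DictReader);
    target is the name of the target column.
    """
    class_counts = Counter(row.get(target) for row in rows)
    minority = min(class_counts.values()) if class_counts else 0
    columns = [col for col in (rows[0] if rows else {}) if col != target]
    # Distinct values per feature column; a count close to the row total
    # on a string column suggests a high-cardinality feature (cause 2).
    cardinality = {col: len({row.get(col) for row in rows}) for col in columns}
    return {
        "rows": len(rows),
        "too_few_rows": len(rows) < MIN_TOTAL_ROWS,
        "minority_class_rows": minority,
        "imbalanced_target": minority < MIN_MINORITY_ROWS,
        "cardinality": cardinality,
    }
```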