CSV upload is Summand’s no-setup entry point. It’s available on every tier, including Free.
Limits
| Tier | Maximum file size |
|---|---|
| Free | 50 MB |
| Pro / Education | 1 GB |
| Enterprise | 4 GB |
File format
UTF-8 encoded text
Other encodings (UTF-16, ISO-8859-1, Windows-1252) often work but aren’t guaranteed. If column names look garbled in the preview, re-export as UTF-8.
One header row
The first non-empty row must contain column names. Names should be unique and human-readable; Summand uses them in the UI verbatim.
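The encoding and header requirements above can be checked locally before uploading. The following is a minimal sketch, not part of Summand: `check_header` is a hypothetical helper, and it assumes a comma delimiter unless you pass another one.

```python
import csv
import io


def check_header(raw, delimiter=","):
    """Return a list of problems with the encoding or header row of raw CSV bytes."""
    try:
        # utf-8-sig accepts plain UTF-8 and also tolerates a leading BOM.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 (byte offset {exc.start})"]

    # The first non-empty row is treated as the header.
    header = None
    for row in csv.reader(io.StringIO(text, newline=""), delimiter=delimiter):
        if any(cell.strip() for cell in row):
            header = [cell.strip() for cell in row]
            break
    if header is None:
        return ["no header row found"]

    problems = []
    if any(name == "" for name in header):
        problems.append("blank column name")
    duplicates = sorted({n for n in header if n and header.count(n) > 1})
    if duplicates:
        problems.append("duplicate column names: " + ", ".join(duplicates))
    return problems
```

For a file small enough to read into memory, `check_header(open("data.csv", "rb").read())` returns an empty list when nothing is wrong. For multi-gigabyte files a streaming variant would be a better fit.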
Comma, tab, or semicolon delimited
Auto-detected from the file extension and the first line.
.csv, .tsv, and .txt files are all accepted.
What happens when you upload
- The file streams directly to Summand’s encrypted S3 bucket via a presigned URL — it doesn’t pass through any application server.
- Summand parses the header, samples the first ~10,000 rows to infer column types, and creates the dataset.
- The full file is converted to Parquet on the analysis pipeline’s first run.
- The analysis pipeline produces a semantic-layer version and the dataset becomes browseable.
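To see roughly how sample-based type inference behaves, here is a small sketch. It is not Summand's actual inference logic: the type names, the `infer_schema` and `infer_type` helpers, and the rule "narrowest type that fits every sampled value" are all assumptions made for illustration. It also assumes the first row is the header.

```python
import csv
from itertools import islice


def infer_type(values):
    """Return 'integer', 'float', or 'string' for a list of cell strings."""
    saw_value = False
    saw_float = False
    for v in values:
        if v == "":
            continue  # empty cells are treated as missing
        saw_value = True
        try:
            int(v)
        except ValueError:
            try:
                float(v)
            except ValueError:
                return "string"  # one non-numeric value makes the column a string
            saw_float = True
    if not saw_value:
        return "string"
    return "float" if saw_float else "integer"


def infer_schema(lines, sample_rows=10_000, delimiter=","):
    """Infer a type per column from the header plus the first sample_rows rows."""
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    columns = {name: [] for name in header}
    for row in islice(reader, sample_rows):
        for name, cell in zip(header, row):
            columns[name].append(cell.strip())
    return {name: infer_type(vals) for name, vals in columns.items()}
```

With the input lines `id,price,amount`, `1,9.5,$12`, and `2,10,"1,200"`, this sketch reports `id` as integer, `price` as float, and `amount` as string. Because only a sample is inspected, a value outside the sample cannot influence the result in this sketch.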
Re-uploading
To analyze a new version of the same file, click Replace data on the dataset page and upload again. Summand creates a new semantic-layer version pinned to the new file; the previous version remains accessible for comparison. The dataset’s schema overrides, context, sharing grants, and any experiments configured on it are preserved across re-uploads, so you don’t need to redo configuration.
Troubleshooting
Upload fails partway through
Large uploads are resumable up to 24 hours. Refresh the page; you’ll be prompted to resume rather than restart. If the upload still fails, the file may be corrupted — try opening it in a text editor or spreadsheet first to confirm it’s well-formed.
Columns show up as "string" when they should be numeric
Most often this is non-numeric content in the column — currency symbols, units, commas as thousands separators. Either clean the file, or override the column type from the Configuration sidebar after upload.
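If you choose to clean the file, a small script can strip the usual offenders before upload. This is a sketch under stated assumptions: `to_number` is a hypothetical helper, the currency symbols listed are just examples, commas are assumed to be thousands separators, and the period is assumed to be the decimal mark.

```python
import re

CURRENCY = "$€£¥"  # example symbols; extend for your data
TRAILING_UNIT = re.compile(r"\s*[A-Za-z%]+$")  # e.g. " kg", "%", "ms"
STRIP_TABLE = str.maketrans("", "", CURRENCY + ",")


def to_number(cell):
    """Convert a cell such as '$1,234.50' or '12 kg' to a float.

    Returns None for empty cells and raises ValueError for anything
    that still is not numeric after cleaning.
    """
    text = cell.strip()
    if text == "":
        return None
    text = text.translate(STRIP_TABLE)   # drop currency symbols and commas
    text = TRAILING_UNIT.sub("", text)   # drop a trailing unit, if any
    return float(text.strip())
```

Raising on unparseable values instead of silently returning `None` makes stray entries such as `n/a` visible, so you can decide how to handle them before re-uploading.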
My Predictors experiment found no useful features
Three common causes: (1) the target is highly imbalanced — fewer than ~30 minority-class rows; (2) too many features are high-cardinality strings excluded from the model; (3) the file is too small — EBM needs roughly 500+ rows for stable shape functions.
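These three causes can be checked on the source file before rerunning the experiment. The sketch below is illustrative only: `predictor_warnings` is a hypothetical helper, the row and minority-class thresholds come from the rough figures above, the `max_cardinality` cutoff of 50 is an assumption (this page does not state Summand's cutoff), and the imbalance check assumes a categorical target.

```python
from collections import Counter


def _is_number(value):
    try:
        float(value)
        return True
    except ValueError:
        return False


def predictor_warnings(rows, target, min_rows=500, min_minority=30,
                       max_cardinality=50):
    """rows: list of dicts, e.g. list(csv.DictReader(f)). Returns warning strings."""
    warnings = []
    if len(rows) < min_rows:
        warnings.append(f"only {len(rows)} rows (fewer than {min_rows})")

    # Imbalance: size of the rarest target class.
    counts = Counter(row[target] for row in rows)
    if len(counts) > 1:
        label, n = min(counts.items(), key=lambda kv: kv[1])
        if n < min_minority:
            warnings.append(f"minority class {label!r} has only {n} rows")

    # High-cardinality string columns (max_cardinality is an assumed cutoff).
    for col in (rows[0] if rows else []):
        if col == target:
            continue
        values = {row[col] for row in rows}
        if len(values) > max_cardinality and not all(_is_number(v) for v in values):
            warnings.append(f"column {col!r} has {len(values)} distinct string values")
    return warnings
```

An empty result does not guarantee useful features; it only rules out the three causes listed here.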