Part 2 of 7 · Survey analyzer series ~4 min read

How a survey answer gets collected

The analyzer can only make sense of answers it actually has. So the first job is making sure every free-text response ends up in one place. There are three ways an answer gets in: your survey form posts it to a small endpoint the moment someone hits send, somebody forwards an email reply or an export to a dedicated address, or you paste an export into a Drive sheet. The form lane is the backbone. The other two exist because in real life feedback arrives in a dozen shapes and nobody wants to re-key it by hand.

Key takeaways

  • Three intake lanes feed one store: a form-submit lane, an inbox-forwarding lane, and a Drive sheet lane.
  • The form lane posts each answer to a Lambda Function URL the moment it is submitted.
  • Forwarded emails and exports arrive via SES inbound; a parser pulls the answer text out.
  • Every answer is cleaned and gets a quick urgent check before it lands in the store.
  • The store is the single place the grouper reads from — the other lanes just write into it.

Three lanes into one store

[Figure: three intake lanes funnel into one answer store. Lane 1, form submit (on submit): the form posts each answer to a Function URL with a shared secret. Lane 2, inbox forwarding (SES inbound): SES writes the raw MIME to S3 and a parser pulls the answer text out. Lane 3, Drive sheet (pasted export): drive-sync mirrors the sheet to S3 every 15 minutes and an importer reads any new rows. Each lane cleans the answer and writes it to the answer store in DynamoDB (id, survey, date, rating, text, cleaned, urgent flag, theme), which the grouper reads weekly.]
Fig 2. Three lanes converge on one answer store. The form lane is the backbone; the inbox lane and the sheet lane are conveniences that write into the same store. Each answer is cleaned and urgent-checked before it settles, and the grouper reads only from the store.

Lane 1: the form-submit lane (the backbone)

The main lane. Most survey tools and web forms can send a copy of each submission to a URL the moment it’s filled in. You point that at a small Lambda Function URL — a plain web address that runs a little code, with no API Gateway in front of it. The endpoint checks a shared secret so only your form can post to it, reads the fields you care about (the free-text answer, plus any survey name, date, and rating), cleans the text, runs the quick urgent check from Part 5, and writes the answer to the store in DynamoDB. The whole thing takes a few hundred milliseconds and costs a fraction of a cent.

This lane covers the steady state: someone fills in your survey, and seconds later the answer is in the store, cleaned and checked. Most answers arrive this way.
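The steps the endpoint performs can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the series' actual handler: the header name `x-form-secret`, the JSON field names, the whitespace-only cleaning, and the keyword-based urgent check are all placeholders, and the store is passed in as a plain callable so the logic can run without AWS.

```python
import base64
import hmac
import json
import re
import uuid
from datetime import datetime, timezone

# Assumption: the form sends the shared secret in this header (name is illustrative).
# Function URL events use lower-cased header names.
SECRET_HEADER = "x-form-secret"
# Assumption: placeholder keywords standing in for the real urgent check.
URGENT_WORDS = ("urgent", "refund", "cancel", "broken")


def clean_text(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def looks_urgent(text: str) -> bool:
    """Very rough keyword check; a stand-in only."""
    lowered = text.lower()
    return any(word in lowered for word in URGENT_WORDS)


def handle_submit(event: dict, secret: str, put_item) -> dict:
    """Validate one form post and hand the cleaned answer to put_item."""
    headers = event.get("headers") or {}
    supplied = headers.get(SECRET_HEADER, "")
    # Constant-time comparison so the secret cannot be guessed by timing.
    if not hmac.compare_digest(supplied.encode(), secret.encode()):
        return {"statusCode": 401, "body": "unauthorized"}

    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    try:
        fields = json.loads(body)
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": "invalid JSON"}
    if not isinstance(fields, dict):
        return {"statusCode": 400, "body": "expected a JSON object"}

    raw = str(fields.get("text", ""))
    cleaned = clean_text(raw)
    if not cleaned:
        return {"statusCode": 400, "body": "empty answer"}

    item = {
        "id": uuid.uuid4().hex,
        "survey": fields.get("survey", ""),
        "date": fields.get("date") or datetime.now(timezone.utc).date().isoformat(),
        "rating": fields.get("rating"),
        "text": raw,
        "cleaned": cleaned,
        "urgent": looks_urgent(cleaned),
    }
    put_item(item)
    return {"statusCode": 200, "body": json.dumps({"id": item["id"]})}
```

In a deployment, a thin `lambda_handler(event, context)` would presumably call `handle_submit` with the secret from configuration and a writer that wraps the DynamoDB table. Keeping the writer injectable means the same function can be exercised locally with an ordinary list.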

Lane 2: inbox forwarding (for replies that arrive as email)

Plenty of feedback never goes through a form. A customer replies to your “how did we do?” email with a paragraph. A team member exports last quarter’s answers from an old tool and wants them in the same place. Set up a dedicated inbound address (something like feedback@your-company.com) via Amazon SES. Anyone can forward a reply or an export to that address, and the analyzer takes it from there. SES writes the raw email to S3. That triggers a parser Lambda that walks the email, pulls the answer text out of the body (or out of a simple CSV or spreadsheet attachment), strips the signature and the quoted “on Monday you wrote” trail, cleans it, and writes one answer per response into the store.

The parser leans on plain code for the common cases — a plain-text body, a CSV column — and only reaches for a small model call when the layout is genuinely ambiguous. Most forwarded feedback is plain enough that no model is needed at all.
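The plain-code path for the simplest case, a plain-text reply, might look like the following. It is a sketch under assumptions: the function name `extract_answer` and the two heuristics (an "On ... wrote:" line starts the quoted trail, a lone `--` line starts the signature) are illustrative, and attachments and forwarded-message headers are left out.

```python
import re
from email import message_from_bytes, policy

# Heuristic: a line such as "On Monday, Jane wrote:" starts the quoted trail.
QUOTE_HEADER = re.compile(r"^On .+ wrote:\s*$")


def extract_answer(raw_mime: bytes) -> str:
    """Return the answer text from a raw email, or "" if there is no plain-text body."""
    msg = message_from_bytes(raw_mime, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        return ""

    kept = []
    for line in part.get_content().splitlines():
        stripped = line.strip()
        # Stop at the signature delimiter or at the start of the quoted trail.
        if stripped == "--" or QUOTE_HEADER.match(stripped):
            break
        # Skip individually quoted lines that appear before either marker.
        if stripped.startswith(">"):
            continue
        kept.append(stripped)

    # Same light cleaning as elsewhere: collapse whitespace into single spaces.
    return re.sub(r"\s+", " ", " ".join(kept)).strip()
```

An empty return value is a natural signal that the layout was not one of the common cases, which is where a fallback such as the small model call would come in.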

Lane 3: the Drive sheet lane

The simplest lane for a bulk paste. You keep a Google Sheet in a Drive folder with one column for the answer text and a few optional columns for survey name, date, and rating. Paste an export in and you’re done. A small Lambda, drive-sync, runs every fifteen minutes, exports the sheet as plain CSV via the Drive API, and writes it to s3://sa-answers-source/answers.csv if the sheet has changed since the last sync. A tiny importer reads any rows it hasn’t seen before, cleans them, runs the urgent check, and writes them to the store. Reading from S3 keeps Drive API calls predictable and, with versioning enabled on the bucket, keeps earlier copies of the file, so a bad paste can be rolled back to the previous version.

This lane is the most hands-on of the three, but it’s the one that needs no setup beyond a shared sheet — handy for a team migrating old feedback or running an occasional one-off survey.
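The importer's "rows it hasn't seen before" step can be sketched as a content fingerprint per row. This is one possible approach, not necessarily the one the series uses; the column names `text`, `survey`, `date`, and `rating` and the in-memory `seen` set are assumptions (a real importer would persist the fingerprints somewhere).

```python
import csv
import hashlib
import io

# Assumption: illustrative column names for the sheet export.
COLUMNS = ("text", "survey", "date", "rating")


def row_key(row: dict) -> str:
    """Stable fingerprint of a row, used to recognise rows already imported."""
    joined = "\x1f".join((row.get(col) or "").strip() for col in COLUMNS)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def new_rows(csv_text: str, seen: set) -> list:
    """Return rows whose fingerprint is not in `seen`, and record them as seen."""
    fresh = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = (row.get("text") or "").strip()
        if not text:
            continue  # skip blank lines and rows without an answer
        key = row_key(row)
        if key in seen:
            continue
        seen.add(key)
        fresh.append({col: (row.get(col) or "").strip() for col in COLUMNS})
    return fresh
```

One trade-off of fingerprinting by content: two genuinely separate answers with identical text, survey, date, and rating count as one. If that matters, a row identifier column in the sheet would be a safer key.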

Why everything funnels into one store

Three lanes in, but only one place the grouper actually reads. That’s a deliberate constraint. If each lane wrote its answers somewhere different, every “why didn’t this show up in the summary?” question would mean checking three places and reconciling three formats. Funneling everything through one store means there is exactly one shape for an answer (id, survey, date, rating, raw text, cleaned text, urgent flag, and, once grouped, its theme) and exactly one place to look. The convenience lanes are first-class for getting answers in, but whatever they take in always lands in the same store.
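That single shape can be written down as a small record type. The field names follow the list above; the Python types and the `to_item` helper are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Answer:
    id: str
    survey: str
    date: str                    # assumption: ISO date string
    rating: Optional[int]        # optional in every lane
    text: str                    # raw text as received
    cleaned: str                 # text after cleaning
    urgent: bool                 # result of the quick urgent check
    theme: Optional[str] = None  # filled in only once the grouper has run

    def to_item(self) -> dict:
        """Plain dict form, the shape every lane would write to the store."""
        return asdict(self)
```

Whichever lane an answer came through, it would be converted to this one record before it is written, which is what keeps the grouper's input uniform.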

Next post: how the grouper reads the store, turns each answer into a vector, clusters the vectors, and names the themes.
