A meeting notetaker on AWS for a few dollars a month

Key takeaways

Three sources for recordings: a Drive folder, a direct upload link, and an email-forwarding lane.
Every meeting runs the same three steps: transcribe, write notes, send a recap after a human confirms.
The notes are grounded — every decision and action item traces back to a line in the transcript.
Unclear owners or dates are flagged for the organizer to confirm before the recap goes out.
Designed on AWS for about $4/month at typical small-business volume.

The whole system on one page

Before any code, here’s the shape of what we’re designing.

Fig 1. Three sources outside, three pieces inside AWS. Recordings flow in from a Drive folder, a direct upload link, and an email-forwarding lane. The Notes engine transcribes and then writes grounded notes. Recap sends the clean summary to everyone after the organizer confirms.

What you set up once (the outside)

Recordings. A Google Drive folder you drop meeting files into — audio (M4A, MP3, WAV) or video (MP4, MOV). One file per meeting. Next to each file you can drop a small sidecar text file naming the meeting and listing who was in the room; if you skip it, the notetaker still runs and just asks the organizer to fill in attendees at confirm time. New recordings can also enter via two other lanes covered in Part 2 — a direct upload link (a private page that puts the file straight into S3) and an email-forwarding lane (forward the recording your conferencing tool emails you, and it gets picked up automatically).
A style and people folder. Two short documents in a Drive folder: a style doc and a roster. The style doc says how the recap should read: how long the summary should be, how action items should be phrased (“Name to do X by date”), and any words to avoid. The roster is a short list of the team: name, email, and the way each person tends to be referred to in meetings (“Maria,” “M,” “the office manager”) so the notetaker can match a spoken name to a real email address. Editing either one doesn’t need a deploy.
Attendees. The people who were in the meeting. The clean recap email lands in their inboxes — but only after the organizer has confirmed the draft. Each recap has the summary, the list of decisions, and a table of action items with an owner and a due date for each, plus a link back to the transcript moment behind every item.
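To make the roster idea concrete, here is a minimal sketch of how a spoken name could be matched to an email address. The field names, the surnames, and the example.com addresses are made up for illustration; the real roster is whatever you keep in the Drive doc. The sketch assumes that an unknown or ambiguous name returns nothing, so the item can be flagged for the organizer instead of guessed.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    email: str
    aliases: list[str] = field(default_factory=list)  # how people say the name in meetings


def build_alias_index(roster):
    """Map every lowercased name or alias to the people who answer to it."""
    index = {}
    for person in roster:
        for label in [person.name, *person.aliases]:
            index.setdefault(label.strip().lower(), []).append(person)
    return index


def resolve_owner(spoken, index):
    """Return the single matching Person, or None if unknown or ambiguous."""
    matches = index.get(spoken.strip().lower(), [])
    return matches[0] if len(matches) == 1 else None


# Hypothetical roster: two people share the alias "M" on purpose.
ROSTER = [
    Person("Maria Lopez", "maria@example.com", ["Maria", "M", "the office manager"]),
    Person("Mark Chen", "mark@example.com", ["Mark", "M"]),
]
```

With this roster, "the office manager" resolves to Maria, while "M" resolves to nobody because two people share it. That second case is the one that should surface as a needs-confirm item.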

What runs on every meeting (the inside)

The recording intake. Three sources feed one place. A file dropped in Drive is mirrored to S3 by a small sync Lambda. A file from the upload link goes straight to S3. A forwarded recording is pulled from the email and written to S3 too. However it arrives, landing in S3 kicks off the next step: an Amazon Transcribe job. Transcribe reads the audio (it pulls audio out of video files on its own) and produces a transcript with speaker labels and a timestamp on every line.
The notes engine. Once the transcript is ready, two Bedrock calls run. The first writes a short summary of the meeting — what it was about, what got decided, where things stand. The second pulls out the action items: for each one, what needs doing, who owns it, and by when. Both calls are told to use only the transcript and to cite the line behind every decision and every action item. A long meeting (over about an hour) uses the heavier Claude Sonnet 4.6 model; a normal one uses the cheaper Claude Haiku 4.5. Anything the model is unsure about — a vague owner, a fuzzy date — is marked needs-confirm rather than guessed.
The recap. The pipeline emails the meeting organizer a draft recap with every needs-confirm item highlighted. The organizer can approve it as-is, fix an owner or a date inline, or drop an item. Only after they approve does the clean recap go to the whole attendee list via SES. Every run — the transcript, the draft, the final recap, and who confirmed it — is logged in DynamoDB so you can look back later and see exactly what was sent and why.
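The intake and routing steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline's actual code: the bucket and key names are hypothetical, the language code and speaker cap are guesses, and "sonnet" and "haiku" are placeholder labels rather than real Bedrock model IDs. The dictionary uses parameter names from the Transcribe StartTranscriptionJob API and could be passed to boto3's `start_transcription_job(**params)`. MOV is not among the MediaFormat values we are sure of, so the sketch assumes a .mov file is converted before it reaches this step.

```python
import re
from pathlib import PurePosixPath

# File extension -> Transcribe MediaFormat value.
# Assumption: .mov is converted upstream, so it is deliberately not listed.
MEDIA_FORMATS = {".m4a": "m4a", ".mp3": "mp3", ".wav": "wav", ".mp4": "mp4"}

LONG_MEETING_SECONDS = 60 * 60  # "over about an hour"


def transcribe_params(bucket, key, language_code="en-US", max_speakers=10):
    """Build the request for a Transcribe job with speaker labels turned on."""
    suffix = PurePosixPath(key).suffix.lower()
    if suffix not in MEDIA_FORMATS:
        raise ValueError(f"unsupported or unconverted recording type: {suffix!r}")
    # Job names allow only letters, digits, '.', '_' and '-', up to 200 chars.
    job_name = re.sub(r"[^0-9A-Za-z._-]", "-", key)[:200]
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "MediaFormat": MEDIA_FORMATS[suffix],
        "LanguageCode": language_code,
        "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers},
    }


def pick_model(duration_seconds):
    """Route long meetings to the heavier model, everything else to the cheaper one."""
    return "sonnet" if duration_seconds > LONG_MEETING_SECONDS else "haiku"
```

A 35-minute recording routes to the cheaper model; a 90-minute one routes to the heavier model. Keeping the threshold in one constant makes it easy to tune once you see real transcripts.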

In plain words

Your Monday planning meeting runs 35 minutes. Six people, one recording. You drop the MP4 in the Drive folder and forget about it. Ten minutes later, Amazon Transcribe has turned it into text with speaker labels. Bedrock reads the transcript and writes: a three-sentence summary, two decisions (“Ship the pricing page Thursday,” “Pause the ad spend until the new landing page is live”), and four action items. Three of the action items are clean — clear owner, clear date. The fourth is fuzzy: somebody said “we should follow up with the vendor soon,” no name, no date. That one gets flagged needs-confirm. You, the organizer, get the draft. You assign the vendor follow-up to Maria, set it for Friday, and hit approve. The clean recap lands in all six inboxes two minutes later.
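The same walkthrough can be expressed as data. The sketch below is one possible shape for an action item and the confirm step, not the system's real schema: the field names, the line numbers, and the first item are hypothetical. It also assumes a stricter gate than the text requires, namely that the recap only sends once the organizer has approved and no needs-confirm item is left open.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ActionItem:
    task: str
    owner: Optional[str]   # None means the transcript never named an owner
    due: Optional[str]     # None means no clear date was said
    cited_line: int        # transcript line the item was drawn from

    @property
    def needs_confirm(self):
        return self.owner is None or self.due is None


def apply_fixes(items, fixes, dropped=()):
    """Apply the organizer's inline edits.

    fixes maps an item index to the fields to overwrite;
    dropped lists the indexes the organizer removed.
    """
    result = []
    for i, item in enumerate(items):
        if i in dropped:
            continue
        result.append(replace(item, **fixes.get(i, {})))
    return result


def ready_to_send(items, approved):
    """Send only after approval and with no item still flagged."""
    return approved and not any(item.needs_confirm for item in items)


# Hypothetical draft: one clean item, one fuzzy one.
draft = [
    ActionItem("Draft the landing page copy", "Sam", "Wednesday", 41),
    ActionItem("Follow up with the vendor", None, None, 87),
]
```

Calling `apply_fixes(draft, {1: {"owner": "Maria", "due": "Friday"}})` clears the flag on the vendor follow-up, and only then does `ready_to_send(..., approved=True)` return true.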

The cost of running this is about $4 a month at typical small-business volume. The cost of not running it is the decision everyone forgets, the action item that slips, and the half-hour each week somebody spends writing notes that still miss the one thing that mattered.

Design rules that shaped every decision

Only what was said. Every decision and action item cites the transcript line behind it — nothing is invented.
Three steps, always. Transcribe, write notes, send a recap. There is no fourth.
A human confirms before anyone else sees it. The organizer approves the draft; by default, the recap never goes out without that approval.
Unclear is flagged, not guessed. A vague owner or fuzzy date becomes a needs-confirm item, not a made-up answer.
Recordings, roster, and style live in Drive. Adding a meeting, changing the roster, or editing the style doc doesn’t need a deploy.
Every run is logged. Look back next quarter and you can see exactly what recap went out and who confirmed it.
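The first rule, "only what was said," is the easiest one to check mechanically. Here is a minimal sketch of such a check, assuming each item the model returns carries a 1-based `cited_line` and a short `quote` from that line. Both field names are hypothetical, and a plain substring match is a simplification of whatever comparison a real pipeline would use.

```python
def is_grounded(item, transcript):
    """True if the item cites a real transcript line that contains its quote."""
    line_no = item.get("cited_line")
    if not isinstance(line_no, int) or not 1 <= line_no <= len(transcript):
        return False
    quote = item.get("quote", "").strip().lower()
    # An empty quote proves nothing, so it counts as ungrounded.
    return bool(quote) and quote in transcript[line_no - 1].lower()


def split_grounded(items, transcript):
    """Separate items that trace back to the transcript from those that do not."""
    kept = [item for item in items if is_grounded(item, transcript)]
    rejected = [item for item in items if not is_grounded(item, transcript)]
    return kept, rejected


# Hypothetical three-line transcript and two candidate decisions.
TRANSCRIPT = [
    "spk_0: Let's ship the pricing page Thursday.",
    "spk_1: Agreed. And pause the ad spend until the new landing page is live.",
    "spk_2: We should follow up with the vendor soon.",
]
ITEMS = [
    {"text": "Ship the pricing page Thursday", "cited_line": 1,
     "quote": "ship the pricing page Thursday"},
    {"text": "Hire a contractor", "cited_line": 2, "quote": "hire a contractor"},
]
```

Here the first item is kept and the second is rejected, because line 2 says nothing about a contractor. Rejected items can be dropped or shown to the organizer, but they should not reach the recap as if they were said.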

Why this shape

Most teams handle meeting notes in one of three ways: somebody volunteers and does a rushed job, nobody does and the meeting evaporates, or they pay for an always-on bot that joins every call and quietly records things they’d rather it didn’t. The volunteer approach fails the moment that person is busy or away. The nobody approach fails every time. And the always-on bot is more system than a small business wants — another vendor in every meeting, another subscription, another thing recording when it shouldn’t be.

The setup above is deliberately the opposite of always-on. Nothing joins your calls. You decide which meetings get notes by choosing which recordings to drop in. The notetaker wakes up only when a file appears, does its work in a few minutes, and goes back to sleep. The notes are grounded in the transcript, so they don’t drift into things nobody said. And a human signs off before the recap reaches the room, so a misheard owner or a wrong date gets caught before it becomes an email everyone trusts.

The next four posts walk through each piece in turn: how a recording comes in, how the transcript becomes notes, how the notes stay grounded, and how the recap reaches everyone. Each post has one diagram, and the series ends with a cost breakdown and a final engineering reference.
