Part 7 of 7 · Meeting notetaker series ~8 min read

Engineering reference: the meeting notetaker architecture

Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the Amazon Transcribe job config, EventBridge wiring, the DynamoDB schemas, and the approve-and-send flow. Read alongside the previous six posts; this one’s the build sheet.

Region and account shape

Default region: ap-southeast-1 (Singapore). Amazon Transcribe, Bedrock cross-Region inference, SES, and EventBridge are all available there. A second region for resilience isn’t worth the setup work at SMB volume — the failure mode for an SMB is one meeting’s recap landing a few minutes late, not a regional outage. One AWS account dedicated to the notetaker (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system.

Topology

AWS topology of the meeting notetaker A topology diagram with three regions stacked vertically inside one AWS account boundary. Top region: ingress. Three boxes show the three intake lanes — a Drive folder sync via the drive-sync Lambda triggered every few minutes by EventBridge Scheduler that mirrors new recordings to s3://mn-recordings/, an upload link backed by a presign Lambda Function URL that hands the browser a pre-signed S3 URL so the file goes straight to the recordings bucket, and an SES inbound rule set with action S3 PUT to s3://mn-raw-mime/ plus the mail-parser Lambda that pulls the attached recording out and writes it to the recordings bucket. Middle region: event processing. An S3 PUT on the recordings bucket triggers the start-transcribe Lambda, which launches an Amazon Transcribe job with speaker labels; on job completion an EventBridge rule triggers the notes Lambda, which reads the transcript, calls Bedrock Haiku 4.5 (or Sonnet 4.6 for long meetings) twice for the summary and the action items, runs the four grounding checks in plain code, and writes the draft to s3://mn-notes/. Bottom region: recap and approval. The recap Lambda emails the draft to the organizer via SES outbound with Approve, Edit, and Discard links. Those links hit the ack Lambda Function URL, which on approve or edit sends the clean recap to all attendees via SES outbound and writes the run to DynamoDB mn-runs; on discard it closes the run without sending. CloudWatch Logs collects from every Lambda at 7-day retention. Across the right edge: a small box labelled AWS Budgets alarm at $30 monthly threshold, posting to SNS topic mn-cost-alarm. A note at the bottom: every recap is confirmed by a human — and every run is logged to mn-runs. Ingress Lambda · drive-sync every few min Drive API → s3://mn-recordings/ new files Function URL · presign mn-upload-link pre-signed PUT s3://mn-recordings/ browser uploads direct SES inbound rule set mn-inbound-rules action: S3 PUT s3://mn-raw-mime/ trigger: mail-parser Recordings bucket in S3 one place to read · S3 PUT event Event processing Lambda · start-transcribe S3 PUT trigger StartTranscriptionJob speaker labels on output → mn-transcripts Lambda · notes on job complete 2 Bedrock passes, 4 grounding checks, draft → mn-notes Amazon Transcribe async batch job audio or video in JSON transcript out (no Bedrock here) Recap & approval Lambda · recap emails the draft to the organizer; SES outbound with action links Approval page [Approve] [Edit] [Discard] link clicks → Function URL Lambda · ack on approve/edit, SES sends to all; writes mn-runs; discard closes run Every recap is confirmed by a human — and every run is logged to mn-runs.
Fig 7. AWS topology, in three regions of the diagram: ingress (three lanes into the recordings bucket), event processing (transcribe, then the grounded notes draft), recap and approval (the organizer confirms and the recap ships). Every Lambda is event-driven; nothing is synchronous-chained.

Lambda functions

All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets, Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.

  • drive-sync — EventBridge Scheduler target, fires every few minutes. Uses the Google Drive API (service-account credentials in Secrets Manager under mn/drive/sa) to list new files in the recordings folder and copy anything unseen to s3://mn-recordings/, along with the matching sidecar if present. Same pattern syncs the style doc and roster to s3://mn-config-source/. Memory: 256 MB. Timeout: 60 s.
  • presign — Lambda Function URL behind the private upload page. Issues a pre-signed S3 PutObject URL scoped to a single key under s3://mn-recordings/ with a short expiry, and writes the supplied title/organizer/attendees into the sidecar object. The browser uploads directly to S3; no file passes through Lambda. Memory: 256 MB. Timeout: 15 s.
  • mail-parser — S3 PUT trigger on s3://mn-raw-mime/. Parses MIME, extracts the audio/video attachment (or follows a download link in the body), and writes the media to s3://mn-recordings/ with a sidecar derived from the email headers. Memory: 512 MB. Timeout: 120 s.
  • start-transcribe — S3 PUT trigger on s3://mn-recordings/ (media keys only; sidecar writes are ignored). Calls StartTranscriptionJob with ShowSpeakerLabels: true, MaxSpeakerLabels from the sidecar attendee count, output to s3://mn-transcripts/, and a job name keyed to the meeting id. Memory: 256 MB. Timeout: 30 s. No Bedrock calls.
  • notes — triggered by an EventBridge rule on the Transcribe job-state-change event (COMPLETED). Reads the transcript JSON, flattens it to a timestamped line list, maps speaker labels to roster names, then calls Bedrock: pass 1 for the summary, pass 2 for decisions + action items as JSON with a cited line per item. Selects Haiku 4.5 by default; Sonnet 4.6 when the audio duration exceeds the configured cutoff. Runs the four grounding checks (cite-or-drop, owner resolution, date sanity, assemble) in plain Python. Writes the draft to s3://mn-notes/. Memory: 1024 MB. Timeout: 120 s.
  • recap — triggered after notes writes a draft (S3 PUT on s3://mn-notes/). Renders the draft as an HTML email and sends it to the organizer via SES SendRawEmail, with Approve/Edit/Discard links pointing at the ack Function URL carrying a signed token. If the meeting type has auto-send enabled and the draft has zero needs-confirm flags, skips the organizer step and sends straight to attendees. Memory: 256 MB. Timeout: 30 s.
  • ack — Lambda Function URL, public with AuthType: NONE; verifies a signed token (HMAC with a secret in Secrets Manager) on every request. Handles Approve (send the clean recap to attendees via SES, mark run sent), Edit (render the edit form, apply submitted changes, then send), and Discard (close the run, send nothing). Writes mn-runs on every action. Memory: 256 MB. Timeout: 15 s.
  • digest — EventBridge Scheduler target, weekly Monday 8am. Reads mn-runs for the past week and emails the organizer set a roll-up of meetings processed and any still-open action items. No Bedrock; a plain summary table. Memory: 256 MB.

Storage

  • DynamoDB · mn-runs — one row per meeting run. PK meeting_id; sort key ts; attributes: action (sent/edited/discarded), organizer, attendees, transcript_key, notes_key, snapshot (the recap that was sent). On-demand. No TTL — this is the long-term audit trail.
  • DynamoDB · mn-action-items — one row per action item across all meetings. PK (meeting_id, item_index); attributes: task, owner_email, due_date, cite_line, cite_ts, status (open/done). On-demand. Backs the weekly digest’s open-items view.
  • S3 · mn-recordings — uploaded media plus sidecars. Versioning enabled. Lifecycle to Glacier at 30 days; expiry at 1 year (recordings are the largest objects).
  • S3 · mn-transcripts — raw Transcribe JSON and the flattened transcript. Versioning enabled.
  • S3 · mn-notes — the draft notes JSON and the rendered recap HTML. Versioning enabled.
  • S3 · mn-raw-mime — raw inbound MIME from the email lane. Lifecycle to Glacier at 30 days; expiry at 1 year.
  • S3 · mn-config-source — mirrored style doc and roster as plain text/JSON. Versioning enabled.

Amazon Transcribe

  • Job type. Asynchronous batch (StartTranscriptionJob), not streaming — the recording is already a complete file, and batch is cheaper and simpler than streaming for after-the-fact notes.
  • Speaker labels. Settings.ShowSpeakerLabels: true, MaxSpeakerLabels set from the sidecar attendee count (default 10). Output written to s3://mn-transcripts/<meeting-id>.json.
  • Language. IdentifyLanguage: true by default so mixed-language teams work without per-job config; pin LanguageCode if the team is always one language for slightly better accuracy.
  • Completion. Transcribe emits a Transcribe Job State Change event to the default EventBridge bus; an EventBridge rule on COMPLETED triggers notes. No polling.

Bedrock

  • Foundation models. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0 for normal meetings; anthropic.claude-sonnet-4-6-20250930-v1:0 via global.anthropic.claude-sonnet-4-6-20250930-v1:0 for meetings over the duration cutoff. Two callsites in notes: the summary pass and the action-item pass.
  • Embeddings. Not used. Notes come from the meeting’s own transcript; there’s no corpus to search. If a future feature needs retrieval across past meetings, the path would be Amazon Titan Text Embeddings V2 (1024-dim) into Amazon S3 Vectors — not added here.
  • Prompting. Both passes are given the flat transcript and instructed to use only it, to cite the verbatim line and timestamp behind every claim, and to emit needs_confirm rather than guess an owner or date. Pass 2 returns strict JSON validated against a schema before the grounding checks run.
  • Quotas. Default account quotas are more than enough at SMB volume — two short calls per meeting.

EventBridge wiring

  • mn-drive-sync — Scheduler, rate(5 minutes). Target: drive-sync Lambda.
  • mn-transcribe-complete — rule on the default bus matching source: aws.transcribe, detail-type: Transcribe Job State Change, detail.TranscriptionJobStatus: COMPLETED with a name prefix of mn-. Target: notes Lambda.
  • mn-weekly-digest — Scheduler, cron(0 8 ? * MON *) in TZ_NAME. Target: digest Lambda.
  • S3 notificationsmn-recordings PUT → start-transcribe; mn-raw-mime PUT → mail-parser; mn-notes PUT → recap. Suffix filters keep sidecar and JSON writes from triggering the wrong function.

SES outbound and inbound

  • Set the MX record on a dedicated subdomain (e.g. notes.your-company.com) to inbound-smtp.ap-southeast-1.amazonaws.com for the forwarding lane.
  • SES inbound rule set mn-inbound-rules: one rule with recipient notes@your-company.com → spam scan → S3 PUT to s3://mn-raw-mime/<message-id> → stop. The S3 PUT triggers mail-parser.
  • SES outbound for the draft and the recap: verify a sender identity at notes@your-company.com with DKIM and SPF on the parent domain. Out of sandbox by request.

IAM (least privilege per Lambda)

Each Lambda has its own role with policies scoped to exact ARNs. Sketch:

  • start-transcribe role: s3:GetObject on mn-recordings; transcribe:StartTranscriptionJob; s3:PutObject on mn-transcripts. No bedrock:*.
  • notes role: s3:GetObject on mn-transcripts and mn-config-source; bedrock:InvokeModel on the Haiku and Sonnet ARNs; s3:PutObject on mn-notes; dynamodb:PutItem on mn-action-items.
  • recap role: s3:GetObject on mn-notes; ses:SendRawEmail from the verified sender; secretsmanager:GetSecretValue on the token-signing secret.
  • ack role: s3:GetObject on mn-notes; ses:SendRawEmail; dynamodb:PutItem on mn-runs and UpdateItem on mn-action-items; secretsmanager:GetSecretValue on the token-signing secret.
  • drive-sync and presign roles: secretsmanager:GetSecretValue on the Google service-account secret; s3:PutObject on mn-recordings and mn-config-source; outbound network to www.googleapis.com. presign also needs s3:PutObject grantable via pre-signed URL on the recordings prefix.

Approval flow and tokens

The draft email’s Approve/Edit/Discard links each carry a token: base64(meeting_id . action . expiry) with an HMAC signature using the secret in mn/token/signing. ack recomputes the HMAC and rejects any link whose signature doesn’t match or whose expiry has passed, so a forwarded email can’t be used to send a recap days later. Edit returns a minimal HTML form (no SPA, no build step) pre-filled with the needs-confirm items and the transcript line behind each. On submit, the form posts back to the same ack Function URL with the resolved owners and dates. Auto-send, when enabled for a meeting type, bypasses the token flow but only fires when the draft has zero needs-confirm flags.

Observability and cost gates

  • CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Subscription filter on "error" + "throttle" + "timeout" to a CloudWatch metric for alerting.
  • Alarms: notes Lambda failures > 0 in a day; Transcribe job FAILED events > 0; ack token-verification failures > 5/hour (might mean the signing secret rotated).
  • X-Ray: off by default. Not worth the cost at SMB volume.
  • AWS Budgets: $30/month threshold, alarm at 80% and 100%, posts to SNS topic mn-cost-alarm subscribed to the admin’s email.

Config and secrets

The Google service-account credential for Drive lives in Secrets Manager under mn/drive/sa. The approval-token signing secret is mn/token/signing. The configured timezone, the long-meeting duration cutoff, the auto-send meeting-type list, the summary length, and the organizer-set address all live in Parameter Store under /mn/config/. The style doc and roster live in mn-config-source in S3, mirrored from Drive so a non-engineer can edit tone and people without a deploy. Lambdas fetch config on cold start and cache for the lifetime of the execution environment.

Deploy

GitHub Actions with OIDC into a deploy role (no long-lived keys) and AWS SAM for the stack. The opinionated bits: deploy the SES rule set as a separate stack (rule-set changes affect mail flow), turn on S3 versioning for every bucket so a bad sync can be rolled back in one click, and keep the EventBridge rule that matches Transcribe completion tight with a mn- job-name prefix so it never fires on unrelated jobs in the same account. Total deployable surface: around eight Lambdas, two DDB tables, six S3 buckets, two EventBridge rules plus the Scheduler rules, one SES rule set, and one Budgets alarm.

That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.

All posts