Engineering reference: the meeting notetaker architecture
Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the Amazon Transcribe job config, EventBridge wiring, the DynamoDB schemas, and the approve-and-send flow. Read alongside the previous six posts; this one’s the build sheet.
Region and account shape
Default region: ap-southeast-1 (Singapore). Amazon Transcribe, Bedrock cross-Region inference, SES, and EventBridge are all available there. A second region for resilience isn’t worth the setup work at SMB volume — the failure mode for an SMB is one meeting’s recap landing a few minutes late, not a regional outage. One AWS account dedicated to the notetaker (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system.
Topology
Lambda functions
All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets, Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.
- drive-sync: EventBridge Scheduler target, fires every few minutes. Uses the Google Drive API (service-account credentials in Secrets Manager under mn/drive/sa) to list new files in the recordings folder and copy anything unseen to s3://mn-recordings/, along with the matching sidecar if present. Same pattern syncs the style doc and roster to s3://mn-config-source/. Memory: 256 MB. Timeout: 60 s.
- presign: Lambda Function URL behind the private upload page. Issues a pre-signed S3 PutObject URL scoped to a single key under s3://mn-recordings/ with a short expiry, and writes the supplied title/organizer/attendees into the sidecar object. The browser uploads directly to S3; no file passes through Lambda. Memory: 256 MB. Timeout: 15 s.
- mail-parser: S3 PUT trigger on s3://mn-raw-mime/. Parses MIME, extracts the audio/video attachment (or follows a download link in the body), and writes the media to s3://mn-recordings/ with a sidecar derived from the email headers. Memory: 512 MB. Timeout: 120 s.
- start-transcribe: S3 PUT trigger on s3://mn-recordings/ (media keys only; sidecar writes are ignored). Calls StartTranscriptionJob with ShowSpeakerLabels: true, MaxSpeakerLabels from the sidecar attendee count, output to s3://mn-transcripts/, and a job name keyed to the meeting id. Memory: 256 MB. Timeout: 30 s. No Bedrock calls.
- notes: triggered by an EventBridge rule on the Transcribe job-state-change event (COMPLETED). Reads the transcript JSON, flattens it to a timestamped line list, maps speaker labels to roster names, then calls Bedrock: pass 1 for the summary, pass 2 for decisions + action items as JSON with a cited line per item. Selects Haiku 4.5 by default; Sonnet 4.6 when the audio duration exceeds the configured cutoff. Runs the four grounding checks (cite-or-drop, owner resolution, date sanity, assemble) in plain Python. Writes the draft to s3://mn-notes/. Memory: 1024 MB. Timeout: 120 s.
- recap: triggered after notes writes a draft (S3 PUT on s3://mn-notes/). Renders the draft as an HTML email and sends it to the organizer via SES SendRawEmail, with Approve/Edit/Discard links pointing at the ack Function URL carrying a signed token. If the meeting type has auto-send enabled and the draft has zero needs-confirm flags, skips the organizer step and sends straight to attendees. Memory: 256 MB. Timeout: 30 s.
- ack: Lambda Function URL, public with AuthType: NONE; verifies a signed token (HMAC with a secret in Secrets Manager) on every request. Handles Approve (send the clean recap to attendees via SES, mark run sent), Edit (render the edit form, apply submitted changes, then send), and Discard (close the run, send nothing). Writes mn-runs on every action. Memory: 256 MB. Timeout: 15 s.
- digest: EventBridge Scheduler target, weekly Monday 8am. Reads mn-runs for the past week and emails the organizer set a roll-up of meetings processed and any still-open action items. No Bedrock; a plain summary table. Memory: 256 MB.
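The flattening step in notes can be sketched in plain Python. This is a minimal sketch, not the author's implementation: it assumes each pronunciation item in the Transcribe JSON carries a speaker_label field when speaker labels are enabled, and that the roster is a simple mapping from speaker label to display name (a hypothetical shape; the post does not say how labels are matched to people). The function names are illustrative.

```python
from typing import Dict, List, Optional


def fmt_ts(seconds: float) -> str:
    """Format a second offset as MM:SS."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def flatten_transcript(transcript: dict, roster: Dict[str, str]) -> List[str]:
    """Turn Transcribe items into one '[MM:SS] Name: text' line per speaker turn."""
    lines: List[str] = []
    speaker: Optional[str] = None
    start = 0.0
    words: List[str] = []

    def flush() -> None:
        # Emit the turn collected so far, if any.
        if words:
            name = roster.get(speaker, speaker)  # fall back to the raw label
            lines.append(f"[{fmt_ts(start)}] {name}: {' '.join(words)}")

    for item in transcript["results"]["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamps; attach it to the previous word.
            if words:
                words[-1] += content
            continue
        # Assumption: items carry speaker_label; otherwise keep the current speaker.
        item_speaker = item.get("speaker_label", speaker or "spk_0")
        if item_speaker != speaker:
            flush()
            speaker, start, words = item_speaker, float(item["start_time"]), []
        words.append(content)
    flush()
    return lines
```

Each output line keeps its timestamp and speaker, which is what lets the later passes cite a verbatim line per claim.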
Storage
- DynamoDB · mn-runs: one row per meeting run. PK meeting_id; sort key ts; attributes: action (sent/edited/discarded), organizer, attendees, transcript_key, notes_key, snapshot (the recap that was sent). On-demand. No TTL; this is the long-term audit trail.
- DynamoDB · mn-action-items: one row per action item across all meetings. PK (meeting_id, item_index); attributes: task, owner_email, due_date, cite_line, cite_ts, status (open/done). On-demand. Backs the weekly digest’s open-items view.
- S3 · mn-recordings: uploaded media plus sidecars. Versioning enabled. Lifecycle to Glacier at 30 days; expiry at 1 year (recordings are the largest objects).
- S3 · mn-transcripts: raw Transcribe JSON and the flattened transcript. Versioning enabled.
- S3 · mn-notes: the draft notes JSON and the rendered recap HTML. Versioning enabled.
- S3 · mn-raw-mime: raw inbound MIME from the email lane. Lifecycle to Glacier at 30 days; expiry at 1 year.
- S3 · mn-config-source: mirrored style doc and roster as plain text/JSON. Versioning enabled.
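The two DynamoDB row shapes can be written down as small dataclasses. This is a sketch under assumptions: attribute names come from the list above, but the types are guesses (ts as an ISO-8601 string, cite_line as the verbatim transcript line, item_index as the sort-key half of the composite key), and the class names are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import List

RUN_ACTIONS = {"sent", "edited", "discarded"}
ITEM_STATUSES = {"open", "done"}


@dataclass
class RunRecord:
    """One mn-runs row: partition key meeting_id, sort key ts."""
    meeting_id: str
    ts: str  # assumed ISO-8601 UTC timestamp
    action: str
    organizer: str
    attendees: List[str]
    transcript_key: str
    notes_key: str
    snapshot: str  # the recap that was sent

    def to_item(self) -> dict:
        if self.action not in RUN_ACTIONS:
            raise ValueError(f"unknown action: {self.action}")
        return asdict(self)


@dataclass
class ActionItem:
    """One mn-action-items row: composite key (meeting_id, item_index)."""
    meeting_id: str
    item_index: int
    task: str
    owner_email: str
    due_date: str
    cite_line: str  # assumed: verbatim transcript line
    cite_ts: str
    status: str = "open"

    def to_item(self) -> dict:
        if self.status not in ITEM_STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        return asdict(self)
```

With boto3 available, a row would be written with something like `boto3.resource("dynamodb").Table("mn-runs").put_item(Item=record.to_item())`.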
Amazon Transcribe
- Job type. Asynchronous batch (StartTranscriptionJob), not streaming: the recording is already a complete file, and batch is cheaper and simpler than streaming for after-the-fact notes.
- Speaker labels. Settings.ShowSpeakerLabels: true, MaxSpeakerLabels set from the sidecar attendee count (default 10). Output written to s3://mn-transcripts/<meeting-id>.json.
- Language. IdentifyLanguage: true by default so mixed-language teams work without per-job config; pin LanguageCode if the team is always one language for slightly better accuracy.
- Completion. Transcribe emits a Transcribe Job State Change event to the default EventBridge bus; an EventBridge rule on COMPLETED triggers notes. No polling.
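The job configuration above can be expressed as a function that builds the StartTranscriptionJob arguments. This is a hedged sketch: the parameter names are the real API fields, but the function name, the job-name sanitising, and the clamp to 2..30 speakers (my understanding of the accepted range) are assumptions rather than details from the post.

```python
import re
from typing import Optional


def build_transcribe_job(
    meeting_id: str,
    media_key: str,
    attendee_count: Optional[int] = None,
    language_code: Optional[str] = None,
) -> dict:
    """Build keyword arguments for transcribe.start_transcription_job."""
    # Job names allow only letters, digits, '.', '_' and '-'; the mn- prefix
    # is what the completion rule matches on.
    job_name = "mn-" + re.sub(r"[^0-9a-zA-Z._-]", "-", meeting_id)
    # Default to 10 speakers when the sidecar has no attendee count.
    max_speakers = min(max(attendee_count or 10, 2), 30)
    params = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://mn-recordings/{media_key}"},
        "OutputBucketName": "mn-transcripts",
        "OutputKey": f"{meeting_id}.json",
        "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers},
    }
    if language_code:
        params["LanguageCode"] = language_code  # pinned single language
    else:
        params["IdentifyLanguage"] = True  # automatic language identification
    return params
```

In the Lambda this would be passed as `boto3.client("transcribe").start_transcription_job(**params)`; keeping the builder pure makes it easy to unit-test without AWS credentials.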
Bedrock
- Foundation models. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0 for normal meetings; anthropic.claude-sonnet-4-6-20250930-v1:0 via global.anthropic.claude-sonnet-4-6-20250930-v1:0 for meetings over the duration cutoff. Two callsites in notes: the summary pass and the action-item pass.
- Embeddings. Not used. Notes come from the meeting’s own transcript; there’s no corpus to search. If a future feature needs retrieval across past meetings, the path would be Amazon Titan Text Embeddings V2 (1024-dim) into Amazon S3 Vectors; that is not added here.
- Prompting. Both passes are given the flat transcript and instructed to use only it, to cite the verbatim line and timestamp behind every claim, and to emit needs_confirm rather than guess an owner or date. Pass 2 returns strict JSON validated against a schema before the grounding checks run.
- Quotas. Default account quotas are more than enough at SMB volume: two short calls per meeting.
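The cite-or-drop check that follows pass 2 might look roughly like this. The JSON shape is an assumption: the post does not publish the schema, so the action_items, task, owner, due_date, and cite_line keys are hypothetical, as is the flags output field.

```python
import json
from typing import Dict, List

NEEDS_CONFIRM = "needs_confirm"


def cite_or_drop(pass2_json: str, transcript_lines: List[str]) -> List[Dict]:
    """Keep only items whose cited line appears verbatim in the transcript."""
    known = {line.strip() for line in transcript_lines}
    kept = []
    for item in json.loads(pass2_json)["action_items"]:
        if str(item.get("cite_line", "")).strip() not in known:
            continue  # no verbatim citation: drop the item
        # Flag fields the model declined to guess so the organizer resolves them.
        flags = [
            field
            for field in ("owner", "due_date")
            if item.get(field) in (None, "", NEEDS_CONFIRM)
        ]
        kept.append({**item, "flags": flags})
    return kept
```

Doing this in plain Python rather than in a third model call keeps the check deterministic: an item either quotes a real line or it does not.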
EventBridge wiring
- mn-drive-sync: Scheduler, rate(5 minutes). Target: drive-sync Lambda.
- mn-transcribe-complete: rule on the default bus matching source: aws.transcribe, detail-type: Transcribe Job State Change, detail.TranscriptionJobStatus: COMPLETED with a name prefix of mn-. Target: notes Lambda.
- mn-weekly-digest: Scheduler, cron(0 8 ? * MON *) in TZ_NAME. Target: digest Lambda.
- S3 notifications: mn-recordings PUT → start-transcribe; mn-raw-mime PUT → mail-parser; mn-notes PUT → recap. Suffix filters keep sidecar and JSON writes from triggering the wrong function.
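The mn-transcribe-complete rule corresponds to an event pattern along these lines. The pattern uses EventBridge's standard exact-match and prefix-match syntax; the small matches helper is only a simplified local stand-in for checking the pattern in a unit test and does not reproduce EventBridge's full matching rules.

```python
import json

TRANSCRIBE_COMPLETE_PATTERN = {
    "source": ["aws.transcribe"],
    "detail-type": ["Transcribe Job State Change"],
    "detail": {
        "TranscriptionJobStatus": ["COMPLETED"],
        "TranscriptionJobName": [{"prefix": "mn-"}],
    },
}


def _value_matches(rule, actual) -> bool:
    """Match one pattern entry: a prefix rule or an exact value."""
    if isinstance(rule, dict) and "prefix" in rule:
        return isinstance(actual, str) and actual.startswith(rule["prefix"])
    return rule == actual


def matches(pattern: dict, event: dict) -> bool:
    """Simplified local check covering only exact and prefix matching."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif not any(_value_matches(rule, actual) for rule in expected):
            return False
    return True


# JSON form, as it would appear in a SAM or CloudFormation EventPattern.
PATTERN_JSON = json.dumps(TRANSCRIBE_COMPLETE_PATTERN, indent=2)
```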
SES outbound and inbound
- Set the MX record on a dedicated subdomain (e.g. notes.your-company.com) to inbound-smtp.ap-southeast-1.amazonaws.com for the forwarding lane.
- SES inbound rule set mn-inbound-rules: one rule with recipient notes@your-company.com → spam scan → S3 PUT to s3://mn-raw-mime/<message-id> → stop. The S3 PUT triggers mail-parser.
- SES outbound for the draft and the recap: verify a sender identity at notes@your-company.com with DKIM and SPF on the parent domain. Out of sandbox by request.
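Once SES has written the raw message to mn-raw-mime, the attachment path of mail-parser can be sketched with the standard library email package. The header-to-sidecar mapping here (Subject as title, From as organizer, To and Cc as attendees) is an assumption; the post only says the sidecar is derived from the email headers. The download-link path is not covered.

```python
from email import message_from_bytes, policy
from email.utils import getaddresses, parseaddr
from typing import Optional


def extract_media(raw_mime: bytes) -> Optional[dict]:
    """Return the first audio/video attachment plus a sidecar, or None."""
    msg = message_from_bytes(raw_mime, policy=policy.default)
    recipients = [str(v) for v in msg.get_all("To", []) + msg.get_all("Cc", [])]
    sidecar = {
        # Hypothetical mapping from headers to sidecar fields.
        "title": str(msg["Subject"] or ""),
        "organizer": parseaddr(str(msg["From"] or ""))[1],
        "attendees": [addr for _, addr in getaddresses(recipients) if addr],
    }
    for part in msg.iter_attachments():
        if part.get_content_maintype() in ("audio", "video"):
            return {
                "filename": part.get_filename(),
                "data": part.get_payload(decode=True),  # decoded bytes
                "sidecar": sidecar,
            }
    return None  # no media attachment found
```

Large recordings are held in memory in this form, which is one reason the function is given more memory than its neighbours.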
IAM (least privilege per Lambda)
Each Lambda has its own role with policies scoped to exact ARNs. Sketch:
- start-transcribe role: s3:GetObject on mn-recordings; transcribe:StartTranscriptionJob; s3:PutObject on mn-transcripts. No bedrock:*.
- notes role: s3:GetObject on mn-transcripts and mn-config-source; bedrock:InvokeModel on the Haiku and Sonnet ARNs; s3:PutObject on mn-notes; dynamodb:PutItem on mn-action-items.
- recap role: s3:GetObject on mn-notes; ses:SendRawEmail from the verified sender; secretsmanager:GetSecretValue on the token-signing secret.
- ack role: s3:GetObject on mn-notes; ses:SendRawEmail; dynamodb:PutItem on mn-runs and UpdateItem on mn-action-items; secretsmanager:GetSecretValue on the token-signing secret.
- drive-sync and presign roles: secretsmanager:GetSecretValue on the Google service-account secret; s3:PutObject on mn-recordings and mn-config-source; outbound network to www.googleapis.com. presign also needs s3:PutObject grantable via pre-signed URL on the recordings prefix.
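As an illustration of the scoping, the start-transcribe role's policy could be written as the document below. It is a sketch: the Sid values are made up, and the Transcribe statement uses a wildcard resource as a placeholder that should be tightened to match the exact-ARN rule stated above. The helper simply lists what the policy allows so a test can assert that no Bedrock action slips in.

```python
import json
from typing import List

START_TRANSCRIBE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadRecordings",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::mn-recordings/*"],
        },
        {
            "Sid": "StartJobs",
            "Effect": "Allow",
            "Action": ["transcribe:StartTranscriptionJob"],
            # Placeholder: narrow this in a real deployment.
            "Resource": ["*"],
        },
        {
            "Sid": "WriteTranscripts",
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": ["arn:aws:s3:::mn-transcripts/*"],
        },
    ],
}


def allowed_actions(policy: dict) -> List[str]:
    """List every action granted by Allow statements, sorted."""
    return sorted(
        action
        for statement in policy["Statement"]
        if statement["Effect"] == "Allow"
        for action in statement["Action"]
    )


POLICY_JSON = json.dumps(START_TRANSCRIBE_POLICY, indent=2)
```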
Approval flow and tokens
The draft email’s Approve/Edit/Discard links each carry a token: base64(meeting_id . action . expiry) with an HMAC signature using the secret in mn/token/signing. ack recomputes the HMAC and rejects any link whose signature doesn’t match or whose expiry has passed, so a forwarded email can’t be used to send a recap days later. Edit returns a minimal HTML form (no SPA, no build step) pre-filled with the needs-confirm items and the transcript line behind each. On submit, the form posts back to the same ack Function URL with the resolved owners and dates. Auto-send, when enabled for a meeting type, bypasses the token flow but only fires when the draft has zero needs-confirm flags.
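A minimal version of the token scheme, using only the standard library, could look like this. The choice of HMAC-SHA256, the URL-safe base64 encoding, the payload.signature layout, and the function names are assumptions; the post specifies only base64(meeting_id . action . expiry) with an HMAC signature.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional, Tuple


def sign_token(meeting_id: str, action: str, expiry: int, secret: bytes) -> str:
    """Return '<base64 payload>.<hex HMAC>' for one approval link."""
    raw = f"{meeting_id}.{action}.{expiry}".encode()
    payload = base64.urlsafe_b64encode(raw).decode()
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def verify_token(
    token: str, secret: bytes, now: Optional[float] = None
) -> Optional[Tuple[str, str]]:
    """Return (meeting_id, action) if the token is valid and unexpired."""
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return None
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison; compare bytes so odd input cannot raise.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    try:
        decoded = base64.urlsafe_b64decode(payload).decode()
        # Split from the right so a meeting id containing '.' survives.
        meeting_id, action, expiry_text = decoded.rsplit(".", 2)
        expiry = int(expiry_text)
    except ValueError:
        return None
    current = time.time() if now is None else now
    if current > expiry:
        return None
    return meeting_id, action
```

The signature is checked before the payload is decoded, so an unsigned or tampered link never reaches the parsing step.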
Observability and cost gates
- CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Metric filter on "error", "throttle", and "timeout" feeding a CloudWatch metric for alerting.
- Alarms: notes Lambda failures > 0 in a day; Transcribe job FAILED events > 0; ack token-verification failures > 5/hour (might mean the signing secret rotated).
- X-Ray: off by default. Not worth the cost at SMB volume.
- AWS Budgets: $30/month threshold, alarm at 80% and 100%, posts to SNS topic mn-cost-alarm subscribed to the admin’s email.
Config and secrets
The Google service-account credential for Drive lives in Secrets Manager under mn/drive/sa. The approval-token signing secret is mn/token/signing. The configured timezone, the long-meeting duration cutoff, the auto-send meeting-type list, the summary length, and the organizer-set address all live in Parameter Store under /mn/config/. The style doc and roster live in mn-config-source in S3, mirrored from Drive so a non-engineer can edit tone and people without a deploy. Lambdas fetch config on cold start and cache for the lifetime of the execution environment.
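The fetch-on-cold-start behaviour can be sketched with a module-level cache, which lives as long as the Lambda execution environment does. The function names are hypothetical, and the fetcher is injected so the caching logic can be tested without AWS; the Parameter Store fetcher shown imports boto3 lazily and assumes it is available in the runtime.

```python
from typing import Callable, Dict, Optional

_CONFIG_CACHE: Optional[Dict[str, str]] = None


def fetch_from_parameter_store(prefix: str = "/mn/config/") -> Dict[str, str]:
    """Read every parameter under the prefix (requires boto3 and AWS access)."""
    import boto3  # imported lazily so the cache logic is testable offline

    ssm = boto3.client("ssm")
    config: Dict[str, str] = {}
    paginator = ssm.get_paginator("get_parameters_by_path")
    for page in paginator.paginate(Path=prefix, Recursive=True, WithDecryption=True):
        for param in page["Parameters"]:
            # Store '/mn/config/timezone' under the short key 'timezone'.
            config[param["Name"][len(prefix):]] = param["Value"]
    return config


def get_config(
    fetch: Callable[[], Dict[str, str]] = fetch_from_parameter_store,
) -> Dict[str, str]:
    """Fetch once per execution environment, then serve from memory."""
    global _CONFIG_CACHE
    if _CONFIG_CACHE is None:
        _CONFIG_CACHE = fetch()
    return _CONFIG_CACHE
```

The trade-off is that a changed parameter is only picked up when a new execution environment starts, which matches the caching behaviour described above.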
Deploy
GitHub Actions with OIDC into a deploy role (no long-lived keys) and AWS SAM for the stack. The opinionated bits: deploy the SES rule set as a separate stack (rule-set changes affect mail flow), turn on S3 versioning for every bucket so a bad sync can be rolled back in one click, and keep the EventBridge rule that matches Transcribe completion tight with a mn- job-name prefix so it never fires on unrelated jobs in the same account. Total deployable surface: around eight Lambdas, two DDB tables, six S3 buckets, two EventBridge rules plus the Scheduler rules, one SES rule set, and one Budgets alarm.
That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.