Part 7 of 7 · FAQ builder series

Engineering reference: the FAQ builder architecture

Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the SES inbound rule set, S3 Vectors config, EventBridge Scheduler config, the DynamoDB schemas, and the Slack interactive flow. Read alongside the previous six posts; this one’s the build sheet.

Region and account shape

Default region: ap-southeast-1 (Singapore). SES inbound, S3 Vectors, EventBridge Scheduler, and Bedrock (Titan Text Embeddings V2, plus Claude Haiku 4.5 through cross-Region inference) are all available there. A second region for multi-region resilience isn’t worth the extra setup work at SMB volume: for an SMB, the practical consequence of a regional incident is a delayed FAQ proposal. One AWS account dedicated to the builder (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system.

Topology

AWS topology of the FAQ builder (diagram description). The diagram stacks three regions vertically inside one AWS account boundary.

Top region, ingress: three boxes show the three intake lanes. First, a Drive sync via the drive-sync Lambda, triggered every 15 minutes by EventBridge Scheduler, mirrors the FAQ doc, help docs, and chat transcripts to s3://fb-docs-source/. Second, an SES inbound rule set with action S3 PUT to s3://fb-raw-mime/ feeds the intake Lambda, which reads the question, cleans it, and embeds it with Titan Text Embeddings V2. Third, a manual lane lets a rep submit a question through the ack-handler Function URL, which goes straight to cleaning and embedding.

Middle region, scheduled processing: the grouper Lambda is triggered daily by EventBridge Scheduler. It reads the new question vectors, queries S3 Vectors for nearest neighbors, joins or starts clusters, applies the repeat threshold from s3://fb-rules-source/rules.txt, and writes cluster state to DynamoDB fb-clusters. For each candidate it invokes the drafter Lambda, which queries S3 Vectors over the help-doc chunks, calls Bedrock Claude Haiku 4.5 to draft a grounded answer with a citation, and writes the proposed entry to DynamoDB fb-proposals.

Bottom region, review and publish: the dispatch path posts each proposal to Slack via chat.postMessage with Approve, Edit, and Reject buttons. Button clicks land on a Function URL Lambda, ack-handler, that on approve or edit writes the entry to the live FAQ Google Doc via the Docs API, marks the cluster published in fb-clusters, and writes a row to fb-audit.

CloudWatch Logs collects from every Lambda at 7-day retention. Across the right edge, a small box labelled AWS Budgets alarm at $15 monthly threshold posts to SNS topic fb-cost-alarm. A note at the bottom reads: every answer is grounded and cited, and every interaction is logged to fb-audit.
Fig 7. AWS topology, in three regions of the diagram: ingress (three lanes into the question pile), scheduled processing (the daily grouper flagging candidates and the drafter writing grounded answers), review and publish (the proposal ships to Slack and the reviewer’s decision is recorded and published). Every Lambda is event- or schedule-driven; nothing is synchronous-chained.

Lambda functions

All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.

  • drive-sync — EventBridge Scheduler target, fires every 15 minutes. Uses the Google Drive API + Docs API (service-account credentials in Secrets Manager under fb/drive/sa) to export the live FAQ doc, the help docs, and any new chat transcripts and write them to s3://fb-docs-source/ only if changed since the last sync. The same pattern syncs the rules and voice docs to s3://fb-rules-source/. On a help-doc change, it invokes doc-embed to re-chunk and re-embed the changed doc. Memory: 256 MB. Timeout: 30 s.
  • intake — S3 PUT trigger on s3://fb-raw-mime/ (support email) and on new transcripts under s3://fb-docs-source/chat/. Parses MIME or transcript, extracts the customer’s question(s), strips signatures/greetings and PII (regex plus a small allowlist), then calls Bedrock Titan Text Embeddings V2 (amazon.titan-embed-text-v2:0, 1024-dim) to embed each cleaned question and writes the vector to the fb-questions S3 Vectors index with metadata in fb-clusters. The manual lane reaches the same cleaning/embedding code path via ack-handler. Memory: 512 MB. Timeout: 60 s.
  • doc-embed — triggered by drive-sync when a help doc changes. Chunks the doc into short passages (heading-aware, ~500 tokens), embeds each with Titan V2, and upserts to the fb-helpdocs S3 Vectors index keyed by (doc_id, chunk_id) with the source section as metadata. Old chunks for that doc are deleted first so retrieval never returns stale passages. Memory: 512 MB. Timeout: 120 s.
  • grouper — EventBridge Scheduler target, daily at 6pm local (the schedule runs in TZ_NAME, e.g. Asia/Singapore). Reads the questions added since the last pass, queries fb-questions S3 Vectors for nearest neighbors, joins or starts clusters by the join_threshold distance, increments times-seen counters over the configured window, and applies min_asks_for_candidate from the rules doc. Writes cluster state to fb-clusters and invokes drafter per candidate. No Bedrock calls — the embedding already happened at intake. Memory: 512 MB. Timeout: 120 s.
  • drafter — invoked by grouper per candidate (or refresh). Embeds the cluster’s representative question, queries fb-helpdocs S3 Vectors for the top k passages, calls Bedrock Claude Haiku 4.5 (anthropic.claude-haiku-4-5-20251001-v1:0 via global.anthropic.claude-haiku-4-5-20251001-v1:0) with a grounded system prompt (answer from passages only; cite source; return NOT COVERED otherwise). Validates the citation maps to a retrieved passage; on failure routes the cluster to the human-write queue. Formats to the voice doc and writes the proposal to fb-proposals. For ambiguous, multi-part clusters where Haiku’s grounding is weak, it can escalate a single retry to Claude Sonnet 4.6 (global.anthropic.claude-sonnet-4-6-20250930-v1:0) — justified only because that retry is rare and a wrong public answer is expensive. Memory: 512 MB. Timeout: 60 s.
  • dispatch — triggered when a proposal is written. Posts the proposal to the configured Slack channel via chat.postMessage with Block Kit Approve/Edit/Reject buttons and the source link. Writes nothing to the FAQ; it only surfaces the proposal for review. Memory: 256 MB. Timeout: 30 s.
  • ack-handler — Lambda Function URL, public with AuthType: NONE; verifies a Slack signature on the request body. Triggered by Slack button clicks (Approve/Edit/Reject), by the Edit modal submission, and by the manual-lane question form. On approve or edit, writes the entry to the live FAQ Google Doc via the Docs API, marks the cluster published in fb-clusters, and adds it to the covered set. On reject, records the reason and marks the cluster rejected. Every action writes to fb-audit. Memory: 256 MB. Timeout: 15 s.
  • digest — EventBridge Scheduler target, weekly Friday 5pm. Reads fb-proposals and fb-clusters for the past week; posts a Slack summary of what published, what’s waiting in review, and which clusters hit the human-write queue (gaps in the docs). No Bedrock; a plain summary. Memory: 256 MB.
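
The grouper’s join-or-start step can be sketched in a few lines of plain Python. This is a minimal illustration, not the production code: it assumes cosine distance, uses a brute-force loop in place of the S3 Vectors nearest-neighbor query, keeps the first question’s vector as the cluster centroid, and ignores the count window. The Cluster class and the assign function are hypothetical names; join_threshold and min_asks_for_candidate are the rule names used above.

```python
import math
from dataclasses import dataclass


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


@dataclass
class Cluster:
    cluster_id: str
    centroid: list          # simplification: the first question's vector
    seen_count: int = 1
    state: str = "warm"


def assign(question_vec, clusters, join_threshold, min_asks_for_candidate):
    """Join the nearest cluster within join_threshold, else start a new one."""
    nearest = min(
        clusters,
        key=lambda c: cosine_distance(question_vec, c.centroid),
        default=None,
    )
    if nearest is not None and cosine_distance(question_vec, nearest.centroid) <= join_threshold:
        nearest.seen_count += 1
        target = nearest
    else:
        target = Cluster(cluster_id=f"c{len(clusters) + 1}", centroid=list(question_vec))
        clusters.append(target)
    # Promote a warm cluster once it has been asked often enough.
    if target.state == "warm" and target.seen_count >= min_asks_for_candidate:
        target.state = "candidate"
    return target
```

A cluster that reaches the candidate state here is what the grouper would hand to drafter.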

Storage

  • S3 Vectors · fb-questions — one vector per cleaned question. 1024-dim (Titan V2). Metadata: cluster_id, source (email/chat/manual), first_seen, seen_count. Used by the grouper’s nearest-neighbor search.
  • S3 Vectors · fb-helpdocs — one vector per help-doc chunk. 1024-dim. Metadata: doc_id, chunk_id, section, source_url. Used by the drafter’s retrieval. Re-embedded on doc change.
  • DynamoDB · fb-clusters — one row per cluster. PK cluster_id; attributes: centroid_ref, seen_count, window_counts, state (warm/candidate/published/rejected), covered_entry_id, priority. On-demand.
  • DynamoDB · fb-proposals — one row per drafted proposal. PK (cluster_id, draft_ts); attributes: question, answer, source_ref, model, status (pending/approved/edited/rejected). On-demand.
  • DynamoDB · fb-audit — one row per write action of any kind. PK (cluster_id, ts); attributes: action, by_user, before, after. On-demand. No TTL — this is the long-term audit trail.
  • S3 · fb-docs-source — mirrored FAQ doc, help docs, and chat transcripts. Versioning enabled. Lifecycle to Glacier at 90 days for transcripts; FAQ and help docs kept hot.
  • S3 · fb-rules-source — mirrored rules and voice docs as plain text. Versioning enabled.
  • S3 · fb-raw-mime — raw inbound MIME from the support inbox. Lifecycle to Glacier at 30 days; expiry at 1 year (questions are already extracted into the pile; the raw MIME is only kept for audit).
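
As an illustration of the fb-audit schema, here is a small sketch of how a row could be built before it is written. The attribute names come from the list above; the audit_item helper, the ISO 8601 UTC timestamp format, and the sample values are assumptions made for the example.

```python
from datetime import datetime, timezone


def audit_item(cluster_id, action, by_user, before, after, now=None):
    """Build one fb-audit row; (cluster_id, ts) is the composite key."""
    now = now or datetime.now(timezone.utc)
    return {
        "cluster_id": cluster_id,   # partition key
        "ts": now.isoformat(),      # sort key (assumed format: ISO 8601, UTC)
        "action": action,
        "by_user": by_user,
        "before": before,
        "after": after,
    }


# Hypothetical usage inside ack-handler, with a boto3 Table resource:
#   table.put_item(Item=audit_item("c42", "approve", "U123", None, {"status": "approved"}))
```

ISO 8601 timestamps in a single timezone sort lexicographically, so a query on cluster_id returns the history in order.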

Bedrock

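  • Embeddings. amazon.titan-embed-text-v2:0 (Amazon Titan Text Embeddings V2), 1024-dim. Main callsites: intake for the question vectors and doc-embed for the help-doc vectors; ack-handler (manual-lane questions) and drafter (the retrieval query) also embed through the same model. This is the dominant Bedrock cost: one embed per incoming question.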
  • Drafting model. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0. One callsite: drafter, on candidate and refresh clusters only.
  • Escalation model. global.anthropic.claude-sonnet-4-6-20250930-v1:0 (Claude Sonnet 4.6), used only on a rare retry when Haiku’s grounded draft fails validation on a multi-part question. Capped at one retry per cluster per pass.
  • Quotas. Default account quotas are more than enough at SMB volume. The grouper itself doesn’t call Bedrock; intake makes one embedding call per incoming question, and the drafter fires a few times a week.
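
The validate-then-escalate rule for the drafting and escalation models can be sketched as below. This is an assumption-laden outline rather than the real drafter: call_haiku and call_sonnet are hypothetical callables that would wrap the Bedrock calls and return an (answer, cited_passage_id) pair, the multi_part flag stands in for however the cluster is judged ambiguous, and passages are assumed to be dicts with an "id" key.

```python
NOT_COVERED = "NOT COVERED"


def is_grounded(answer, cited, passages):
    """A draft passes only if it answers and cites a passage that was retrieved."""
    return answer.strip() != NOT_COVERED and cited in {p["id"] for p in passages}


def draft_with_escalation(question, passages, call_haiku, call_sonnet, multi_part=False):
    """Return a proposal dict, or None to route the cluster to the human-write queue."""
    answer, cited = call_haiku(question, passages)
    if is_grounded(answer, cited, passages):
        return {"answer": answer, "source_ref": cited, "model": "haiku"}
    # A refusal is final; only a failed citation on a multi-part question retries.
    if answer.strip() == NOT_COVERED or not multi_part:
        return None
    answer, cited = call_sonnet(question, passages)   # the single capped retry
    if is_grounded(answer, cited, passages):
        return {"answer": answer, "source_ref": cited, "model": "sonnet"}
    return None
```

Because the function calls call_sonnet at most once, the one-retry-per-cluster cap holds as long as it runs once per cluster per pass.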

EventBridge Scheduler config

  • fb-daily-group: cron(0 18 * * ? *) in the SMB’s timezone. Target: grouper Lambda.
  • fb-drive-sync: rate(15 minutes). Target: drive-sync Lambda.
  • fb-weekly-digest: cron(0 17 ? * FRI *) in TZ. Target: digest Lambda.
  • Re-embed on demand: drive-sync invokes doc-embed directly when a help doc changes; no standing schedule needed.
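
For readers who create these schedules from code rather than a template, the sketch below builds the keyword arguments for the boto3 EventBridge Scheduler create_schedule call. The account ID, the Lambda ARN, and the fb-scheduler-invoke role name are placeholders invented for the example; the schedule name and expression come from the list above.

```python
def schedule_kwargs(name, expression, target_arn, role_arn, tz=None):
    """Keyword arguments for boto3's scheduler create_schedule call."""
    kwargs = {
        "Name": name,
        "ScheduleExpression": expression,
        "FlexibleTimeWindow": {"Mode": "OFF"},   # fire at the exact time
        "Target": {"Arn": target_arn, "RoleArn": role_arn},
    }
    if tz:
        # Without this the expression is evaluated in UTC.
        kwargs["ScheduleExpressionTimezone"] = tz
    return kwargs


# Placeholder identifiers, not values from the real deployment.
ACCOUNT = "123456789012"
GROUPER_ARN = f"arn:aws:lambda:ap-southeast-1:{ACCOUNT}:function:grouper"
ROLE_ARN = f"arn:aws:iam::{ACCOUNT}:role/fb-scheduler-invoke"

daily = schedule_kwargs(
    "fb-daily-group", "cron(0 18 * * ? *)", GROUPER_ARN, ROLE_ARN, tz="Asia/Singapore"
)
# boto3.client("scheduler").create_schedule(**daily)
```

Passing the timezone explicitly on every cron schedule is what keeps the daily pass at 6pm local instead of 6pm UTC.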

SES inbound

  • Set the MX record on a dedicated subdomain (e.g. questions.your-company.com) to inbound-smtp.ap-southeast-1.amazonaws.com.
  • SES inbound rule set fb-inbound-rules: one rule with recipient questions@your-company.com → spam scan → S3 PUT to s3://fb-raw-mime/<message-id> → stop. The S3 PUT triggers intake.
  • No SES outbound identity is required — the builder delivers proposals to Slack, not email. If you want email proposals, verify a sender and add an ses:SendRawEmail path to dispatch.
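
Once SES has written the raw message to fb-raw-mime, intake has to pull readable text out of it. The sketch below shows one way to do that with Python’s standard email package. The extract_text helper is a hypothetical name, and the real function also strips signatures, greetings, and PII, which this example leaves out.

```python
from email import message_from_bytes, policy


def extract_text(raw_mime: bytes) -> dict:
    """Return the subject and plain-text body of a raw inbound message."""
    msg = message_from_bytes(raw_mime, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))   # prefer the text/plain part
    text = body.get_content() if body is not None else ""
    return {"subject": str(msg["subject"] or ""), "text": text.strip()}


sample = (
    b"From: customer@example.com\r\n"
    b"To: questions@your-company.com\r\n"
    b"Subject: Refund window\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"How long do refunds take?\r\n"
)
parsed = extract_text(sample)
```

HTML-only messages return an empty text here; handling them would need an extra conversion step.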

IAM (least privilege per Lambda)

Each Lambda has its own role with policies scoped to exact ARNs. Sketch:

  • intake role: s3:GetObject on fb-raw-mime and the chat prefix; bedrock:InvokeModel on the Titan ARN; s3vectors:PutVectors on fb-questions; dynamodb:PutItem on fb-clusters. No access to the drafting model.
  • grouper role: s3vectors:QueryVectors on fb-questions; s3:GetObject on the rules doc; dynamodb:Query + UpdateItem on fb-clusters; lambda:InvokeFunction on drafter. No bedrock:*.
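  • drafter role: s3vectors:QueryVectors on fb-helpdocs; bedrock:InvokeModel on the Titan ARN (for the retrieval-query embed) and on the Haiku and Sonnet inference profiles plus the foundation-model ARNs they route to; s3:GetObject on the voice doc; dynamodb:PutItem on fb-proposals.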
  • ack-handler role: dynamodb:PutItem on fb-clusters, fb-proposals, fb-audit; secretsmanager:GetSecretValue on the Docs-API service-account secret and the Slack signing secret; outbound network access to docs.googleapis.com and slack.com; bedrock:InvokeModel on the Titan ARN for the manual lane’s embed; s3vectors:PutVectors on fb-questions.
  • dispatch role: secretsmanager:GetSecretValue on the Slack bot token; outbound network access to slack.com; dynamodb:GetItem on fb-proposals.
  • drive-sync / doc-embed roles: secretsmanager:GetSecretValue on the Google service-account secret; s3:PutObject on the docs and rules buckets; lambda:InvokeFunction on doc-embed (drive-sync only); s3:GetObject on the docs bucket, bedrock:InvokeModel on Titan, and s3vectors:PutVectors/DeleteVectors on fb-helpdocs (doc-embed only); outbound network to www.googleapis.com.
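
To make the scoping concrete, here is the grouper role from the list above written out as a policy document. Treat it as a sketch: the account ID and the fb-vectors vector bucket name are placeholders, and the S3 Vectors ARN layout is written from my understanding of the service, so check it against the current documentation before use.

```python
import json

REGION = "ap-southeast-1"
ACCOUNT = "123456789012"       # placeholder account ID
VECTOR_BUCKET = "fb-vectors"   # hypothetical vector bucket name


def grouper_policy():
    """Least-privilege policy for grouper: one statement per resource."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3vectors:QueryVectors",
             "Resource": f"arn:aws:s3vectors:{REGION}:{ACCOUNT}:bucket/{VECTOR_BUCKET}/index/fb-questions"},
            {"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": "arn:aws:s3:::fb-rules-source/rules.txt"},
            {"Effect": "Allow", "Action": ["dynamodb:Query", "dynamodb:UpdateItem"],
             "Resource": f"arn:aws:dynamodb:{REGION}:{ACCOUNT}:table/fb-clusters"},
            {"Effect": "Allow", "Action": "lambda:InvokeFunction",
             "Resource": f"arn:aws:lambda:{REGION}:{ACCOUNT}:function:drafter"},
        ],
    }


policy_json = json.dumps(grouper_policy(), indent=2)
```

If the grouper’s queries return metadata or use a filter, S3 Vectors may also require s3vectors:GetVectors on the same index; that is worth verifying before locking the role down.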

Slack interactive flow

Proposals are posted via the chat.postMessage Web API with Block Kit blocks containing the Approve/Edit/Reject buttons (the incoming webhook can’t carry interactive components). Button clicks are sent by Slack to the configured Interactivity request URL, which is the ack-handler Function URL. ack-handler verifies the Slack signing secret on the inbound request, parses the action_id (approve, edit, reject), opens a modal for Edit (pre-filled draft) and Reject (reason), and processes the response on modal submission. The manual-lane question form is a separate Block Kit modal opened from a slash command, submitting to the same Function URL.
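
The Block Kit payload for one proposal might look like the sketch below. The action_id values match the ones ack-handler parses; the proposal_blocks helper, the message layout, and carrying cluster_id in each button’s value field are assumptions for illustration.

```python
def proposal_blocks(cluster_id, question, answer, source_url):
    """Block Kit blocks for one proposal: the draft plus three action buttons."""

    def button(label, action_id, style=None):
        element = {
            "type": "button",
            "text": {"type": "plain_text", "text": label},
            "action_id": action_id,
            "value": cluster_id,   # echoed back to ack-handler on click
        }
        if style:
            element["style"] = style
        return element

    return [
        {"type": "section",
         "text": {"type": "mrkdwn",
                  "text": f"*Q:* {question}\n*A:* {answer}\n<{source_url}|Source>"}},
        {"type": "actions",
         "elements": [
             button("Approve", "approve", "primary"),
             button("Edit", "edit"),
             button("Reject", "reject", "danger"),
         ]},
    ]


# Hypothetical call from dispatch using the slack_sdk WebClient:
#   client.chat_postMessage(channel=channel_id, text=question, blocks=proposal_blocks(...))
```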

The Slack app needs the chat:write scope, the commands scope for the manual-lane slash command, and the Interactivity request URL configured. The bot token lives in Secrets Manager under fb/slack/bot-token; the signing secret under fb/slack/signing-secret.
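
Because the Function URL is public, the signature check is the only gate in front of ack-handler. A minimal sketch of Slack’s v0 signing scheme follows: Slack sends the signature in the X-Slack-Signature header and the timestamp in X-Slack-Request-Timestamp. The function name and the five-minute replay window are choices made for this example.

```python
import hashlib
import hmac
import time


def verify_slack_signature(signing_secret, timestamp, body, signature, now=None, max_skew=300):
    """Check a Slack request signature; body is the raw, unparsed request body."""
    now = time.time() if now is None else now
    try:
        if abs(now - int(timestamp)) > max_skew:   # reject stale or replayed requests
            return False
    except (TypeError, ValueError):
        return False
    base = f"v0:{timestamp}:{body}".encode()
    expected = "v0=" + hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The check must run on the exact bytes Slack sent. Function URLs can deliver the body base64-encoded, so decode it before verifying, and verify before parsing.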

Observability and cost gates

  • CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Metric filters on "error", "throttle", and "timeout" turn matching log lines into CloudWatch metrics for alerting.
  • Alarms: grouper Lambda failures > 0 in a day (the daily pass is the one piece that has to run); drafter citation-validation failure rate > 30% in 24h (might mean the help docs drifted from how customers phrase questions); ack-handler signature-verification failures > 5/hour (might mean the Slack secret rotated).
  • X-Ray: off by default. Not worth the cost at SMB volume.
  • AWS Budgets: $15/month threshold, alarm at 80% and 100%, posts to SNS topic fb-cost-alarm subscribed to the on-call admin’s email and Slack.
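
Structured JSON logging needs very little code. The helper below is a sketch with hypothetical field names (ts, level, event); anything a Lambda prints to stdout ends up in its CloudWatch Logs group.

```python
import json
import time


def log_event(level, event, **fields):
    """Print one JSON log line and return it."""
    line = json.dumps(
        {"ts": round(time.time(), 3), "level": level, "event": event, **fields},
        default=str,   # stringify anything json can't serialize natively
    )
    print(line)
    return line


log_event("error", "citation_validation_failed", cluster_id="c42", model="haiku")
```

Keeping level and event as fixed top-level keys makes it easy for a log filter to match on them.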

Config and secrets

Service-account credentials for Drive and Docs APIs live in Secrets Manager under fb/drive/sa (one service account scoped to both). Slack bot token and signing secret under fb/slack/*. The join_threshold, min_asks_for_candidate, the count window, retrieval k, the configured timezone, and the review Slack channel all live in Parameter Store under /fb/config/. Lambdas fetch config on cold start and cache for the lifetime of the execution environment.
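
The fetch-on-cold-start pattern can be sketched with a module-level cache. Here get_config and its fetch argument are hypothetical: fetch stands in for whatever reads /fb/config/ from Parameter Store (for example via the SSM get_parameters_by_path call), and the retrieval_k and timezone key names are assumed.

```python
_CACHE = None   # module scope survives across warm invocations


def get_config(fetch):
    """Load and parse config once per execution environment."""
    global _CACHE
    if _CACHE is None:
        raw = fetch()   # Parameter Store values arrive as strings
        _CACHE = {
            "join_threshold": float(raw["join_threshold"]),
            "min_asks_for_candidate": int(raw["min_asks_for_candidate"]),
            "retrieval_k": int(raw["retrieval_k"]),
            "timezone": raw["timezone"],
        }
    return _CACHE
```

The trade-off is that a changed parameter only takes effect when Lambda starts a fresh execution environment, which is usually acceptable for thresholds that change rarely.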

Deploy

GitHub Actions with OIDC into a deploy role (no long-lived keys) and AWS SAM for the stack. The opinionated bits: deploy the SES rule set as a separate stack (rule-set changes affect mail flow), turn on S3 versioning for fb-docs-source and fb-rules-source so a bad Drive edit can be rolled back in one click, and version the EventBridge Scheduler timezone setting so you don’t accidentally start running the daily pass in UTC after a CI rotation. Keep the two S3 Vectors indexes (fb-questions, fb-helpdocs) in their own stack so a re-index never risks the rest. Total deployable surface: eight Lambdas, three DDB tables, two S3 Vectors indexes, three S3 buckets, three EventBridge Scheduler schedules, one SES rule set, and one Budgets alarm.

That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.
