Part 7 of 7 · Content moderator series ~8 min read

Engineering reference: the content moderator architecture

Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the SQS review queue, the EventBridge config, the DynamoDB schemas, and the Slack interactive flow. Read alongside the previous six posts; this one’s the build sheet.

Region and account shape

Default region: ap-southeast-1 (Singapore). Lambda Function URLs, Bedrock Global cross-Region inference, SQS, and EventBridge are all available there. A second region for resilience isn’t worth the extra setup at SMB volume — the failure mode for an SMB is a flagged comment waiting an extra hour for review, not a regional outage. One AWS account dedicated to the moderator (separate from your other workloads) keeps the IAM blast radius small and lets one AWS Budgets alarm cover the whole system.

Topology

AWS topology of the content moderator

A topology diagram with three regions stacked vertically inside one AWS account boundary.

Top region: ingress. Three boxes show the intake. A webhook-in Lambda exposed as a Function URL verifies the platform signature on each inbound comment, review, or post and writes the raw payload to s3://cm-raw/; an intake Lambda triggered by that S3 PUT cleans and normalizes the text, writes one record to DynamoDB cm-items, and runs the deterministic rule pass against the house-rules lists mirrored to S3; a drive-sync Lambda triggered every 15 minutes by EventBridge Scheduler mirrors the rules and voice docs from Google Drive to s3://cm-rules-source/.

Middle region: checking. The checker Lambda is invoked for borderline items only; it reads s3://cm-rules-source/rules.txt plus the worked-examples set from DynamoDB, calls Bedrock Haiku 4.5 for a verdict with confidence and a cited rule, and emits one of the sending events to the EventBridge default bus per item that needs an action: cm.hold, cm.send_to_human, or cm.hold_notify. Pass items emit nothing and publish.

Bottom region: review and decision. The dispatch Lambda is triggered by an EventBridge rule on those events; it resolves the reviewer, checks quiet hours and groups repeat offenders, builds the review card from s3://cm-rules-source/voice.txt, and puts it in front of a moderator via Slack Block Kit or SES outbound email, writing the card to the SQS cm-review-queue with a dead-letter queue behind it. Slack button clicks land on a Function URL Lambda action-handler that publishes, removes, or edits the item via the platform API and writes the decision to cm-audit, appending overturns to the worked-examples set.

CloudWatch Logs collects from every Lambda at 7-day retention. Across the right edge: a small box labelled AWS Budgets alarm at $25 monthly threshold, posting to SNS topic cm-cost-alarm. A note at the bottom: nothing is auto-deleted, and every decision is logged to cm-audit.
Fig 7. AWS topology, in three regions of the diagram: ingress (webhook in, clean, rule pass), checking (the borderline middle gets a model verdict and emits events), review and decision (the card reaches a moderator and the human’s call is recorded). Every Lambda is event- or schedule-driven; nothing is synchronous-chained.

Lambda functions

All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.

  • webhook-in — Lambda Function URL, AuthType: NONE; verifies each platform’s HMAC signature (secrets per platform in Secrets Manager under cm/webhook/<platform>) before doing anything. Writes the raw payload to s3://cm-raw/<item-id>.json and returns 200 fast so the platform doesn’t retry. All real work is deferred to the S3 PUT trigger. Memory: 256 MB. Timeout: 10 s.
  • intake — S3 PUT trigger on s3://cm-raw/. Strips HTML, normalizes text, extracts author/links/length/area, and upserts one record to cm-items keyed by item_id (idempotent on platform retries). Runs the deterministic rule pass against the allow list, banned-word list, and blocked-domain list loaded from s3://cm-rules-source/. On pass it marks the item published; on hold it emits cm.hold; on borderline it async-invokes checker. Memory: 256 MB. Timeout: 30 s. No Bedrock calls.
  • checker — async-invoked by intake for borderline items only. Reads rules.txt from s3://cm-rules-source/ and the worked-examples set for the area from cm-examples. Calls Bedrock Haiku 4.5 (anthropic.claude-haiku-4-5-20251001-v1:0 via global.anthropic.claude-haiku-4-5-20251001-v1:0) with a strict JSON-only contract: {verdict, confidence, rule}. Applies the per-area confidence threshold from the rules doc, then emits cm.hold, cm.send_to_human, or cm.hold_notify — or marks the item published on a confident pass. Memory: 512 MB. Timeout: 30 s.
  • dispatch — EventBridge rule on the three move events. Resolves the reviewer (per-area, then admin fallback), checks quiet hours, groups repeat offenders by author, formats the card from voice.txt, and sends via the Slack chat.postMessage Web API (Block Kit) or SES SendRawEmail. Enqueues the card reference to the cm-review-queue SQS queue and writes a cm-queue row so a re-drive won’t double-send. A cm.hold_notify bypasses quiet-hours batching. Memory: 256 MB. Timeout: 30 s.
  • action-handler — Lambda Function URL, public with AuthType: NONE; verifies the Slack signing secret on the request body. Triggered by Slack interactive button clicks (Publish/Remove/Edit) and by email-link clicks. Calls the originating platform’s API to publish, remove, or post an edited version; writes the decision to cm-audit; on an overturn, appends a worked example to cm-examples; clears the cm-queue entry. Memory: 256 MB. Timeout: 15 s.
  • drive-sync — EventBridge Scheduler target, every 15 minutes. Uses the Google Drive API (service-account credentials in Secrets Manager under cm/drive/sa) to export the rules and voice docs as plain text and write to s3://cm-rules-source/ only if they changed since the last sync. Memory: 256 MB. Timeout: 30 s.
  • digest — EventBridge Scheduler target, weekly Monday 9am in TZ_NAME. Reads the past week’s cm-audit and cm-queue; calls Bedrock Haiku 4.5 once to write a short narrative summarizing holds, removals, overturns, and any dead-letter items; emails it via SES to the configured stakeholder list and posts a summary to a configured Slack channel. Memory: 512 MB. Timeout: 60 s.
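
The deterministic rule pass that intake runs can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the series' actual code: the function name rule_pass, the three return labels, and the decision order (an allow-listed author passes, a banned word or blocked domain holds, a link to a domain on neither list makes the item borderline) are illustrative.

```python
import re
from urllib.parse import urlparse

WORD_RE = re.compile(r"[a-z0-9']+")


def rule_pass(author, text, links, allow_authors, banned_words, blocked_domains):
    """Return 'pass', 'hold', or 'borderline' without calling a model."""
    # Assumption: allow-listed authors skip every other check.
    if author in allow_authors:
        return "pass"

    # Whole-word match, so a banned word does not fire inside a longer word.
    words = set(WORD_RE.findall(text.lower()))
    if words & banned_words:
        return "hold"

    # Compare each link's host, and its parent domains, with the block list.
    unknown_link = False
    for link in links:
        host = (urlparse(link).hostname or "").lower()
        parts = host.split(".")
        candidates = {".".join(parts[i:]) for i in range(len(parts))}
        if candidates & blocked_domains:
            return "hold"
        unknown_link = True

    # Assumption: a link to a domain on neither list is what makes an item borderline.
    return "borderline" if unknown_link else "pass"
```

Only the "borderline" result would reach checker, which is how the rule pass keeps most items off Bedrock.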

Storage

  • DynamoDB · cm-items — one row per item. PK item_id; attributes: platform, area, author, text, links, rule_pass, state (published/held/removed/edited), raw_s3_key. On-demand.
  • DynamoDB · cm-queue — one row per dispatched card. PK item_id; sort key card_id; attributes: reviewer, sent_via (slack/email), verdict, rule, group_key. On-demand. Marks that a card was sent so re-drives don’t duplicate.
  • DynamoDB · cm-audit — one row per write action of any kind. PK item_id; sort key ts; attributes: action (publish/remove/edit), by_user, rule, before, after. On-demand. No TTL: this is the long-term audit trail.
  • DynamoDB · cm-examples — curated worked examples from moderator overturns. PK area; sort key ts; attributes: text, system_verdict, human_decision, rule. Capped per area (most recent N) by a small compaction step in digest. On-demand.
  • S3 · cm-raw — raw inbound webhook payloads. Versioning enabled. Lifecycle to Glacier at 30 days; expiry at 2 years.
  • S3 · cm-rules-source — mirrored rules and voice docs as plain text. Versioning enabled, so a bad Drive edit rolls back in one click.
  • S3 · cm-originals — originals of edited items, kept so any edit-and-publish is reversible and auditable.
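
The four table layouts above can be written down as key definitions. The sketch below builds dictionaries in the shape that boto3's create_table accepts; the helper name table_spec is hypothetical, and storing every key attribute (including ts) as a string is an assumption, since the post does not give attribute types.

```python
def table_spec(name, partition_key, sort_key=None):
    """Build keyword arguments in the shape boto3's create_table expects."""
    keys = [(partition_key, "HASH")]
    if sort_key:
        keys.append((sort_key, "RANGE"))
    return {
        "TableName": name,
        "KeySchema": [{"AttributeName": k, "KeyType": t} for k, t in keys],
        # Assumption: every key attribute is stored as a string ("S").
        "AttributeDefinitions": [
            {"AttributeName": k, "AttributeType": "S"} for k, _ in keys
        ],
        "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity
    }


TABLES = [
    table_spec("cm-items", "item_id"),
    table_spec("cm-queue", "item_id", "card_id"),
    table_spec("cm-audit", "item_id", "ts"),
    table_spec("cm-examples", "area", "ts"),
]
```

In a SAM deployment the same keys would normally be declared in the template; the dictionaries here only make the partition and sort keys explicit.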

Bedrock

  • Foundation model. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0. Two callsites: checker for the borderline verdict, and digest for the weekly narrative. A heavier reasoning path on Claude Sonnet 4.6 isn’t justified here — the verdict is a short, well-scoped classification, and Haiku 4.5 with worked examples handles it cheaply.
  • Embeddings. Not used. The house rules are a short doc fed straight into the prompt; deterministic lists plus a few worked examples beat vector retrieval at this scale. No Knowledge Base, no S3 Vectors. (If a customer’s rules ever grew past a single prompt, Amazon Titan Text Embeddings V2 at 1024 dimensions into Amazon S3 Vectors would be the path — not needed at SMB volume.)
  • Quotas. Default account quotas are more than enough at SMB volume. The rule pass keeps most items off Bedrock entirely.
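
The JSON-only contract for the checker callsite is easiest to enforce by validating the reply before acting on it. The following is a sketch under assumptions: the verdict labels "pass" and "violation", the function names, and the routing rule (low confidence or a malformed reply goes to a human, a confident pass publishes, a confident violation holds) are illustrative. When cm.hold_notify applies is not specified here, so the sketch leaves it out.

```python
import json

ALLOWED_VERDICTS = {"pass", "violation"}  # assumption: the real labels may differ


def parse_verdict(raw):
    """Validate the model's reply against the {verdict, confidence, rule} contract."""
    data = json.loads(raw)  # raises ValueError on anything that is not JSON
    if not isinstance(data, dict) or set(data) != {"verdict", "confidence", "rule"}:
        raise ValueError("reply must have exactly verdict, confidence, rule")
    verdict = data["verdict"]
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        raise ValueError("unknown verdict")
    conf = data["confidence"]
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        raise ValueError("confidence must be a number in [0, 1]")
    return data


def route(raw, threshold):
    """Return the event name to emit, or None when the item can publish."""
    try:
        verdict = parse_verdict(raw)
    except ValueError:
        return "cm.send_to_human"  # assumption: malformed replies go to a person
    if verdict["confidence"] < threshold:
        return "cm.send_to_human"
    return None if verdict["verdict"] == "pass" else "cm.hold"
```

Treating an unparseable reply as "send to a human" keeps a model formatting slip from publishing or holding anything on its own.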

SQS review queue

  • cm-review-queue — standard SQS queue holding card references awaiting a moderator. Visibility timeout 5 min; the dispatch path is the producer, the Slack/SES send is the consumer side.
  • cm-review-dlq — dead-letter queue, maxReceiveCount: 5. Anything that fails to send repeatedly lands here instead of looping or vanishing; the weekly digest reads and reports DLQ depth.
  • Grouping — dispatch computes a group_key of (author, rule) and folds new items for an existing open key into one card, so a burst of identical spam is one review, not fifty.
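
The grouping step can be illustrated with a small in-memory fold. This sketch assumes a batch of item dictionaries with item_id, author, and rule fields; the function name fold_into_cards is hypothetical, and the real dispatch would look up open keys in cm-queue rather than in a local dictionary.

```python
def fold_into_cards(items):
    """Group items by (author, rule) so each open key yields one review card."""
    cards = {}
    for item in items:
        key = (item["author"], item["rule"])
        # The first item for a key opens the card; later ones are appended to it.
        card = cards.setdefault(key, {"group_key": key, "item_ids": []})
        card["item_ids"].append(item["item_id"])
    return list(cards.values())
```

Fifty items from one author that trip the same rule come back as a single card listing fifty item IDs.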

EventBridge config

  • cm-move-rule — rule on the default bus matching cm.hold, cm.send_to_human, cm.hold_notify. Target: dispatch Lambda.
  • cm-drive-sync — Scheduler rate(15 minutes). Target: drive-sync Lambda.
  • cm-weekly-digest — Scheduler cron(0 9 ? * MON *) (every Monday at 9am) in TZ_NAME. Target: digest Lambda.
  • Notify path — a cm.hold_notify event is matched by the same cm-move-rule; dispatch reads the event detail and skips quiet-hours batching for it.
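
A sketch of how the three events and the cm-move-rule pattern might fit together. Two things are assumptions rather than facts from this post: that the event names travel in the detail-type field, and the source name cm.moderator. The entry uses the field names of the EventBridge PutEvents API; the delivered event uses the lower-case detail-type key.

```python
import json

MOVE_EVENTS = ["cm.hold", "cm.send_to_human", "cm.hold_notify"]

# Event pattern for cm-move-rule. Assumption: event names travel as detail-type.
CM_MOVE_PATTERN = {"source": ["cm.moderator"], "detail-type": MOVE_EVENTS}


def put_events_entry(name, item_id, rule):
    """Build one entry in the shape the EventBridge PutEvents API expects."""
    if name not in MOVE_EVENTS:
        raise ValueError(f"not a move event: {name}")
    return {
        "Source": "cm.moderator",  # hypothetical source name
        "DetailType": name,
        "Detail": json.dumps({"item_id": item_id, "rule": rule}),
        "EventBusName": "default",
    }


def skips_quiet_hours(event):
    """dispatch side: only cm.hold_notify bypasses quiet-hours batching."""
    return event["detail-type"] == "cm.hold_notify"
```

One rule with one pattern keeps the routing in a single place; the notify behaviour is a branch inside dispatch rather than a second rule.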

Platform webhooks and APIs

  • Each platform (community page, comment plugin, review source) is configured to POST new-content webhooks to the webhook-in Function URL with a shared signing secret.
  • Per-platform API credentials for the publish/remove/edit calls live in Secrets Manager under cm/platform/<platform>. The action-handler dispatches to the right client by the item’s platform attribute.
  • Platforms that don’t support editing a member’s content have the Edit button suppressed at card-compose time; only Publish and Remove are shown.
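
Signature checking in webhook-in could look like the sketch below. It assumes the platform signs the raw request body with HMAC-SHA256 and sends the hex digest in a header; real platforms differ in header name, encoding, and prefix, so treat the scheme and the function names as illustrative.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Compare in constant time so the check does not leak timing information."""
    expected = sign(secret, body).encode()
    received = signature_header.strip().lower().encode()
    return hmac.compare_digest(expected, received)
```

The check has to run on the exact bytes received, before any JSON parsing or re-serialization, or a valid signature will fail to match.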

IAM (least privilege per Lambda)

Each Lambda has its own role with policies scoped to exact ARNs. Sketch:

  • webhook-in role: s3:PutObject on cm-raw; secretsmanager:GetSecretValue on the per-platform webhook secrets. Nothing else.
  • intake role: s3:GetObject on cm-raw and cm-rules-source; dynamodb:PutItem on cm-items; events:PutEvents on the default bus; lambda:InvokeFunction on checker. No bedrock:*.
  • checker role: s3:GetObject on cm-rules-source; dynamodb:Query on cm-examples; dynamodb:UpdateItem on cm-items (to mark a confident pass as published); bedrock:InvokeModel on the Haiku inference-profile ARN and the foundation-model ARN behind it; events:PutEvents on the default bus.
  • dispatch role: s3:GetObject on cm-rules-source (for voice.txt); sqs:SendMessage on cm-review-queue; secretsmanager:GetSecretValue on the Slack bot token; ses:SendRawEmail from the verified sender identity; dynamodb:PutItem + Query on cm-queue. Outbound HTTPS to slack.com needs no IAM grant because the function runs outside a VPC.
  • action-handler role: dynamodb:PutItem on cm-audit and cm-examples; dynamodb:UpdateItem on cm-items and cm-queue (to clear the queue entry); secretsmanager:GetSecretValue on the Slack signing secret and the per-platform API secrets; s3:PutObject on cm-originals. Outbound HTTPS to the platform API hosts likewise needs no IAM grant.
  • drive-sync and digest roles: drive-sync gets secretsmanager:GetSecretValue on the Google service-account secret and s3:PutObject on cm-rules-source; digest gets dynamodb:Query on cm-audit/cm-queue, dynamodb:Query + DeleteItem on cm-examples for the compaction step, sqs:GetQueueAttributes on the DLQ, bedrock:InvokeModel on the same Haiku ARNs, secretsmanager:GetSecretValue on the Slack bot token for the channel post, and ses:SendRawEmail.
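
As one concrete instance of "scoped to exact ARNs", here is a sketch of a policy document for the checker role, built as a Python dictionary. The function name and the account ID passed in are placeholders. Two points are assumptions to verify against current Bedrock documentation: that invoking through an inference profile needs both the profile ARN and the underlying foundation-model ARN, and that a wildcard Region on the foundation-model ARN is the right scope for a Global profile. The dynamodb:UpdateItem statement is included because checker marks confident passes as published.

```python
MODEL_ID = "anthropic.claude-haiku-4-5-20251001-v1:0"
PROFILE_ID = "global." + MODEL_ID


def checker_policy(account_id, region="ap-southeast-1"):
    """Return an IAM policy document (as a dict) for the checker role."""
    here = f"{region}:{account_id}"

    def allow(actions, resources):
        return {"Effect": "Allow", "Action": actions, "Resource": resources}

    return {
        "Version": "2012-10-17",
        "Statement": [
            allow(["s3:GetObject"], ["arn:aws:s3:::cm-rules-source/*"]),
            allow(["dynamodb:Query"], [f"arn:aws:dynamodb:{here}:table/cm-examples"]),
            allow(["dynamodb:UpdateItem"], [f"arn:aws:dynamodb:{here}:table/cm-items"]),
            allow(
                ["bedrock:InvokeModel"],
                [
                    f"arn:aws:bedrock:{here}:inference-profile/{PROFILE_ID}",
                    # Assumption: wildcard Region, since a global profile can route anywhere.
                    f"arn:aws:bedrock:*::foundation-model/{MODEL_ID}",
                ],
            ),
            allow(["events:PutEvents"], [f"arn:aws:events:{here}:event-bus/default"]),
        ],
    }
```

No statement uses a wildcard action, which is the property the per-Lambda roles are meant to preserve.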

Slack interactive flow

Cards are posted via the chat.postMessage Web API with Block Kit blocks containing the action buttons (Publish/Remove/Edit). Button clicks are sent by Slack to the configured Interactivity request URL, which is the action-handler Function URL. action-handler verifies the Slack signing secret, parses the action_id (publish, remove, edit), opens a modal for Edit, and processes the decision on submit. Publish and Remove are one-tap.

The Slack app needs chat:write and im:write, plus the Interactivity URL configured. The bot token lives in Secrets Manager under cm/slack/bot-token; the signing secret is cm/slack/signing-secret.
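
The signing-secret check in action-handler follows Slack's documented v0 scheme: HMAC-SHA256 over the string v0:<timestamp>:<raw body>, compared with the X-Slack-Signature header, with stale timestamps rejected. The sketch below assumes the caller has already pulled the two header values and the raw body out of the Function URL event; the function names and the injectable now parameter are illustrative.

```python
import hashlib
import hmac
import time


def slack_signature(signing_secret: str, timestamp: str, body: str) -> str:
    """Compute the value Slack sends in the X-Slack-Signature header."""
    base = f"v0:{timestamp}:{body}".encode()
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    return "v0=" + digest


def verify_slack(signing_secret, timestamp, body, signature, now=None, max_age=300):
    """Reject stale or mis-signed requests; timestamp is X-Slack-Request-Timestamp."""
    now = time.time() if now is None else now
    try:
        age = abs(now - int(timestamp))
    except ValueError:
        return False  # a non-numeric timestamp cannot be trusted
    if age > max_age:
        return False  # replay protection: five minutes by default
    expected = slack_signature(signing_secret, timestamp, body)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Function URL events can deliver the body base64-encoded (isBase64Encoded is true), so the handler would need to decode it back to the original bytes before verifying.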

Observability and cost gates

  • CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Metric filter on "error" + "throttle" + "timeout" feeding a metric for alerting.
  • Alarms: webhook-in 5xx rate > 1% in 5 min (dropped inbound content is the worst failure); cm-review-dlq depth > 0; action-handler signature-verification failures > 5/hour (might mean the Slack secret rotated).
  • X-Ray: off by default. Not worth the cost at SMB volume.
  • AWS Budgets: $25/month threshold, alarm at 80% and 100%, posts to SNS topic cm-cost-alarm subscribed to the on-call admin’s email and Slack.

Config and secrets

Google service-account credentials for the Drive API live in Secrets Manager under cm/drive/sa. Slack bot token and signing secret under cm/slack/*. Per-platform webhook signing secrets under cm/webhook/* and per-platform API credentials under cm/platform/*. The configured timezone, quiet-hours window, per-area confidence thresholds, and admin fallback reviewer live in Parameter Store under /cm/config/. Lambdas fetch config on cold start and cache for the lifetime of the execution environment.
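
The fetch-on-cold-start pattern amounts to a module-level cache, since module state survives across warm invocations of the same execution environment. This is a minimal sketch: get_config and the injected fetch callable are hypothetical names, and in the real functions fetch would wrap a Parameter Store or Secrets Manager read.

```python
_CACHE = {}  # module-level, so it lives as long as the execution environment


def get_config(name, fetch):
    """Return a config value, calling fetch(name) only on the first request."""
    if name not in _CACHE:
        _CACHE[name] = fetch(name)
    return _CACHE[name]
```

The trade-off is that a changed parameter is only picked up when a new execution environment starts, which is usually acceptable for values like quiet hours and thresholds.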

Deploy

GitHub Actions with OIDC into a deploy role (no long-lived AWS keys), running AWS SAM to ship the stack. The opinionated bits: turn on S3 versioning for cm-raw, cm-rules-source, and cm-originals; keep the dead-letter queue and its alarm in the same stack as the review queue so they ship together; and pin the EventBridge Scheduler timezone so the weekly digest doesn’t silently start running in UTC after a CI rotation. Total deployable surface: seven Lambdas, four DDB tables, three S3 buckets, one SQS queue plus its DLQ, one EventBridge rule on the default bus (plus the two Scheduler schedules), and one Budgets alarm.

That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.
