Engineering reference: the social inbox unifier architecture
Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the platform webhook setup, the SQS config, the DynamoDB schemas, and the reply flow. Read alongside the previous six posts; this one’s the build sheet.
Region and account shape
Default region: ap-southeast-1 (Singapore). Bedrock cross-Region inference, SQS, Lambda Function URLs, and EventBridge Scheduler are all available there. A second region for multi-region resilience isn’t worth the extra setup work at SMB volume: the failure an SMB realistically sees is a delayed reply, not a regional outage. One AWS account dedicated to the unifier (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system.
Topology
Lambda functions
All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.
- ig-connector / fb-connector / wa-connector: one Lambda per platform, each fronted by a Lambda Function URL (AuthType: NONE; signature verified in-handler). On the platform’s webhook call, each verifies the platform signature against the secret in Secrets Manager (Instagram/Facebook use the X-Hub-Signature-256 HMAC; WhatsApp uses its app-secret HMAC), parses the platform payload, normalizes it to the common message shape, and calls SendMessage on siu-intake-queue. Returns 200 immediately so the platform doesn’t re-deliver. Each also answers the platform’s GET verification handshake on setup. Memory: 256 MB. Timeout: 10 s.
- labeler: SQS batch trigger on siu-intake-queue (batch size 10, partial-batch-response enabled). For each record, computes fingerprint = sha256(sender|platform|normalized_text) and does a conditional PutItem on siu-dedupe; if the item already exists, links the message to the live thread and skips the model call. For new messages, invokes Bedrock Haiku 4.5 (anthropic.claude-haiku-4-5-20251001-v1:0 via global.anthropic.claude-haiku-4-5-20251001-v1:0) with a strict JSON-only prompt returning topic, urgency, language, summary. Writes/updates the thread in siu-threads, then invokes router (or hands off to it within the same handler). Memory: 512 MB. Timeout: 30 s.
- router: resolves the owning team from the topic→team map in /siu/config/routing (Parameter Store), filters candidate teammates by working hours and status from siu-roster, picks the least-loaded eligible teammate from siu-loadcount, attaches thread/labels/customer-history context, and writes the assignment to siu-assignments with an atomic increment of the teammate’s open-thread count. Urgent messages bypass the round and go straight to the least-loaded eligible teammate. No Bedrock. Memory: 256 MB. Timeout: 15 s.
- reply-handler: Lambda Function URL, public with AuthType: NONE; verifies the shared-inbox session token. Triggered by teammate actions (Send reply / Reassign / Close). On send, posts the human-written reply via the originating platform’s send API (IG/FB Send API, WhatsApp Cloud API; tokens in Secrets Manager), appends it to siu-threads, and writes action: replied to siu-audit. On reassign, updates siu-assignments and both teammates’ counts. On close, marks the thread closed and decrements the count. Never composes reply text; the body comes from the teammate. Memory: 256 MB. Timeout: 15 s.
- reopen-watcher: invoked from the labeler path when an incoming fingerprint links to a closed thread; flips the thread back to open, re-runs routing, and writes action: reopened to siu-audit. Lightweight; folded into the labeler in small deployments. Memory: 256 MB.
- sla-sweeper: EventBridge Scheduler target, every 5 minutes. Scans siu-assignments for threads past their urgency target (urgent: minutes; normal: hours; from /siu/config/sla); re-routes stale threads to the topic’s backup teammate and posts a notice to the team Slack channel. No reply is sent. Memory: 256 MB. Timeout: 30 s.
- digest: EventBridge Scheduler target, daily at 6pm local. Reads siu-threads and siu-audit for the day; sends a summary (messages in, answered, still open, anything that breached its target) to a configured Slack channel. No Bedrock; the message is a plain summary table. Memory: 256 MB.
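The router’s selection step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production code: the Teammate dataclass, its field names, and the "available" status value are hypothetical, and working hours are assumed to fall within a single day.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class Teammate:
    # Hypothetical shape: a siu-roster row joined with its siu-loadcount counter.
    assignee: str
    status: str          # assumed values: "available", "away"
    start: time          # working-hours start (local time)
    end: time            # working-hours end (local time)
    open_threads: int


def is_eligible(t: Teammate, now: time) -> bool:
    """Eligible means marked available and currently inside working hours."""
    return t.status == "available" and t.start <= now < t.end


def pick_assignee(team: list[Teammate], now: time) -> Optional[str]:
    """Return the least-loaded eligible teammate, or None if nobody is eligible."""
    eligible = [t for t in team if is_eligible(t, now)]
    if not eligible:
        return None
    # Ties break on name so the choice is deterministic.
    return min(eligible, key=lambda t: (t.open_threads, t.assignee)).assignee
```

The atomic increment of the chosen teammate’s counter is a separate DynamoDB write and is not shown here. Overnight shifts (start later than end) would need an extra branch in is_eligible.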
Storage
- SQS · siu-intake-queue: the work queue holding normalized messages. Visibility timeout 60 s; redrive policy to siu-intake-dlq after 5 receives. The DLQ has a CloudWatch alarm on depth > 0.
- DynamoDB · siu-dedupe: PK fingerprint; attributes: thread_id, first_seen. TTL of a few days so old fingerprints expire and a genuinely new message next month isn’t treated as a repeat. On-demand.
- DynamoDB · siu-threads: PK thread_id; attributes: customer_id, platform, status (open/closed), topic, urgency, language, messages (ordered list), assignee. On-demand. GSI on (customer_id) for history lookups.
- DynamoDB · siu-assignments: PK thread_id; sort key assigned_at; attributes: assignee, team, reason, sla_due. GSI on (assignee, status) to render each teammate’s queue. On-demand.
- DynamoDB · siu-loadcount: PK assignee; attribute: open_threads (atomic counter). On-demand.
- DynamoDB · siu-roster: PK assignee; attributes: topics, working_hours, status, backup. On-demand. Editable from the inbox admin screen.
- DynamoDB · siu-audit: one row per write action of any kind. PK thread_id; sort key ts; attributes: action, by_user, before, after, notes. On-demand. No TTL; this is the long-term audit trail.
- S3 · siu-message-snapshots: raw normalized message snapshots and any inbound attachments (images, voice notes). Versioning enabled. Lifecycle to Glacier at 90 days; expiry at 2 years.
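To make the siu-dedupe write concrete, here is a hedged sketch of the fingerprint and the conditional PutItem parameters. The normalization rule (lowercase, collapsed whitespace), the expires_at TTL attribute name, and the three-day default are assumptions for illustration; the source only says the TTL is "a few days". The function returns the request parameters in the low-level DynamoDB client format rather than calling AWS, so it runs anywhere.

```python
import hashlib
import time
from typing import Optional


def normalize_text(text: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def fingerprint(sender: str, platform: str, text: str) -> str:
    """sha256(sender|platform|normalized_text), hex-encoded."""
    raw = "|".join([sender, platform, normalize_text(text)])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def dedupe_put_params(fp: str, thread_id: str, ttl_days: int = 3,
                      now: Optional[int] = None) -> dict:
    """Build kwargs for a conditional PutItem on siu-dedupe."""
    now = int(time.time()) if now is None else now
    return {
        "TableName": "siu-dedupe",
        "Item": {
            "fingerprint": {"S": fp},
            "thread_id": {"S": thread_id},
            "first_seen": {"N": str(now)},
            # Hypothetical TTL attribute; DynamoDB TTL expects epoch seconds.
            "expires_at": {"N": str(now + ttl_days * 86400)},
        },
        # The write is rejected if an item with this fingerprint already exists.
        "ConditionExpression": "attribute_not_exists(#fp)",
        "ExpressionAttributeNames": {"#fp": "fingerprint"},
    }
```

In the labeler these parameters would be passed to the DynamoDB client’s put_item call; a ConditionalCheckFailedException then signals a duplicate, and the handler links the message to the existing thread instead of calling the model.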
Bedrock
- Foundation model. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0. One callsite: labeler, for the per-message label set. No callsite ever generates customer-facing text.
- Heavier model. Not used on the hot path. Claude Sonnet 4.6 (anthropic.claude-sonnet-4-6-...) is wired only as an optional offline reviewer for tuning the topic taxonomy from a sample of mislabeled threads, run by hand and never in production traffic.
- Embeddings. Not used. The system labels and routes; it does not answer from documents. No Knowledge Base, no S3 Vectors.
- Quotas. Default account quotas are more than enough at SMB volume. One small call per new message; duplicates never reach the model.
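Because the labeler depends on a strict JSON-only reply, it helps to validate the model output before writing it to siu-threads. The sketch below is one way to do that, not the author’s implementation: the allowed topic set is passed in (in the real system it would come from the taxonomy in Parameter Store), and the urgency values "urgent" and "normal" are inferred from the SLA description.

```python
import json

REQUIRED_KEYS = ("topic", "urgency", "language", "summary")
ALLOWED_URGENCY = {"urgent", "normal"}  # assumed from the SLA targets


def parse_labels(model_text: str, allowed_topics: set[str]) -> dict:
    """Parse the model's JSON-only reply and reject anything off-schema."""
    # json.JSONDecodeError is a ValueError subclass, so callers can catch ValueError.
    data = json.loads(model_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if data["topic"] not in allowed_topics:
        raise ValueError(f"unknown topic: {data['topic']!r}")
    if data["urgency"] not in ALLOWED_URGENCY:
        raise ValueError(f"unknown urgency: {data['urgency']!r}")
    # Drop any extra keys the model may have added.
    return {k: str(data[k]) for k in REQUIRED_KEYS}
```

A record whose label fails validation can be reported as a batch item failure so it is retried and, after five receives, lands in siu-intake-dlq.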
Platform webhooks
- Instagram / Facebook. App subscribed to the messages webhook field on the Page/IG account. Callback URL is the connector’s Function URL; verify token stored in Secrets Manager for the GET handshake; payloads HMAC-signed with the app secret (X-Hub-Signature-256), verified in-handler.
- WhatsApp. Cloud API webhook subscribed to messages; callback URL is the wa-connector Function URL; app-secret HMAC verified in-handler; phone-number ID and access token in Secrets Manager.
- Reply path. Outbound uses each platform’s send API with the same stored tokens. Note WhatsApp’s 24-hour customer-care window: replies outside it require an approved template, which the inbox surfaces to the teammate rather than silently failing.
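The in-handler signature check can be sketched as follows. It assumes the header carries the value in the form sha256=<hex digest> computed over the raw request body with the app secret; the function and parameter names are illustrative.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(app_secret: str, raw_body: bytes,
                     header_value: Optional[str]) -> bool:
    """Check an X-Hub-Signature-256 header of the form 'sha256=<hex digest>'."""
    prefix = "sha256="
    if not header_value or not header_value.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body,
                        hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # compare_digest runs in constant time, so it leaks nothing about where
    # the two values first differ.
    return hmac.compare_digest(expected.encode("utf-8"),
                               received.encode("utf-8"))
```

The digest must be computed over the body bytes exactly as received. In a Function URL event the body may arrive base64-encoded (isBase64Encoded is true), in which case it needs decoding first; re-serializing parsed JSON will generally not reproduce the signed bytes.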
IAM (least privilege per Lambda)
Each Lambda has its own role with policies scoped to exact ARNs. Sketch:
- connector roles: sqs:SendMessage on siu-intake-queue; secretsmanager:GetSecretValue on that platform’s secret only. No DynamoDB, no bedrock:*.
- labeler role: sqs:ReceiveMessage + DeleteMessage + GetQueueAttributes on the queue (the SQS event-source mapping needs all three); dynamodb:PutItem (conditional) on siu-dedupe; dynamodb:PutItem + UpdateItem on siu-threads; bedrock:InvokeModel on the Haiku inference-profile ARN and the foundation-model ARN it routes to; lambda:InvokeFunction on router.
- router role: dynamodb:GetItem on siu-roster and siu-loadcount; dynamodb:PutItem + UpdateItem on siu-assignments and siu-loadcount; ssm:GetParameter on /siu/config/*. No bedrock:*.
- reply-handler role: dynamodb:UpdateItem on siu-threads, siu-assignments, siu-loadcount; dynamodb:PutItem on siu-audit; secretsmanager:GetSecretValue on the platform send-token secrets. It also needs outbound HTTPS to graph.facebook.com and the WhatsApp Cloud API host, which is a network dependency rather than an IAM permission and needs no extra setup because the functions run outside a VPC.
- sla-sweeper / digest roles: dynamodb:Query on siu-assignments / siu-threads; secretsmanager:GetSecretValue on the Slack token; ssm:GetParameter on /siu/config/sla.
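As an illustration of "scoped to exact ARNs", the connector policy above could be generated like this. The function name and the account ID used in the example are placeholders; the secret path follows the siu/<platform>/* naming used elsewhere in this post.

```python
import json


def connector_policy(region: str, account_id: str, platform: str) -> dict:
    """Least-privilege policy for one connector: send to the queue, read its own secrets."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sqs:SendMessage",
                "Resource": f"arn:aws:sqs:{region}:{account_id}:siu-intake-queue",
            },
            {
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                # Secret ARNs end in a random suffix, so the trailing wildcard
                # covers every secret under this platform's prefix.
                "Resource": (
                    f"arn:aws:secretsmanager:{region}:{account_id}"
                    f":secret:siu/{platform}/*"
                ),
            },
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID.
    print(json.dumps(connector_policy("ap-southeast-1", "123456789012", "ig"), indent=2))
```

In a SAM template the same statements would sit under the function’s Policies property; generating them from one function just makes it easy to confirm that no connector can read another platform’s secret.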
SQS and EventBridge Scheduler config
- siu-intake-queue: standard queue; visibility timeout 60 s; maxReceiveCount 5 to siu-intake-dlq. Labeler event-source mapping batch size 10, max batching window 5 s, partial-batch-response on.
- siu-sla-sweep: rate(5 minutes). Target: sla-sweeper Lambda.
- siu-daily-digest: cron(0 18 * * ? *) in TZ_NAME. Target: digest Lambda.
- One-off rules: not needed on the hot path; re-routes are handled by the periodic sweeper rather than per-thread schedules, which keeps the Scheduler surface tiny.
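With partial-batch-response on, the labeler reports only the records that failed, so one bad message does not send the other nine back to the queue. A minimal handler shape, with process_record standing in as a hypothetical placeholder for the real dedupe, labeling, and routing work:

```python
import json


def process_record(body: dict) -> None:
    """Placeholder for dedupe + labeling + routing; raises on failure."""
    if "text" not in body:
        raise ValueError("message has no text")


def handler(event: dict, context: object = None) -> dict:
    """SQS batch handler that returns the IDs of failed records only."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_record(json.loads(record["body"]))
        except Exception:
            # Only this message becomes visible again; the rest are deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

This response shape is only honored when the event-source mapping has ReportBatchItemFailures in its function response types; otherwise Lambda treats any normal return as success for the whole batch.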
Observability and cost gates
- CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Metric filter on "error" + "throttle" + "timeout" feeding a CloudWatch metric for alerting.
- Alarms: siu-intake-dlq depth > 0 (a message failed to process); labeler error rate > 1% in 24h; connector signature-verification failures > 5/hour (might mean a platform secret rotated); reply-handler send failures > 0 (a reply didn’t reach the customer).
- X-Ray: off by default. Not worth the cost at SMB volume.
- AWS Budgets: $20/month threshold, alarm at 80% and 100%, posts to SNS topic siu-cost-alarm, subscribed to the on-call admin’s email and Slack.
Config and secrets
Platform credentials live in Secrets Manager: siu/ig/*, siu/fb/*, siu/wa/* (app secret, verify token, page/phone-number ID, send token per platform). The Slack token for digests and SLA notices is under siu/slack/bot-token. The topic→team routing map, the urgency SLA targets, quiet/working-hours defaults, and the dedupe window all live in Parameter Store under /siu/config/. Lambdas fetch config on cold start and cache for the lifetime of the execution environment.
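The fetch-on-cold-start pattern amounts to a module-level cache. The sketch below keeps the fetcher injectable so it runs without AWS access; the names get_config and _CONFIG_CACHE are illustrative, and in a real Lambda the fetcher would wrap the SSM get_parameter call.

```python
from typing import Callable

# Module-level state survives across invocations in the same execution
# environment and is rebuilt on a cold start.
_CONFIG_CACHE: dict[str, str] = {}


def get_config(name: str, fetch: Callable[[str], str]) -> str:
    """Return a cached parameter value, fetching it on first use only."""
    if name not in _CONFIG_CACHE:
        _CONFIG_CACHE[name] = fetch(name)
    return _CONFIG_CACHE[name]


# In Lambda, the fetcher might look like this (assumes boto3 is available):
#   ssm = boto3.client("ssm")
#   fetch = lambda n: ssm.get_parameter(Name=n)["Parameter"]["Value"]
```

The trade-off is that an edited parameter is picked up only when a new execution environment starts, so a routing-map change can take a while to reach every warm function.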
Deploy
GitHub Actions with OIDC into a deploy role (no long-lived keys) and AWS SAM. The opinionated bits: deploy the connector Function URLs and their secrets as a separate stack (a rotated platform secret shouldn’t force a full redeploy), turn on S3 versioning for siu-message-snapshots, set the dedupe TTL deliberately (too short re-opens duplicates, too long can swallow a real new message), and keep the topic taxonomy in Parameter Store so changing it never needs a deploy. Total deployable surface: around eight Lambdas, one SQS queue plus its DLQ, six DynamoDB tables, one S3 bucket, two EventBridge Scheduler rules, and one Budgets alarm.
That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.