Engineering reference: the expiry watcher architecture
Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the SES inbound rule set, EventBridge Scheduler config, the DynamoDB schemas, and the Slack interactive flow. Read alongside the previous six posts; this one’s the build sheet.
Region and account shape
Default region: us-east-1. SES inbound, Bedrock cross-Region inference, and EventBridge Scheduler are all in good shape there. A second region for multi-region resilience isn’t worth the extra setup work at SMB volume — the failure mode for an SMB is somebody missing a renewal alert, not a regional outage. One AWS account dedicated to the watcher (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system.
Topology
Lambda functions
All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.
drive-sync— EventBridge Scheduler target, fires every 15 minutes. Uses the Google Drive API + Sheets API (service-account credentials in Secrets Manager underew/drive/sa) to export the registry sheet as CSV and write tos3://ew-registry-source/registry.csvonly if the sheet has changed since the last sync. Same pattern syncs the rules and voice docs tos3://ew-rules-source/. Memory: 256 MB. Timeout: 30 s.calendar-sync— EventBridge Scheduler target, hourly. Uses the Google Calendar APIevents.listto scan configured calendars for events with#expiresin the description; for any new events, creates a Slack interactive proposal message. For lower-latency setups you can switch toevents.watchand have Calendar push notifications to a Function URL instead of polling, at the cost of renewing the channel before it expires (Calendar push channels have a finite TTL and need a small refresh job). Memory: 256 MB. Timeout: 30 s.intake-ses-parser— S3 PUT trigger ons3://ew-raw-mime/. Parses MIME, extracts the PDF attachment, runs Textract viaStartDocumentTextDetection+StartDocumentAnalysis(asynchronously to handle multi-page contracts). On Textract completion (via SNS notification), reads the structured text and calls Bedrock Haiku 4.5 (anthropic.claude-haiku-4-5-20251001-v1:0viaglobal.anthropic.claude-haiku-4-5-20251001-v1:0) to propose a registry row. Posts the proposal to Slack via the incoming webhook with Approve/Edit/Discard buttons. For DOCX attachments (Textract doesn’t accept them), falls back topython-docx; XLSX usesopenpyxl. Both packages are stable and widely used in 2026, though their maintenance velocity is light — for a contract-parsing path that only runs a few times a month, that’s acceptable. If extraction precision becomes a concern, the active community forkpython-docx-ossis a drop-in alternative. Memory: 512 MB. Timeout: 60 s.watcher— EventBridge Scheduler target, daily at 8am local time (the schedule expression runs inTZ_NAMEset to the SMB’s timezone, e.g.America/New_York). Readss3://ew-registry-source/registry.csvand the rules and voice docs. For each row, computesdays_to_expiry, reads chain state fromew-pingsandew-ack, decides on a move. Emits one event per row that needs action:ew.first_alert,ew.reminder, orew.escalate, with the item context as the event payload. Healthy items emit nothing. Memory: 512 MB. Timeout: 60 s. No Bedrock calls.dispatch— EventBridge rule on the three move events. Resolves owner, checks quiet hours and holiday calendar, formats the alert from the voice template, and ships via Slack incoming webhook (ew/slack/webhookin Secrets Manager) or SESSendRawEmail. On quiet-hours or holiday defer, creates a one-off EventBridge Scheduler rule that re-invokesdispatchat the next available business minute. Writes a row toew-pingsafter a successful send. Memory: 256 MB. Timeout: 30 s.ack-handler— Lambda Function URL, public withAuthType: NONE; verifies a Slack signature on the request body. Triggered by Slack interactive button clicks (Renew/Snooze/Ack-only) and by email-link clicks. Writes toew-ackandew-audit; on renew, updates the Drive sheet via the Sheets API and archives the old chain inew-pings-archive. Memory: 256 MB. Timeout: 15 s.digest— EventBridge Scheduler target, weekly Sunday 6pm. Readsew-pingsfor the past week and the registry; sends a digest message to a configured Slack channel summarizing pings sent and items coming up. No Bedrock; the message is a plain summary table. Memory: 256 MB.summary— EventBridge Scheduler target, monthly on the first Monday at 9am. Reads the past month’sew-pings,ew-ack, andew-audit; calls Bedrock Haiku 4.5 to write a one-paragraph board narrative; emails it via SES to the configured stakeholder list. Memory: 512 MB.
Storage
- DynamoDB ·
ew-pings— one row per dispatch. PK(item_id, chain_index); attributes:ping_date,dispatched_via(slack/email),recipient,move(first_alert/reminder/escalate). On-demand. No TTL. - DynamoDB ·
ew-ack— one row per acknowledgment. PKitem_id; sort keyack_date; attributes:action(renew/snooze/ack-only),by_user,snooze_until(if action = snooze),old_expiry,new_expiry(if action = renew). On-demand. - DynamoDB ·
ew-audit— one row per write action of any kind. PK(item_id, ts); attributes:action,by_user,before,after. On-demand. No TTL — this is the long-term audit trail. - DynamoDB ·
ew-pings-archive— archived chains after a renewal. Same shape asew-pings; PK(item_id, chain_id, chain_index). On-demand. - S3 ·
ew-registry-source— mirrored CSV from the Drive registry sheet. Versioning enabled. Lifecycle to Glacier at 90 days; expiry at 7 years. - S3 ·
ew-rules-source— mirrored rules and voice docs as plain text. Versioning enabled. - S3 ·
ew-raw-mime— raw inbound MIME from forwarded contracts. Lifecycle to Glacier at 30 days; expiry at 7 years. - S3 ·
ew-source-pdfs— the parsed source contracts after the inbound parser handles them, kept for reference if the registry row links to one.
Bedrock
- Foundation model.
anthropic.claude-haiku-4-5-20251001-v1:0via the Global cross-Region inference profileglobal.anthropic.claude-haiku-4-5-20251001-v1:0. Two callsites:intake-ses-parserfor the inbound contract parsing, andsummaryfor the monthly board narrative. - Embeddings. Not used. The registry is structured rows; deterministic lookup beats vector retrieval here. No Knowledge Base, no S3 Vectors.
- Quotas. Default account quotas are more than enough at SMB volume. The watcher itself doesn’t call Bedrock; the parsing lane fires a few times a month at most.
EventBridge Scheduler config
ew-daily-tick—cron(0 8 * * ? *)in the SMB’s timezone. Target:watcherLambda.ew-drive-sync—rate(15 minutes). Target:drive-syncLambda.ew-calendar-sync—rate(1 hour). Target:calendar-syncLambda.ew-weekly-digest—cron(0 18 ? * SUN *)in TZ. Target:digestLambda.ew-monthly-summary—cron(0 9 ? * 2#1 *)(first Monday at 9am) in TZ. Target:summaryLambda.- One-off rules — created on the fly by
dispatchwhen a quiet-hours or holiday defer is needed. Useat(YYYY-MM-DDTHH:MM:SS)expressions with--action-after-completion DELETEso the rule self-cleans.
SES inbound and outbound
- Set the MX record on a dedicated subdomain (e.g.
expires.your-company.com) toinbound-smtp.us-east-1.amazonaws.com. - SES inbound rule set
ew-inbound-rules: one rule with recipientexpires@your-company.com→ spam scan → S3 PUT tos3://ew-raw-mime/<message-id>→ stop. The S3 PUT triggersintake-ses-parser. - SES outbound for the email-fallback alerts: verify a sender identity at
watcher@your-company.comwith DKIM and SPF on the parent domain. Out of sandbox by request.
IAM (least privilege per Lambda)
Each Lambda has its own role with policies scoped to exact ARNs. Sketch:
- watcher role:
s3:GetObjecton the registry, rules, and voice keys;dynamodb:Query+GetItemonew-pings,ew-ack;events:PutEventson the default bus. Nobedrock:*. - dispatch role:
events:ListSchedules+CreateSchedulefor the deferred-dispatch one-offs;secretsmanager:GetSecretValueon the Slack webhook secret;ses:SendRawEmailfrom the verified sender identity;dynamodb:PutItemonew-pings; outbound network access tohooks.slack.com. - ack-handler role:
dynamodb:PutItemonew-ackandew-audit;secretsmanager:GetSecretValueon the Sheets-API service-account secret; outbound network access tosheets.googleapis.com;dynamodb:Queryfor chain state lookup; on renew,dynamodb:BatchWriteItemfor archiving the old chain toew-pings-archive. - intake-ses-parser role:
s3:GetObjectonew-raw-mime;textract:StartDocumentTextDetection+StartDocumentAnalysis;bedrock:InvokeModelon the Haiku ARN;secretsmanager:GetSecretValueon the Slack webhook. - drive-sync and calendar-sync roles:
secretsmanager:GetSecretValueon the Google service-account secret;s3:PutObjecton the registry and rules buckets; outbound network towww.googleapis.com.
Slack interactive flow
The Slack incoming webhook is the simplest delivery surface but doesn’t support interactive button responses. So the alert messages are posted via the chat.postMessage Web API instead, with Block Kit blocks containing the action buttons. Button clicks are sent by Slack to the configured Interactivity request URL, which is the ack-handler Function URL. ack-handler verifies the Slack signing secret on the inbound request, parses the action_id (renew, snooze, ack_only), opens a modal if needed (Renew/Snooze open modals; Ack-only is one-tap), and processes the response when the modal is submitted.
The Slack app needs chat:write, im:write, and the Interactivity URL configured. The bot token lives in Secrets Manager under ew/slack/bot-token. The signing secret is ew/slack/signing-secret.
Observability and cost gates
- CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Subscription filter on
"error"+"throttle"+"timeout"to a CloudWatch metric for alerting. - Alarms: watcher Lambda failures > 0 in a day (the daily tick is the one piece that has to run); dispatch failure rate > 1% in 24h; ack-handler signature-verification failures > 5/hour (might mean the Slack secret rotated).
- X-Ray: off by default. Not worth the cost at SMB volume.
- AWS Budgets: $15/month threshold, alarm at 80% and 100%, posts to SNS topic
ew-cost-alarmsubscribed to the on-call admin’s email and Slack.
Config and secrets
Service-account credentials for Drive, Sheets, and Calendar APIs all live in Secrets Manager under ew/drive/sa (one service account with scopes for all three APIs). Slack bot token, signing secret, and webhook URL all under ew/slack/*. SES sender identity lives in IAM and the verified-domain config. The configured timezone, holiday list reference, quiet-hours window, and admin fallback owner all live in Parameter Store under /ew/config/. Lambdas fetch config on cold start and cache for the lifetime of the execution environment.
Deploy
Whichever IaC you prefer. The opinionated bits: deploy the SES rule set as a separate stack (rule-set changes affect mail flow), turn on S3 versioning for both ew-registry-source and ew-rules-source so a bad Drive edit can be rolled back in one click, and version the EventBridge Scheduler timezone setting so you don’t accidentally start running the daily tick in UTC after a CI rotation. CDK with a Python stack file works well; SAM also fits. Total deployable surface: around eight Lambdas, four DDB tables, four S3 buckets, one EventBridge rule on the default bus (plus the Scheduler rules), one SES rule set, and one Budgets alarm.
That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.
All posts