Engineering reference: the weekly report builder architecture
Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, EventBridge Scheduler config, the DynamoDB schemas, and the grounding contract that keeps the model from ever sourcing a number. Read alongside the previous six posts; this one’s the build sheet.
Region and account shape
Default region: ap-southeast-1 (Singapore). Bedrock cross-Region inference, SES outbound, and EventBridge Scheduler are all in good shape there. A second region for multi-region resilience isn’t worth the extra setup work at SMB volume — the failure mode for an SMB is a report landing an hour late, not a regional outage. One AWS account dedicated to the builder (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system.
Topology
Lambda functions
All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.
source-sync— EventBridge Scheduler target, fires every 30 minutes. Uses the Google Drive API + Sheets API (service-account credentials in Secrets Manager underwr/google/sa) to export the hand-kept registry sheet as CSV and mirror any tool CSVs in the configured Drive folder tos3://wr-source-data/, writing only if a source has changed since the last sync. The same pattern syncs the config and voice docs tos3://wr-config-source/. Memory: 256 MB. Timeout: 30 s.pos-sync— EventBridge Scheduler target, scheduled per the POS’s roll-up cadence (typically nightly plus a morning catch-up). Pulls the point-of-sale daily summary from wherever it lands (an SFTP drop, a vendor API, or a Drive file) and writes it tos3://wr-source-data/pos/. Kept separate fromsource-syncbecause POS integrations vary the most and benefit from their own retry and timeout tuning. Memory: 256 MB. Timeout: 60 s.builder— EventBridge Scheduler target, weekly Monday 7am in the owner’s timezone (the schedule expression runs inTZ_NAME, e.g.Asia/Singapore). Reads every source froms3://wr-source-data/and the config and voice docs. Normalizes into one figure set; computes this week, last week, and the four-week average per figure; runs the three look-off checks; builds the facts list (flagged figures withheld); calls Bedrock Haiku 4.5 once for the summary; verifies every number in the draft against the figure set, dropping any unmatched sentence; assembles the report. Hands the assembled report to the send step in the same invocation. Memory: 512 MB. Timeout: 120 s. Exactly one Bedrock call per run.- send step — runs inside the
builderinvocation (not a separate function) once the report is assembled. Resolves recipients from the config doc, confirms the owner timezone from Parameter Store, runs the completeness check, composes the HTML email, and ships via SESSendRawEmailfrom the verified sender identity. On an incomplete week, instead of sending it creates a one-off EventBridge Scheduler rule that re-invokesbuilderin retry mode a couple of hours later. Writes a row towr-runsafter a successful send and any flags towr-flags. lookup-handler— Lambda Function URL, public withAuthType: NONE; verifies a short-lived signed token embedded in the email footer link. Triggered when the owner clicks “show the rows behind this figure.” Reads the relevant source slice froms3://wr-source-data/for the reported week and returns it as a simple HTML table. Read-only; writes nothing. Memory: 256 MB. Timeout: 15 s.
Storage
- DynamoDB ·
wr-runs— one row per weekly send. PK(owner_id, week_start); attributes:sent_at,recipients,figures(the reported figure set as a map),summary_text,dropped_sentencescount. On-demand. No TTL — this is what next week’s comparison reads. - DynamoDB ·
wr-flags— one row per flagged figure. PK(owner_id, week_start); sort keyfigure_check; attributes:check(stale/out_of_range/reconcile),figure,expected,actual,source. On-demand. No TTL — the long-term record of which sources misbehave. - S3 ·
wr-source-data— mirrored source CSVs and the POS roll-ups. Versioning enabled. Lifecycle to Glacier at 90 days; expiry at 7 years. - S3 ·
wr-config-source— mirrored config and voice docs as plain text. Versioning enabled. - S3 ·
wr-reports— the assembled HTML report for each week, kept for reference and for the lookup link. Versioning enabled. Lifecycle to Glacier at 90 days.
Bedrock
- Foundation model.
anthropic.claude-haiku-4-5-20251001-v1:0via the Global cross-Region inference profileglobal.anthropic.claude-haiku-4-5-20251001-v1:0. One callsite:builder, for the weekly summary paragraph. The heavierclaude-sonnet-4-6isn’t used — turning a short facts list into a paragraph is well within Haiku’s range and doesn’t justify the cost. - Grounding contract. The prompt hands the model only the computed facts list (statements with numbers attached) and instructs it to introduce no figure not in that list. The output is then number-checked in code against the figure set; any sentence whose number doesn’t match the set is dropped before send. The model never sees raw source data and never sources a number.
- Embeddings. Not used. The numbers are structured rows; deterministic arithmetic beats vector retrieval here. No Knowledge Base, no S3 Vectors.
- Quotas. Default account quotas are far more than enough — one call a week per owner.
EventBridge Scheduler config
wr-weekly-run—cron(0 7 ? * 2 *)(Monday 7am) in the owner’s timezone. Target:builderLambda.wr-source-sync—rate(30 minutes). Target:source-syncLambda.wr-pos-sync— per the POS cadence, e.g.cron(30 1 * * ? *)plus a morning catch-up. Target:pos-syncLambda.- One-off retry rules — created on the fly by the send step when the completeness check holds a send. Use
at(YYYY-MM-DDTHH:MM:SS)expressions in TZ with--action-after-completion DELETEso the rule self-cleans.
SES outbound
- Verify a sender identity at
reports@your-company.comwith DKIM and SPF on the parent domain so the weekly email lands in the inbox, not spam. - The send step uses
SendRawEmailso the report can be a full multipart HTML message with the numbers table inline. - Out of the SES sandbox by request before go-live; the recipient list is small and static, so this is a one-time step.
IAM (least privilege per Lambda)
Each Lambda has its own role with policies scoped to exact ARNs. Sketch:
- builder role:
s3:GetObjectonwr-source-data,wr-config-source;s3:PutObjectonwr-reports;dynamodb:Query+GetItem+PutItemonwr-runsandwr-flags;bedrock:InvokeModelon the Haiku ARN;ses:SendRawEmailfrom the verified sender;scheduler:CreateSchedulefor retry one-offs;ssm:GetParameteron/wr/config/*. - source-sync role:
secretsmanager:GetSecretValueon the Google service-account secret;s3:PutObjectonwr-source-dataandwr-config-source; outbound network towww.googleapis.com. - pos-sync role:
secretsmanager:GetSecretValueon the POS credential secret;s3:PutObjectonwr-source-data; outbound network to the POS endpoint only. - lookup-handler role:
s3:GetObjectonwr-source-dataandwr-reports;ssm:GetParameteron the link-signing key. Read-only — no write permissions, no Bedrock, no SES.
The grounding flow, in code
The contract that makes the report trustworthy is enforced in three places, not one. First, the gather step is the only thing that ever computes a figure; everything downstream reads from its output, never from raw sources. Second, the facts list handed to Bedrock is a closed set — flagged figures are excluded, so the model can’t even mention a number that failed a check. Third, the post-model number-check scans the draft, extracts every numeric token, and matches it against the figure set within a small rounding tolerance; an unmatched token drops its whole sentence and increments the dropped_sentences counter on the run.
The counter matters operationally: a run with a non-zero drop count is logged at WARN, and a sustained pattern of drops means the prompt or the model is drifting and should be reviewed. In steady state the count is zero — Haiku 4.5 handed a tight facts list and told to describe only it rarely strays — but the system is built so that “rarely” never reaches the owner as a wrong number.
Observability and cost gates
- CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Subscription filter on
"error"+"throttle"+"timeout"+"dropped_sentences"to a CloudWatch metric for alerting. - Alarms:
builderfailures > 0 on a Monday (the weekly run is the one piece that has to succeed); send failures > 0;dropped_sentences> 0 for two consecutive weeks (the model may be drifting); a source that trips the stale check three weeks running (fix it at the source). - X-Ray: off by default. Not worth the cost at SMB volume.
- AWS Budgets: $15/month threshold, alarm at 80% and 100%, posts to SNS topic
wr-cost-alarmsubscribed to the admin’s email.
Config and secrets
Service-account credentials for the Drive, Sheets, and Calendar APIs live in Secrets Manager under wr/google/sa (one service account, read-only scopes). POS credentials live under wr/pos/*. The configured timezone, the recipient list, the notable-change thresholds, the look-off check thresholds, and the link-signing key all live in Parameter Store under /wr/config/. Lambdas fetch config on cold start and cache for the lifetime of the execution environment. Nothing in the system has write access to any source — every Google and POS scope is read-only, which is the single most important guardrail: the builder can never alter the numbers it reports on.
Deploy
GitHub Actions with OIDC into a deploy role (no long-lived keys), and AWS SAM for the stack. The opinionated bits: turn on S3 versioning for wr-source-data and wr-config-source so a bad Drive edit can be rolled back in one click; version the EventBridge Scheduler timezone setting so a CI rotation can’t silently start running the weekly job in UTC; and keep every Google and POS credential scoped read-only so the builder is structurally incapable of writing to a source. Total deployable surface: around five Lambdas, two DDB tables, three S3 buckets, a handful of Scheduler rules, one verified SES identity, and one Budgets alarm.
That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.
All posts