Engineering reference: the price monitor architecture
Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the polite-fetch policy, EventBridge Scheduler config, the DynamoDB schemas, and the Slack interactive flow. Read alongside the previous six posts; this one’s the build sheet.
Region and account shape
Default region: ap-southeast-1 (Singapore). Bedrock cross-Region inference, EventBridge Scheduler, and SES outbound are all available there. A second region for resilience isn’t worth the extra setup at SMB volume: the failure mode for an SMB is missing a price move for a day, not a regional outage. One AWS account dedicated to the monitor (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system. Note the deliberate absence of any write path to a store or catalog: the monitor has no credentials that could change a price anywhere.
Topology
Lambda functions
All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC; outbound fetches go straight to the internet over the Lambda’s default egress.
- drive-sync: EventBridge Scheduler target, fires every 15 minutes. Uses the Google Drive API + Sheets API (service-account credentials in Secrets Manager under pm/drive/sa) to export the watch-list sheet as CSV and write to s3://pm-watchlist-source/watchlist.csv only if the sheet has changed since the last sync. The same pattern syncs the rules and voice docs to s3://pm-rules-source/. Memory: 256 MB. Timeout: 30 s.
- intake-link-reader: invoked by the Slack /watch slash command (via the action-handler Function URL) or by a dedicated channel webhook. Fetches the pasted URL once through the polite-fetch helper, then tries a deterministic price rule (JSON-LD offers.price, schema.org microdata, OpenGraph product:price:amount, then a labelled-element heuristic). On miss, calls Bedrock Haiku 4.5 (anthropic.claude-haiku-4-5-20251001-v1:0 via global.anthropic.claude-haiku-4-5-20251001-v1:0) with the cleaned visible text to propose a price and a saved selector rule. Posts the proposal to Slack with Approve/Edit/Discard. Memory: 512 MB. Timeout: 60 s.
- catalog-sync: EventBridge Scheduler target, hourly. Reads the catalog export (store API or a nightly CSV drop in s3://pm-catalog-source/) for products tagged to watch, finds newly tagged rows with a competitor URL, and routes each through the same propose-and-approve flow as the link reader. Read-only against the catalog; it never writes back. Memory: 256 MB. Timeout: 30 s.
- checker: EventBridge Scheduler targets, staggered per page across the day (the schedule expressions run in TZ_NAME set to the SMB’s timezone, e.g. Asia/Singapore). For each page due, runs the polite-fetch helper, applies the saved price rule, and writes a reading to pm-readings. Then runs the watcher logic inline: computes the percent change against the last reading, reads pm-alerts and pm-mute, and decides whether the page has moved. Emits one event per page that needs action: pm.first_alert, pm.repeat_move, or pm.big_swing, with the page context as the payload. Steady pages emit nothing. If the saved rule misses, enqueues a re-read job to intake-link-reader instead of alerting. Memory: 512 MB. Timeout: 60 s. No Bedrock calls on the read/compare path.
- dispatch: EventBridge rule on the three move events. Resolves owner, checks quiet hours and the per-owner daily cap, formats the alert from the voice template (including a short text sparkline from the last N readings), and ships via Slack chat.postMessage (pm/slack/bot-token in Secrets Manager) or SES SendRawEmail. On a quiet-hours defer, creates a one-off EventBridge Scheduler schedule that re-invokes dispatch at the next business minute. On a cap hit, appends to that owner’s digest buffer instead of sending (big swings bypass the cap). Writes a row to pm-alerts after a successful send. Memory: 256 MB. Timeout: 30 s.
- action-handler: Lambda Function URL, public with AuthType: NONE; verifies a Slack signature on the request body. Triggered by Slack interactive button clicks (Note/Mute/Stop), the /watch slash command, and email-link clicks. Writes to pm-mute and pm-audit; on stop, flags the watch-list row via the Sheets API. Holds no credential that can edit any price on any store. Memory: 256 MB. Timeout: 15 s.
- digest: EventBridge Scheduler target, weekly Sunday 6pm plus a daily flush of any capped alerts. Reads pm-alerts and pm-readings; sends a digest to a configured Slack channel summarizing moves, capped alerts, and any pages that failed to read. No Bedrock; a plain summary table. Memory: 256 MB.
- summary: EventBridge Scheduler target, monthly on the first Monday at 9am. Reads the past month’s pm-readings, pm-alerts, and pm-audit; calls Bedrock Haiku 4.5 to write a one-paragraph narrative of how the market moved and where you sit; emails it via SES to the configured stakeholder list. Memory: 512 MB.
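The checker’s inline watcher logic can be sketched roughly as below. This is a minimal illustration, not the production code: the threshold values, the PageState shape, and the ordering of the mute and big-swing checks are all assumptions made for the example. Only the three event names come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the real default lives in Parameter Store.
MOVE_THRESHOLD_PCT = 2.0
BIG_SWING_PCT = 10.0


@dataclass
class PageState:
    """Assumed view of what the checker has loaded for one page."""
    last_price: Optional[float]  # most recent price in pm-readings
    already_alerted: bool        # a row exists in pm-alerts for this page
    muted: bool                  # an active row exists in pm-mute


def percent_change(old: float, new: float) -> float:
    """Signed percent change from old to new."""
    return (new - old) / old * 100.0


def classify_move(state: PageState, new_price: float) -> Optional[str]:
    """Return the event name to emit, or None for a steady or muted page."""
    if state.last_price is None or state.last_price <= 0:
        return None  # first reading: nothing to compare against
    pct = abs(percent_change(state.last_price, new_price))
    if pct < MOVE_THRESHOLD_PCT or state.muted:
        return None
    if pct >= BIG_SWING_PCT:
        return "pm.big_swing"
    return "pm.repeat_move" if state.already_alerted else "pm.first_alert"
```

Returning None for steady pages mirrors the rule that they emit nothing, so the dispatch Lambda only ever wakes up for pages that need a decision.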
Polite-fetch policy
The single shared fetch helper used by checker and intake-link-reader enforces the etiquette so no individual function can forget it:
- robots.txt. Parsed and cached per host (TTL 24h in pm-robots); a disallowed path is never fetched and the page is flagged in the watch list as blocked. Any Crawl-delay is honored as a floor on that host’s interval.
- Identification. A descriptive User-Agent naming the business and a contact URL. No browser impersonation, no rotating agents.
- Rate. One request per page per scheduled run, staggered across the day; a per-host concurrency of 1 with a minimum gap so two pages on the same host never fire together.
- Backoff. On a 429 or 5xx, exponential backoff with jitter; after a configurable number of consecutive failures the page is paused (a paused flag in pm-readings state) and surfaced in the weekly digest rather than retried in a tight loop.
- Footprint. Plain HTTP fetch with a small response cap; no headless browser, no JavaScript execution unless a per-page render: true flag is set (rare, and rate-limited harder). Conditional requests (ETag/If-Modified-Since) are used where the host supports them.
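The policy decisions above can be sketched with the standard library alone. The agent string, the gap, the backoff constants, and the failure limit are placeholder assumptions; the fetch itself and the pm-robots cache are left out so that only the decision logic is shown.

```python
import random
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration only.
USER_AGENT = "ExamplePriceMonitor/1.0 (+https://example.com/bot)"
MIN_GAP_SECONDS = 10.0
BACKOFF_BASE_SECONDS = 2.0
BACKOFF_CAP_SECONDS = 300.0
MAX_CONSECUTIVE_FAILURES = 5


def load_robots(robots_txt: str) -> RobotFileParser:
    """Parse a robots.txt body that was fetched (and cached) elsewhere."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str) -> bool:
    """True if robots.txt permits our agent to fetch this URL."""
    return parser.can_fetch(USER_AGENT, url)


def host_gap_seconds(parser: RobotFileParser) -> float:
    """Minimum gap between requests, with Crawl-delay as a floor."""
    delay = parser.crawl_delay(USER_AGENT)
    if delay is None:
        return MIN_GAP_SECONDS
    return max(MIN_GAP_SECONDS, float(delay))


def backoff_seconds(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Exponential backoff with full jitter for a 429 or 5xx response."""
    source = rng or random
    ceiling = min(BACKOFF_CAP_SECONDS, BACKOFF_BASE_SECONDS * (2 ** attempt))
    return source.uniform(0.0, ceiling)


def should_pause(consecutive_failures: int) -> bool:
    """After enough failures the page is paused rather than retried."""
    return consecutive_failures >= MAX_CONSECUTIVE_FAILURES
```

Keeping these as small pure functions makes the etiquette easy to unit test without touching the network.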
Storage
- DynamoDB · pm-readings: one row per check. PK page_id; sort key ts; attributes: price, currency, in_stock, http_status, paused. On-demand. TTL on raw readings at 400 days (history beyond that is aggregated into the monthly summary).
- DynamoDB · pm-alerts: one row per dispatch. PK (page_id, move); attributes: alert_date, dispatched_via (slack/email/digest), recipient, old_price, new_price, pct. On-demand.
- DynamoDB · pm-mute: one row per active mute. PK page_id; attributes: mute_until, by_user, reason. On-demand. TTL on mute_until so expired mutes self-clean.
- DynamoDB · pm-audit: one row per write action of any kind. PK (page_id, ts); attributes: action (note/mute/stop/approve), by_user, before, after, note. On-demand. No TTL; this is the long-term decision trail.
- S3 · pm-watchlist-source: mirrored CSV from the Drive watch-list sheet. Versioning enabled. Lifecycle to Glacier at 90 days.
- S3 · pm-rules-source: mirrored rules and voice docs as plain text. Versioning enabled.
- S3 · pm-snapshots: the saved HTML snapshot from the last successful read of each page, used by the re-read lane when a layout breaks the saved rule. Lifecycle expiry at 30 days.
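As an illustration of the pm-readings row and its 400-day TTL, here is a minimal sketch of building one item. DynamoDB TTL expects a Number attribute holding epoch seconds; the attribute name expires_at and the choice of epoch seconds for ts are assumptions, since the schema above does not pin them down.

```python
import time
from decimal import Decimal
from typing import Optional

READING_TTL_DAYS = 400
SECONDS_PER_DAY = 86_400


def build_reading_item(
    page_id: str,
    price: str,
    currency: str,
    in_stock: bool,
    http_status: int,
    now: Optional[int] = None,
) -> dict:
    """Build one pm-readings row; price arrives as a string to avoid float drift."""
    ts = int(time.time()) if now is None else now
    return {
        "page_id": page_id,
        "ts": ts,                   # sort key (assumed epoch seconds)
        "price": Decimal(price),    # boto3's resource API rejects float
        "currency": currency,
        "in_stock": in_stock,
        "http_status": http_status,
        "paused": False,
        # Hypothetical TTL attribute name; DynamoDB deletes the row after this.
        "expires_at": ts + READING_TTL_DAYS * SECONDS_PER_DAY,
    }
```

The resulting dict could be handed to a boto3 Table resource via put_item(Item=...), assuming TTL is enabled on the chosen attribute.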
Bedrock
- Foundation model. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0. Two callsites: intake-link-reader for proposing a price rule when the deterministic rules miss, and summary for the monthly narrative. A heavier reasoning model (anthropic.claude-sonnet-4-6) is not used here; reading a price off a page is a Haiku-class task.
- Embeddings. Not used. The watch list is structured rows and price extraction is deterministic-first; no vector retrieval, no Knowledge Base, no S3 Vectors.
- Quotas. Default account quotas are more than enough at SMB volume. The routine check doesn’t call Bedrock; the re-read lane fires only when a layout changes.
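The deterministic-first ordering that keeps Bedrock off the routine path might look something like this. The sketch covers only two of the four rules (JSON-LD offers.price, then OpenGraph product:price:amount); the microdata and labelled-element steps are omitted, and the class and function names are hypothetical. A None result marks the miss where the Haiku callsite would take over.

```python
import json
from html.parser import HTMLParser
from typing import Any, List, Optional


class _PriceSignals(HTMLParser):
    """Collects JSON-LD script bodies and the OpenGraph price meta tag."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld: List[str] = []
        self.og_price: Optional[str] = None
        self._in_json_ld = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and attr.get("type") == "application/ld+json":
            self._in_json_ld = True
            self.json_ld.append("")
        elif tag == "meta" and attr.get("property") == "product:price:amount":
            self.og_price = attr.get("content")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False

    def handle_data(self, data):
        if self._in_json_ld:
            self.json_ld[-1] += data


def _offer_price(node: Any) -> Optional[str]:
    """Search a decoded JSON-LD value for offers.price, including @graph."""
    if isinstance(node, list):
        for child in node:
            found = _offer_price(child)
            if found is not None:
                return found
    elif isinstance(node, dict):
        offers = node.get("offers")
        for offer in offers if isinstance(offers, list) else [offers]:
            if isinstance(offer, dict) and offer.get("price") is not None:
                return str(offer["price"])
        return _offer_price(node.get("@graph"))
    return None


def extract_price(html: str) -> Optional[str]:
    """Try JSON-LD first, then OpenGraph; None means fall through to the model."""
    parser = _PriceSignals()
    parser.feed(html)
    for raw in parser.json_ld:
        try:
            found = _offer_price(json.loads(raw))
        except json.JSONDecodeError:
            continue  # malformed block: try the next signal
        if found is not None:
            return found
    return parser.og_price
```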
EventBridge Scheduler config
- pm-check-{bucket}: a small set of staggered rate(...) schedules (e.g. four buckets at rate(6 hours) offset by 90 minutes) so pages spread across the day. Target: checker Lambda with the bucket id as input.
- pm-drive-sync: rate(15 minutes). Target: drive-sync Lambda.
- pm-catalog-sync: rate(1 hour). Target: catalog-sync Lambda.
- pm-weekly-digest: cron(0 18 ? * SUN *) in TZ, plus pm-daily-flush cron(0 17 * * ? *) for capped alerts. Target: digest Lambda.
- pm-monthly-summary: cron(0 9 ? * 2#1 *) (first Monday at 9am) in TZ. Target: summary Lambda.
- One-off schedules: created on the fly by dispatch for quiet-hours defers. Use at(YYYY-MM-DDTHH:MM:SS) with --action-after-completion DELETE so the schedule self-cleans.
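A rough sketch of the quiet-hours defer: work out when the quiet window ends in local wall-clock time, then build the arguments for a one-off schedule. The helper names and the example window are assumptions, and weekends or holidays are not handled here. The dict keys match the EventBridge Scheduler CreateSchedule parameters, so it could be passed to a boto3 scheduler client as create_schedule(**kwargs).

```python
from datetime import datetime, time, timedelta


def in_quiet_hours(t: time, start: time, end: time) -> bool:
    """True if t falls inside the quiet window, which may wrap past midnight."""
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def next_send_time(now: datetime, start: time, end: time) -> datetime:
    """Return now if sending is allowed, else the minute the window ends."""
    if not in_quiet_hours(now.time(), start, end):
        return now
    candidate = now.replace(hour=end.hour, minute=end.minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # window ends tomorrow morning
    return candidate


def one_off_schedule_kwargs(
    name: str,
    run_at: datetime,
    tz_name: str,
    target_arn: str,
    role_arn: str,
    payload: str,
) -> dict:
    """Arguments for a self-deleting one-off schedule that re-invokes dispatch."""
    return {
        "Name": name,
        # run_at is local wall-clock time; Scheduler applies tz_name to it.
        "ScheduleExpression": f"at({run_at.strftime('%Y-%m-%dT%H:%M:%S')})",
        "ScheduleExpressionTimezone": tz_name,
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "ActionAfterCompletion": "DELETE",
        "Target": {"Arn": target_arn, "RoleArn": role_arn, "Input": payload},
    }
```

Passing the timezone name to Scheduler instead of converting to UTC in code keeps the at(...) expression readable in the console and avoids a second source of timezone truth.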
SES outbound and Slack
- SES outbound for the email-fallback alerts and the monthly summary: verify a sender identity at monitor@your-company.com with DKIM and SPF on the parent domain. Out of sandbox by request. No SES inbound is used; this system has no email intake lane.
- Alert messages are posted via the Slack chat.postMessage Web API with Block Kit blocks containing the Note/Mute/Stop buttons. Button clicks and the /watch slash command are sent by Slack to the action-handler Function URL, which verifies the signing secret, parses the action_id (note, mute, stop, watch), opens a modal where needed (Note/Mute open modals; Stop is one-tap), and processes the submission.
- The Slack app needs chat:write, im:write, commands, and the Interactivity URL configured. The bot token lives in Secrets Manager under pm/slack/bot-token; the signing secret under pm/slack/signing-secret.
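Because the Function URL is public, the signature check is the only gate in front of action-handler. A minimal sketch of Slack’s v0 signing scheme follows; the function name and the injected clock are assumptions for testability. The timestamp and signature come from the X-Slack-Request-Timestamp and X-Slack-Signature headers, and body must be the raw request body exactly as received (decoded first if the Function URL event marks it as base64).

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 60 * 5  # reject requests older than five minutes


def verify_slack_signature(
    signing_secret: str,
    timestamp: str,
    body: str,
    signature: str,
    now: Optional[float] = None,
) -> bool:
    """Check an incoming request against Slack's v0 HMAC-SHA256 signature."""
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if abs(current - sent_at) > MAX_SKEW_SECONDS:
        return False  # stale or replayed request
    base = f"v0:{timestamp}:{body}".encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256).hexdigest()
    expected = f"v0={digest}"
    # Constant-time comparison on bytes to avoid timing leaks.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

A handler would return 401 on a False result before parsing the payload at all, which is also what feeds the signature-failure alarm.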
IAM (least privilege per Lambda)
Each Lambda has its own role with policies scoped to exact ARNs. Sketch:
- checker role: s3:GetObject on the watch-list, rules, and snapshot keys; s3:PutObject on pm-snapshots; dynamodb:Query + PutItem on pm-readings, dynamodb:Query on pm-alerts and pm-mute; events:PutEvents on the default bus. No bedrock:*. Outbound internet for fetches.
- dispatch role: scheduler:CreateSchedule for the deferred-dispatch one-offs, plus iam:PassRole on the execution role those schedules use to invoke the target; secretsmanager:GetSecretValue on the Slack bot token; ses:SendRawEmail from the verified sender identity; dynamodb:PutItem on pm-alerts; outbound network to slack.com.
- action-handler role: dynamodb:PutItem on pm-mute and pm-audit; secretsmanager:GetSecretValue on the Sheets-API service-account secret; outbound network to sheets.googleapis.com; lambda:InvokeFunction on intake-link-reader for the /watch command. No store or catalog write scope.
- intake-link-reader role: s3:GetObject/PutObject on pm-snapshots; bedrock:InvokeModel on the Haiku ARN; secretsmanager:GetSecretValue on the Slack bot token; outbound internet for fetches.
- drive-sync and catalog-sync roles: secretsmanager:GetSecretValue on the Google service-account secret; s3:PutObject on the watch-list and rules buckets; s3:GetObject on pm-catalog-source (catalog-sync only, read-only); outbound network to www.googleapis.com.
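To make "scoped to exact ARNs" concrete, here is one way the checker role’s policy document could be expressed as a Python dict, ready for json.dumps or an IaC construct. The statement IDs, the object-key patterns, and the function name are assumptions; the actions and resource names follow the sketch above.

```python
import json


def checker_policy(region: str, account_id: str) -> dict:
    """Build an illustrative least-privilege policy for the checker role."""
    table = f"arn:aws:dynamodb:{region}:{account_id}:table"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSources",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [
                    "arn:aws:s3:::pm-watchlist-source/watchlist.csv",
                    "arn:aws:s3:::pm-rules-source/*",
                    "arn:aws:s3:::pm-snapshots/*",
                ],
            },
            {
                "Sid": "WriteSnapshots",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": ["arn:aws:s3:::pm-snapshots/*"],
            },
            {
                "Sid": "Readings",
                "Effect": "Allow",
                "Action": ["dynamodb:Query", "dynamodb:PutItem"],
                "Resource": [f"{table}/pm-readings"],
            },
            {
                "Sid": "ReadAlertsAndMutes",
                "Effect": "Allow",
                "Action": ["dynamodb:Query"],
                "Resource": [f"{table}/pm-alerts", f"{table}/pm-mute"],
            },
            {
                "Sid": "EmitMoveEvents",
                "Effect": "Allow",
                "Action": ["events:PutEvents"],
                "Resource": [f"arn:aws:events:{region}:{account_id}:event-bus/default"],
            },
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account id.
    print(json.dumps(checker_policy("ap-southeast-1", "123456789012"), indent=2))
```

Note there is no bedrock action and no wildcard resource anywhere in the document, which is the property worth asserting in a deploy-time test.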
Observability and cost gates
- CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Metric filters on "error", "throttle", and "blocked" publish CloudWatch metrics for alerting.
- Alarms: checker failure rate > 5% in 24h (covers a widespread layout break or a host blocking us); dispatch failure rate > 1% in 24h; action-handler signature-verification failures > 5/hour (might mean the Slack secret rotated); a “stale page” metric for any page with no successful read in 48h.
- X-Ray: off by default. Not worth the cost at SMB volume.
- AWS Budgets: $15/month threshold, alarm at 80% and 100%, posts to SNS topic pm-cost-alarm subscribed to the on-call admin’s email and Slack.
Config and secrets
Service-account credentials for Drive, Sheets, and the catalog API live in Secrets Manager under pm/drive/sa and pm/catalog/*. Slack bot token and signing secret live under pm/slack/*. The SES sender identity needs no stored secret; it is governed by IAM and the verified-domain config. The configured timezone, quiet-hours window, per-owner daily cap, default move threshold, and admin fallback owner all live in Parameter Store under /pm/config/. Lambdas fetch config on cold start and cache it for the lifetime of the execution environment.
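The fetch-once-per-cold-start pattern can be sketched with a module-level cache, since module globals survive across invocations within one execution environment. The function names and the injectable loader are assumptions added so the caching can be tested without AWS; the default loader uses the SSM GetParametersByPath paginator from boto3.

```python
from typing import Callable, Dict, Optional

CONFIG_PATH = "/pm/config/"

# Module-level cache: populated on first use, reused until the
# execution environment is recycled.
_cache: Optional[Dict[str, str]] = None


def _load_from_parameter_store() -> Dict[str, str]:
    """Read every parameter under /pm/config/ into a flat dict."""
    import boto3  # imported lazily so this module loads without AWS libraries

    ssm = boto3.client("ssm")
    params: Dict[str, str] = {}
    paginator = ssm.get_paginator("get_parameters_by_path")
    for page in paginator.paginate(Path=CONFIG_PATH, Recursive=True, WithDecryption=True):
        for param in page["Parameters"]:
            params[param["Name"][len(CONFIG_PATH):]] = param["Value"]
    return params


def get_config(loader: Callable[[], Dict[str, str]] = _load_from_parameter_store) -> Dict[str, str]:
    """Return cached config, calling the loader only on the first request."""
    global _cache
    if _cache is None:
        _cache = loader()
    return _cache
```

One consequence of this design is that a config change only takes effect as environments recycle, which is usually acceptable for values like quiet hours or the daily cap.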
Deploy
Whichever IaC you prefer. The opinionated bits: turn on S3 versioning for both pm-watchlist-source and pm-rules-source so a bad Drive edit can be rolled back in one click; keep the polite-fetch helper in a shared layer so etiquette is enforced in one place; and version the EventBridge Scheduler timezone setting so you don’t accidentally start checking on a UTC clock after a CI rotation. CDK with a Python stack file works well; SAM also fits. Total deployable surface: around eight Lambdas, four DDB tables, three S3 buckets, one EventBridge rule on the default bus (plus the Scheduler rules), one SES sender identity, and one Budgets alarm. Note what isn’t in the surface: any credential or path that could change a price.
That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.