Part 7 of 7 · Price monitor series ~8 min read

Engineering reference: the price monitor architecture

Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the polite-fetch policy, EventBridge Scheduler config, the DynamoDB schemas, and the Slack interactive flow. Read alongside the previous six posts; this one’s the build sheet.

Region and account shape

Default region: ap-southeast-1 (Singapore). Bedrock cross-Region inference, EventBridge Scheduler, and SES outbound are all in good shape there. A second region for resilience isn’t worth the extra setup at SMB volume — the failure mode for an SMB is missing a price move for a day, not a regional outage. One AWS account dedicated to the monitor (separate from your other workloads) keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system. Note the deliberate absence of any write path to a store or catalog — the monitor has no credentials that could change a price anywhere.

Topology

AWS topology of the price monitor A topology diagram with three regions stacked vertically inside one AWS account boundary. Top region: ingress. Three boxes show the three intake lanes — a Drive sheet sync via the drive-sync Lambda triggered every 15 minutes by EventBridge Scheduler that mirrors the watch-list CSV to s3://pm-watchlist-source/, a paste-a-link reader Lambda intake-link-reader that fetches a pasted URL once, tries a plain price rule, and falls back to Bedrock Haiku 4.5 to propose a saved rule for Slack approval, and a catalog-sync Lambda triggered hourly by EventBridge Scheduler that matches tagged catalog products to competitor pages and proposes rows the same way. Middle region: scheduled processing. The checker Lambda is triggered on each page's stagger by EventBridge Scheduler; it fetches the page politely, reads the price with the saved rule, writes a reading to DynamoDB pm-readings, and hands off to the watcher logic that computes the percent change, looks up the threshold in s3://pm-rules-source/rules.txt, reads alert and mute state from DynamoDB, and emits one of three events to the EventBridge default bus per page that needs an alert: pm.first_alert, pm.repeat_move, or pm.big_swing. Bottom region: dispatch and handling. The dispatch Lambda is triggered by an EventBridge rule on those three event types; it resolves the owner, checks quiet hours and the daily cap, fetches the alert template from s3://pm-rules-source/voice.txt, posts the message to Slack via chat.postMessage with Note, Mute, and Stop buttons or sends an email via SES outbound, and writes a row to DynamoDB pm-alerts. Slack interactive button clicks land on a Function URL Lambda action-handler that updates pm-mute and pm-audit and, on stop, flags the watch-list row via the Google Sheets API. CloudWatch Logs collects from every Lambda at 7-day retention. Across the right edge: a small box labelled AWS Budgets alarm at $15 monthly threshold, posting to SNS topic pm-cost-alarm. A note at the bottom: the monitor only suggests — every decision is logged to pm-audit and no price is ever changed. Ingress Lambda · drive-sync every 15 min Sheets API → s3://pm-watchlist-source/ watchlist.csv Lambda · link reader intake-link-reader fetch once, plain rule else Haiku 4.5 rule → Slack proposal Lambda · catalog-sync hourly poll match tagged products to competitor pages → Slack proposal Drive watch list canonical store · mirrored to S3 Scheduled processing EventBridge Scheduler staggered per page in TZ_NAME target: checker Lambda + deferred one-offs Lambda · checker polite fetch + read + rules.txt + voice.txt compares change, picks one of four moves EventBridge default bus pm.first_alert pm.repeat_move pm.big_swing (steady → no event) Dispatch & handling Lambda · dispatch resolves owner, quiet hours, daily cap; Slack message or SES outbound Slack interactive DM with [Note] [Mute] [Stop] button clicks → Function URL Lambda · action-handler writes pm-mute, pm-audit, and on stop flags the row via Sheets API The monitor only suggests — every decision is logged to pm-audit and no price is ever changed.
Fig 7. AWS topology, in three regions of the diagram: ingress (three lanes into the watch list), scheduled processing (the staggered checks emitting events), dispatch and handling (the alert ships and the owner’s response is recorded). Every Lambda is event- or schedule-driven; nothing is synchronous-chained.

Lambda functions

All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC; outbound fetches go straight to the internet over the Lambda’s default egress.

  • drive-sync — EventBridge Scheduler target, fires every 15 minutes. Uses the Google Drive API + Sheets API (service-account credentials in Secrets Manager under pm/drive/sa) to export the watch-list sheet as CSV and write to s3://pm-watchlist-source/watchlist.csv only if the sheet has changed since the last sync. The same pattern syncs the rules and voice docs to s3://pm-rules-source/. Memory: 256 MB. Timeout: 30 s.
  • intake-link-reader — invoked by the Slack /watch slash command (via the action-handler Function URL) or by a dedicated channel webhook. Fetches the pasted URL once through the polite-fetch helper, tries a deterministic price rule (JSON-LD offers.price, schema.org microdata, OpenGraph product:price:amount, then a labelled-element heuristic). On miss, calls Bedrock Haiku 4.5 (anthropic.claude-haiku-4-5-20251001-v1:0 via global.anthropic.claude-haiku-4-5-20251001-v1:0) with the cleaned visible text to propose a price and a saved selector rule. Posts the proposal to Slack with Approve/Edit/Discard. Memory: 512 MB. Timeout: 60 s.
  • catalog-sync — EventBridge Scheduler target, hourly. Reads the catalog export (store API or a nightly CSV drop in s3://pm-catalog-source/) for products tagged to watch, finds newly tagged rows with a competitor URL, and routes each through the same propose-and-approve flow as the link reader. Read-only against the catalog — it never writes back. Memory: 256 MB. Timeout: 30 s.
  • checker — EventBridge Scheduler targets, staggered per page across the day (the schedule expressions run in TZ_NAME set to the SMB’s timezone, e.g. Asia/Singapore). For each page due, runs the polite-fetch helper, applies the saved price rule, and writes a reading to pm-readings. Then runs the watcher logic inline: computes the percent change against the last reading, reads pm-alerts and pm-mute, decides on a move. Emits one event per page that needs action: pm.first_alert, pm.repeat_move, or pm.big_swing, with the page context as the payload. Steady pages emit nothing. If the saved rule misses, enqueues a re-read job to intake-link-reader instead of alerting. Memory: 512 MB. Timeout: 60 s. No Bedrock calls on the read/compare path.
  • dispatch — EventBridge rule on the three move events. Resolves owner, checks quiet hours and the per-owner daily cap, formats the alert from the voice template (including a short text sparkline from the last N readings), and ships via Slack chat.postMessage (pm/slack/bot-token in Secrets Manager) or SES SendRawEmail. On a quiet-hours defer, creates a one-off EventBridge Scheduler rule that re-invokes dispatch at the next business minute. On a cap hit, appends to that owner’s digest buffer instead of sending (big swings bypass the cap). Writes a row to pm-alerts after a successful send. Memory: 256 MB. Timeout: 30 s.
  • action-handler — Lambda Function URL, public with AuthType: NONE; verifies a Slack signature on the request body. Triggered by Slack interactive button clicks (Note/Mute/Stop), the /watch slash command, and email-link clicks. Writes to pm-mute and pm-audit; on stop, flags the watch-list row via the Sheets API. Holds no credential that can edit any price on any store. Memory: 256 MB. Timeout: 15 s.
  • digest — EventBridge Scheduler target, weekly Sunday 6pm plus a daily flush of any capped alerts. Reads pm-alerts and pm-readings; sends a digest to a configured Slack channel summarizing moves, capped alerts, and any pages that failed to read. No Bedrock; a plain summary table. Memory: 256 MB.
  • summary — EventBridge Scheduler target, monthly on the first Monday at 9am. Reads the past month’s pm-readings, pm-alerts, and pm-audit; calls Bedrock Haiku 4.5 to write a one-paragraph narrative of how the market moved and where you sit; emails it via SES to the configured stakeholder list. Memory: 512 MB.

Polite-fetch policy

The single shared fetch helper used by checker and intake-link-reader enforces the etiquette so no individual function can forget it:

  • robots.txt. Parsed and cached per host (TTL 24h in pm-robots); a disallowed path is never fetched and the page is flagged in the watch list as blocked. Any Crawl-delay is honored as a floor on that host’s interval.
  • Identification. A descriptive User-Agent naming the business and a contact URL. No browser-impersonation, no rotating agents.
  • Rate. One request per page per scheduled run, staggered across the day; a per-host concurrency of 1 with a minimum gap so two pages on the same host never fire together.
  • Backoff. On a 429 or 5xx, exponential backoff with jitter; after a configurable number of consecutive failures the page is paused (a paused flag in pm-readings state) and surfaced in the weekly digest rather than retried in a tight loop.
  • Footprint. Plain HTTP fetch with a small response cap; no headless browser, no JavaScript execution unless a per-page render: true flag is set (rare, and rate-limited harder). Conditional requests (ETag/If-Modified-Since) are used where the host supports them.

Storage

  • DynamoDB · pm-readings — one row per check. PK page_id; sort key ts; attributes: price, currency, in_stock, http_status, paused. On-demand. TTL on raw readings at 400 days (history beyond that is aggregated into the monthly summary).
  • DynamoDB · pm-alerts — one row per dispatch. PK (page_id, move); attributes: alert_date, dispatched_via (slack/email/digest), recipient, old_price, new_price, pct. On-demand.
  • DynamoDB · pm-mute — one row per active mute. PK page_id; attributes: mute_until, by_user, reason. On-demand. TTL on mute_until so expired mutes self-clean.
  • DynamoDB · pm-audit — one row per write action of any kind. PK (page_id, ts); attributes: action (note/mute/stop/approve), by_user, before, after, note. On-demand. No TTL — long-term decision trail.
  • S3 · pm-watchlist-source — mirrored CSV from the Drive watch-list sheet. Versioning enabled. Lifecycle to Glacier at 90 days.
  • S3 · pm-rules-source — mirrored rules and voice docs as plain text. Versioning enabled.
  • S3 · pm-snapshots — the saved HTML snapshot from the last successful read of each page, used by the re-read lane when a layout breaks the saved rule. Lifecycle expiry at 30 days.

Bedrock

  • Foundation model. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0. Two callsites: intake-link-reader for proposing a price rule when the deterministic rules miss, and summary for the monthly narrative. A heavier reasoning model (anthropic.claude-sonnet-4-6) is not used here — reading a price off a page is a Haiku-class task.
  • Embeddings. Not used. The watch list is structured rows and price extraction is deterministic-first; no vector retrieval, no Knowledge Base, no S3 Vectors.
  • Quotas. Default account quotas are more than enough at SMB volume. The routine check doesn’t call Bedrock; the re-read lane fires only when a layout changes.

EventBridge Scheduler config

  • pm-check-{bucket} — a small set of staggered rate(...) schedules (e.g. four buckets at rate(6 hours) offset by 90 minutes) so pages spread across the day. Target: checker Lambda with the bucket id as input.
  • pm-drive-syncrate(15 minutes). Target: drive-sync Lambda.
  • pm-catalog-syncrate(1 hour). Target: catalog-sync Lambda.
  • pm-weekly-digestcron(0 18 ? * SUN *) in TZ, plus pm-daily-flush cron(0 17 * * ? *) for capped alerts. Target: digest Lambda.
  • pm-monthly-summarycron(0 9 ? * 2#1 *) (first Monday at 9am) in TZ. Target: summary Lambda.
  • One-off rules — created on the fly by dispatch for quiet-hours defers. Use at(YYYY-MM-DDTHH:MM:SS) with --action-after-completion DELETE so the rule self-cleans.

SES outbound and Slack

  • SES outbound for the email-fallback alerts and the monthly summary: verify a sender identity at monitor@your-company.com with DKIM and SPF on the parent domain. Out of sandbox by request. No SES inbound is used — this system has no email intake lane.
  • Alert messages are posted via the Slack chat.postMessage Web API with Block Kit blocks containing the Note/Mute/Stop buttons. Button clicks and the /watch slash command are sent by Slack to the action-handler Function URL, which verifies the signing secret, parses the action_id (note, mute, stop, watch), opens a modal where needed (Note/Mute open modals; Stop is one-tap), and processes the submission.
  • The Slack app needs chat:write, im:write, commands, and the Interactivity URL configured. The bot token lives in Secrets Manager under pm/slack/bot-token; the signing secret under pm/slack/signing-secret.

IAM (least privilege per Lambda)

Each Lambda has its own role with policies scoped to exact ARNs. Sketch:

  • checker role: s3:GetObject on the watch-list, rules, and snapshot keys; s3:PutObject on pm-snapshots; dynamodb:Query + PutItem on pm-readings, dynamodb:Query on pm-alerts and pm-mute; events:PutEvents on the default bus. No bedrock:*. Outbound internet for fetches.
  • dispatch role: scheduler:CreateSchedule for the deferred-dispatch one-offs; secretsmanager:GetSecretValue on the Slack bot token; ses:SendRawEmail from the verified sender identity; dynamodb:PutItem on pm-alerts; outbound network to slack.com.
  • action-handler role: dynamodb:PutItem on pm-mute and pm-audit; secretsmanager:GetSecretValue on the Sheets-API service-account secret; outbound network to sheets.googleapis.com; lambda:InvokeFunction on intake-link-reader for the /watch command. No store or catalog write scope.
  • intake-link-reader role: s3:GetObject/PutObject on pm-snapshots; bedrock:InvokeModel on the Haiku ARN; secretsmanager:GetSecretValue on the Slack bot token; outbound internet for fetches.
  • drive-sync and catalog-sync roles: secretsmanager:GetSecretValue on the Google service-account secret; s3:PutObject on the watch-list and rules buckets; s3:GetObject on pm-catalog-source (catalog-sync only, read-only); outbound network to www.googleapis.com.

Observability and cost gates

  • CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Subscription filter on "error" + "throttle" + "blocked" to a CloudWatch metric for alerting.
  • Alarms: checker failure rate > 5% in 24h (covers a widespread layout break or a host blocking us); dispatch failure rate > 1% in 24h; action-handler signature-verification failures > 5/hour (might mean the Slack secret rotated); a “stale page” metric for any page with no successful read in 48h.
  • X-Ray: off by default. Not worth the cost at SMB volume.
  • AWS Budgets: $15/month threshold, alarm at 80% and 100%, posts to SNS topic pm-cost-alarm subscribed to the on-call admin’s email and Slack.

Config and secrets

Service-account credentials for Drive, Sheets, and the catalog API live in Secrets Manager under pm/drive/sa and pm/catalog/*. Slack bot token and signing secret under pm/slack/*. SES sender identity lives in IAM and the verified-domain config. The configured timezone, quiet-hours window, per-owner daily cap, default move threshold, and admin fallback owner all live in Parameter Store under /pm/config/. Lambdas fetch config on cold start and cache for the lifetime of the execution environment.

Deploy

Whichever IaC you prefer. The opinionated bits: turn on S3 versioning for both pm-watchlist-source and pm-rules-source so a bad Drive edit can be rolled back in one click; keep the polite-fetch helper in a shared layer so etiquette is enforced in one place; and version the EventBridge Scheduler timezone setting so you don’t accidentally start checking on a UTC clock after a CI rotation. CDK with a Python stack file works well; SAM also fits. Total deployable surface: around eight Lambdas, four DDB tables, three S3 buckets, one EventBridge rule on the default bus (plus the Scheduler rules), one SES sender identity, and one Budgets alarm. Note what isn’t in the surface: any credential or path that could change a price.

That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.

All posts