How the numbers get gathered
The report is only as good as the numbers behind it. So the first job is reading every source cleanly and turning it into one set of weekly figures, without ever depending on a live system at 7am on a Monday. There are three kinds of source a small business usually has — a sheet someone keeps by hand, a CSV that a tool like Stripe or the bank produces, and a point-of-sale roll-up. Each gets mirrored to S3 on its own schedule, and then the gather step reads them all from one place.
Key takeaways
- Three kinds of source feed one set of figures: a hand-kept sheet, a tool CSV, and a point-of-sale roll-up.
- A small
source-syncLambda mirrors each source to S3 on a schedule, so the Monday run never hits a live API. - The gather step reads everything from S3 and normalizes it into one clean shape.
- It computes this week, last week, and the four-week average, so each figure arrives with its comparison.
- S3 versioning means a bad source edit can be rolled back in one click.
Three sources into one set of figures
Source 1: the hand-kept sheet
The most common source in a small business is a sheet somebody keeps by hand — the sales log, the new-customer list, the running tally of jobs done. The config doc names the sheet, the tab, and which columns hold which figure: which column is the amount, which is the date, which marks a new customer. A small Lambda — source-sync — runs on a schedule, exports the sheet as plain CSV via the Google Sheets API, and writes it to s3://wr-source-data/sales-sheet.csv only if the sheet has changed since the last sync. The gather step reads from S3, not Drive directly. That keeps Sheets API calls predictable and gives you S3 versioning for free, so a bad bulk-edit on Friday can be rolled back in one click on Monday.
This covers the figures a person types in as the week goes on. The builder never edits the sheet — it only reads a copy.
Source 2: the tool CSV (the one most owners already have)
Almost every business has at least one tool that already produces a clean export: Stripe for card payments, the bank for the account statement, the invoicing app for what was billed and what was paid. These arrive as CSV files. You set up the tool (or a tiny scheduled export) to drop the file in a Drive folder, and source-sync mirrors it to s3://wr-source-data/ whenever it changes. The gather step parses the columns named in the config doc — amount, date, customer, status — and pulls only the figures the report needs. No model reads the CSV; it’s plain column-by-column Python, which is exactly what you want for money figures: predictable, checkable, and the same every run.
Because these exports are the system of record for cash, the builder treats them as authoritative for the cash-in and cash-out figures. If the sales sheet and the Stripe export disagree on a total, that disagreement is itself something the checks in Part 5 will surface — rather than the builder silently picking one.
Source 3: the point-of-sale roll-up
A shop or a café has a point-of-sale system that already knows the daily totals and the best-selling items. Most of them can write a daily summary — a small file with the day’s sales, order count, and top products. The config doc points source-sync at wherever that file lands, and it’s pulled to S3 on a schedule like the others. The gather step reads the daily totals for the week and the top-item list, so the report can say not just “sales were up” but “the lunch special drove most of it.”
Point-of-sale is the most optional of the three. A business without one loses nothing; a shop that has one gets the top-items line in the report, which is often the most useful sentence in it.
Why everything funnels into one figure set
Three sources in, but only one set of figures the writer ever sees. That’s deliberate. If the writer had to reach into three different files in three different shapes, every “where did this number come from?” question would mean checking three places and three formats. Normalizing everything into one figure set — with a fixed list of fields and a fixed shape — means there is exactly one place the report’s numbers come from, and exactly one place to look when a figure seems wrong. The gather step does the messy reading; everything downstream works from one clean set.
Next post: how the weekly report gets written — how plain Python computes every comparison first, and how exactly one Bedrock call turns those real numbers into a short paragraph.
All posts