Part 5 of 7 · Weekly report builder series ~5 min read

How a number that looks off gets flagged

The whole promise of the report is that the owner can trust it. That promise only holds if the builder is honest about the numbers it isn’t sure of. A blank source read as zero, a figure ten times its normal size because a tool double-counted, a sales total that doesn’t match the cash that came in — if any of those gets reported as a clean fact, the report has lied. This post walks through the three kinds of check that catch those before the report is written, and what happens to a number that trips one.

Key takeaways

  • The checks run on the gathered figures before the report is written, all in plain Python.
  • Three kinds of flag: a stale source, a figure out of its normal range, and a total that doesn’t reconcile.
  • A flagged figure is shown but clearly marked unverified — it never appears as a plain fact.
  • Everything flagged is collected into a Needs a look section at the top of the report.
  • Every flag is written to wr-flags so you can see how often a source misbehaves.

Three kinds of flag

[Figure: Three kinds of check on a gathered figure. The gathered figure set (this week's numbers with their sources and last-sync times) passes through three checks: stale source, out of range, and doesn't reconcile. Every flag is written to the wr-flags DynamoDB table (run date, figure, check, expected value, actual value) and collected into the Needs a look section of the report. A flagged number is shown but never stated as fact: the owner decides, the builder never guesses.]
Fig 5. Three kinds of check, three ways a figure can look off. A stale source, a figure out of its normal range, and two numbers that don’t reconcile. Each flag is shown in the report, written to wr-flags, and never stated as plain fact.

Check 1: a source that didn’t update (the most common)

The quietest failure is a source that simply didn’t refresh. The bank export didn’t land this week, so the cash figures are last week’s. The point-of-sale was offline Saturday, so the weekend is missing. The danger isn’t a loud error — it’s that a missing source reads as a zero or a stale repeat, and a zero looks like a real number. Check 1 compares each source’s last sync time against the week the report covers. If a source hasn’t synced through the end of the week, or a figure comes back blank where this business never has a true zero (a shop that always sells something can’t have a $0 sales day), the figure is marked unverified and the source is named in the flag.
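A minimal sketch of what Check 1 could look like, assuming each gathered figure carries its source name and last sync time. The `Figure` fields, the `never_zero` marker, and the flag's dictionary shape are illustrative assumptions, not the builder's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Figure:
    # Hypothetical shape of one gathered figure.
    name: str                 # e.g. "sales_total"
    source: str               # e.g. "bank_export"
    value: Optional[float]    # None when the source came back blank
    last_sync: datetime       # when the source last refreshed
    never_zero: bool = False  # True if this business never has a real zero here


def check_stale(fig: Figure, week_end: datetime) -> Optional[dict]:
    """Return a flag dict if the figure is stale or suspiciously empty, else None."""
    if fig.last_sync < week_end:
        # The source has not synced through the end of the reported week.
        actual = f"last synced {fig.last_sync:%Y-%m-%d %H:%M}"
    elif fig.never_zero and (fig.value is None or fig.value == 0):
        # Blank or zero where this business never has a true zero.
        actual = "blank or zero value"
    else:
        return None
    return {
        "figure": fig.name,
        "check": "stale_source",
        "source": fig.source,  # the source is named in the flag
        "expected": f"synced through {week_end:%Y-%m-%d} with a real value",
        "actual": actual,
    }
```

Returning `None` for a clean figure keeps the caller simple: run the check on every figure and keep whatever comes back.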

This is the check that pairs with the completeness gate from Part 4. The gate tries to hold the send until everything is in; if the data still isn’t complete by the last retry, this check makes sure the partial figure is labelled rather than reported straight.

Check 2: a figure far outside its normal range

Some bad numbers are present, not missing; they’re just wrong. A tool double-counts a batch and sales come back at ten times a normal week. A currency column gets read in cents and every figure is off by a factor of a hundred. Check 2 catches these by comparing each figure against its own recent history. The trailing weeks set a normal range, and a figure trips the flag if it sits more than the configured number of standard deviations outside that range, or if it is a multiple of last week that no real week ever reaches. The figure is still shown, but next to it the report says what was expected: “$184,000 this week. The four-week average is $18,000; this looks off, please check the source.”

The threshold is in the config doc, so a business with genuinely spiky weeks (seasonal, event-driven) can widen the range and not get flagged every busy week. The goal is to catch the impossible, not to second-guess a real good week.
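Check 2 can be sketched with nothing beyond the standard library. The parameter names `max_sigma` and `max_multiple` and their defaults are assumptions standing in for whatever the config doc actually holds, and the message wording only loosely follows the example in the text.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def check_out_of_range(
    name: str,
    value: float,
    trailing: Sequence[float],   # trailing weeks, oldest first; last item is last week
    max_sigma: float = 3.0,      # hypothetical config value
    max_multiple: float = 5.0,   # hypothetical config value
) -> Optional[dict]:
    """Return a flag dict if value is far outside recent history, else None."""
    if len(trailing) < 2:
        return None  # stdev needs at least two points; too little history to judge
    expected = mean(trailing)
    spread = stdev(trailing)
    last_week = trailing[-1]
    # If every trailing week is identical, spread is 0 and the sigma test is skipped.
    too_far = spread > 0 and abs(value - expected) > max_sigma * spread
    too_big = last_week > 0 and value > max_multiple * last_week
    if not (too_far or too_big):
        return None
    return {
        "figure": name,
        "check": "out_of_range",
        "expected": expected,
        "actual": value,
        "message": (
            f"${value:,.0f} this week; the {len(trailing)}-week average is "
            f"${expected:,.0f}; this looks off, please check the source."
        ),
    }
```

A business with spiky weeks would pass a larger `max_sigma`, so a genuinely good week falls inside the widened range while a tenfold jump still trips the multiple test.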

Check 3: numbers that should agree but don’t

The strongest check is one number against another. Sales recorded in the sales sheet should roughly match the cash that came in through Stripe and the bank. The parts of a total — sales by category, or by location — should sum to the total. Check 3 compares the pairs that ought to agree and flags any gap bigger than a small tolerance. When sales say $20,000 but cash-in says $14,000, that’s not necessarily an error — it could be timing, refunds, or an unpaid invoice — but it’s exactly the kind of thing the owner should see. Both figures are shown side by side with the gap named, and the builder makes no attempt to pick which one is “right.”
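Here is one way the reconciliation check might be written. The 2% relative tolerance is an assumed default, and the function names are hypothetical; the real tolerance would come from configuration.

```python
from typing import Mapping, Optional


def check_pair(
    name_a: str, a: float,
    name_b: str, b: float,
    tolerance: float = 0.02,  # hypothetical: allowed gap as a fraction of the larger figure
) -> Optional[dict]:
    """Flag two figures that should agree but differ by more than the tolerance."""
    gap = a - b
    base = max(abs(a), abs(b))
    if base == 0 or abs(gap) <= tolerance * base:
        return None
    # Both figures are reported side by side; no attempt is made to pick a "right" one.
    return {
        "check": "does_not_reconcile",
        "figures": {name_a: a, name_b: b},
        "gap": gap,
    }


def check_parts(
    total_name: str, total: float,
    parts: Mapping[str, float],
    tolerance: float = 0.02,
) -> Optional[dict]:
    """Flag a total whose parts (by category, by location) do not sum to it."""
    return check_pair(total_name, total, "sum of parts", sum(parts.values()), tolerance)
```

Called with the figures from the example above, `check_pair("sales", 20000, "cash_in", 14000)` returns a flag carrying both values and the 6,000 gap, and leaves the interpretation to the owner.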

Reconciliation is the check that most often surfaces something real rather than a glitch — a refund batch nobody logged, an invoice that was sent but not paid, a category that stopped reporting. It’s the difference between a report that lists numbers and one that notices when they disagree.

What a flag does to the report

Every flag from all three checks is collected into a Needs a look section that sits at the very top of the report, above the summary. Each entry names the figure, the check that tripped, what was expected, and what was actually there. The flagged figure still appears in the numbers table below, but marked — a small “unverified” tag — so it’s never mistaken for a clean fact. Crucially, a flagged figure is also withheld from the writer’s facts list in Part 3: the model is never handed a number that failed a check, so the summary paragraph can’t accidentally state a bad figure as if the week really went that way.
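The three effects of a flag described above can be sketched in one function. The flag shape used here (a `figures` list plus `check`, `expected`, and `actual`) and the output keys are assumptions for illustration, not the builder's real structures.

```python
from typing import Dict, List


def assemble(figures: Dict[str, float], flags: List[dict]) -> dict:
    """Split gathered figures into the three places a flag affects."""
    # Every figure named by any flag counts as unverified.
    flagged = {name for flag in flags for name in flag["figures"]}

    # 1. "Needs a look" entries: figure, check, expected, actual.
    needs_a_look = [
        f"{', '.join(flag['figures'])}: {flag['check']} "
        f"(expected {flag['expected']}, got {flag['actual']})"
        for flag in flags
    ]

    # 2. Numbers table: flagged figures still appear, but tagged.
    table = [
        f"{name}: {value:,.0f}" + (" [unverified]" if name in flagged else "")
        for name, value in figures.items()
    ]

    # 3. Facts list for the writer: flagged figures are withheld entirely.
    facts = {name: value for name, value in figures.items() if name not in flagged}

    return {"needs_a_look": needs_a_look, "table": table, "facts": facts}
```

Because the facts list is built by exclusion, the model cannot repeat a figure it was never handed, which is the property the summary paragraph depends on.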

Every flag is also written to the wr-flags DynamoDB table with the run date, the figure, the check, the expected value, and the actual value. Over a few months that table tells its own story: the bank export that’s late one Monday in three, the point-of-sale that drops every long weekend. A source that keeps tripping the same flag needs fixing upstream, and the table is the evidence for that conversation.
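A rough sketch of the write to wr-flags, under stated assumptions: the key layout (`run_date` as partition key, a combined `figure_check` as sort key) and the attribute names are hypothetical, since the table's real schema isn't given here. Numbers are converted to `Decimal` because the boto3 DynamoDB resource does not accept Python floats.

```python
from datetime import date
from decimal import Decimal


def flag_to_item(run_date: date, figure: str, check: str,
                 expected: float, actual: float) -> dict:
    """Build one wr-flags item. Attribute names are assumptions."""
    return {
        "run_date": run_date.isoformat(),     # assumed partition key
        "figure_check": f"{figure}#{check}",  # assumed sort key
        "figure": figure,
        "check": check,
        "expected": Decimal(str(expected)),   # floats must become Decimal for DynamoDB
        "actual": Decimal(str(actual)),
    }


def write_flag(table, item: dict) -> None:
    """Write one flag. `table` is any object with put_item(Item=...),
    such as a boto3 DynamoDB Table resource."""
    table.put_item(Item=item)


# Usage with boto3 (requires AWS credentials and an existing table):
#   import boto3
#   table = boto3.resource("dynamodb").Table("wr-flags")
#   write_flag(table, flag_to_item(date.today(), "sales_total",
#                                  "out_of_range", 18000.0, 184000.0))
```

Keeping the item builder separate from the write makes the shape easy to test without touching AWS.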

Next post: the cost breakdown. The whole pipeline above runs in coffee-money territory at SMB volume; Part 6 explains exactly where the dollars go and why the one model call a week is the biggest single line.
