A supplier bill matcher on AWS for a few dollars a month
A small business pays more supplier bills than anyone checks carefully. The bill that quietly charges $4.20 a unit when the purchase order said $3.80. The one for 100 boxes when only 84 turned up at the dock. The duplicate that arrives twice and gets paid twice because two people each thought the other had it. The one with no purchase order at all that somebody pays just to make the supplier stop calling. Checking every bill against what you ordered and what actually arrived is slow, dull, and the first thing that slips when the week gets busy. This post walks through the design of a small matcher that reads every supplier bill, lines it up against the purchase order and the goods-received note, checks it line by line, and never pays anything on its own.
Key takeaways
- Three sources for bills: an emailed-PDF lane, a supplier-portal poll, and a manual upload.
- Every bill ends in one of four outcomes: matched, price variance, quantity variance, or no PO.
- The match is a three-way check: the bill against the purchase order and the goods-received note.
- Clean matches clear for one-tap approval; anything off is flagged with the exact line and gap.
- Designed on AWS for about $3.20/month at typical small-business volume. It never pays a bill on its own.
The whole system on one page
Before any code, here’s the shape of what we’re designing.
What you set up once (the outside)
- Orders and receipts. A Google Sheet in a Drive folder with two tabs. The purchase orders tab has one row per ordered line: PO number, supplier, item code, item name, ordered quantity, agreed unit price, and an open/closed flag. The goods-received tab has one row per delivery line: PO number, item code, quantity actually received, and the date it arrived at the dock. You already produce both of these in the normal course of buying things; this just keeps them where the matcher can read them. New bills enter via three lanes covered in Part 2 — an emailed-PDF lane (forward a supplier bill to a dedicated address), a supplier-portal lane (the matcher polls portals that publish bills), and a manual-upload lane.
- A rules doc. One short Google Doc in the same Drive folder. It holds the match tolerances: how far a price or quantity may drift before the matcher flags it. A common set: a billed price may run up to 2% or $5 over the agreed price (whichever is larger) before it counts as a price variance; a billed quantity must match the goods-received note exactly, with a small allowance for agreed part-deliveries. The doc also names the approver per supplier (or per spending category), the spend threshold above which a second approver is required, and the quiet hours for notifications.
- Approvers. The people who actually sign off on paying suppliers — usually a manager or a buyer. Each approver has an email address. Bills land with the supplier name, the bill total, the per-line result against the PO and the goods-received note, the exact gap on any flagged line, and three buttons: Approve, Query, and Reject. A button click never pays the bill directly — it records a decision and, on approve, marks the bill ready for your normal payment run.
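Once parsed, the rules doc could end up as a small structure like the one below. This is a minimal sketch, not the series' actual code: the names `Rules`, `price_within_tolerance`, and `approvers_for`, the $5,000 second-approver threshold, and the email addresses are all hypothetical. It also assumes the tolerance applies to the line total rather than the unit price, and that only overcharges are flagged.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class Rules:
    """Hypothetical in-memory form of the rules doc."""
    price_pct: Decimal = Decimal("0.02")   # 2% of the agreed line total...
    price_abs: Decimal = Decimal("5.00")   # ...or $5, whichever is larger
    second_approver_over: Decimal = Decimal("5000.00")  # assumed threshold
    approvers: dict = field(default_factory=dict)       # supplier -> email
    default_approver: str = "manager@example.com"
    second_approver: str = "owner@example.com"


def price_within_tolerance(billed_total: Decimal, agreed_total: Decimal,
                           rules: Rules) -> bool:
    """True if the billed line total is no more than the allowed drift
    above the agreed total. Undercharges always pass (an assumption)."""
    allowed = max(agreed_total * rules.price_pct, rules.price_abs)
    return billed_total - agreed_total <= allowed


def approvers_for(supplier: str, bill_total: Decimal, rules: Rules) -> list[str]:
    """Pick the supplier's approver, plus a second one above the threshold."""
    first = rules.approvers.get(supplier, rules.default_approver)
    if bill_total > rules.second_approver_over and rules.second_approver != first:
        return [first, rules.second_approver]
    return [first]


rules = Rules(approvers={"Packaging Co": "dan@example.com"})
print(price_within_tolerance(Decimal("4200"), Decimal("3800"), rules))
print(approvers_for("Packaging Co", Decimal("4200"), rules))
```

Because the tolerances are plain data, changing one means editing the doc and re-reading it, with no change to the matching code.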
What runs on every bill (the inside)
- The bill intake. Three sources feed the matcher. Forward a supplier bill PDF to bills@your-company.com and SES writes the raw email to S3. A small portal-poll Lambda checks supplier portals on a schedule and pulls down any new bills. A manual-upload lane lets someone drop a PDF straight into a folder. Whatever the source, the intake runs Textract to read the PDF and one Bedrock Haiku 4.5 call to turn the read text into clean, structured lines (item, quantity, unit price, line total) that the matcher can compare.
- The matcher. Runs as soon as a bill is read. It finds the purchase order the bill refers to (by PO number on the bill, or by supplier and item if the number is missing), pulls the matching goods-received lines, and walks the bill line by line. For each line it checks three things: is this the right item, does the billed quantity match what the dock received, and does the billed price match what the PO agreed, each within the tolerance from the rules doc. It then picks one of four outcomes. Matched: every line agrees inside tolerance. Price variance: items and quantities are fine but a unit price is too high. Quantity variance: a billed quantity doesn’t match the goods received. No PO: there’s no purchase order to match against. The match itself is plain Python; no model decides whether to pay.
- The approval desk. Takes the outcome and sends the bill to the right approver. A clean match goes out as “ready to approve” with a single tap. A flagged bill goes out with the exact problem spelled out: “Line 3, item BOLT-12: billed 100 units, dock received 84.” Both carry Approve, Query, and Reject. Every decision writes a row to DynamoDB so the trail is auditable. A daily sweep re-surfaces any bill sitting unapproved past its due date so nothing quietly ages out. A monthly summary writes a short narrative: bills matched clean, value caught in variances, top suppliers by mismatch.
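The matcher's line-by-line walk can be sketched in a few dozen lines of plain Python. Treat this as an illustration under stated assumptions rather than the real implementation: `BillLine`, `PoLine`, and `match_bill` are hypothetical names, the price tolerance is applied to the line total, the part-delivery allowance is left out, and a billed item that isn't on the PO is treated as "no PO" (the post doesn't say which outcome that case gets). Quantity variance takes precedence over price variance, since a price variance requires the quantities to be fine.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class BillLine:
    item_code: str
    quantity: int
    unit_price: Decimal


@dataclass(frozen=True)
class PoLine:
    item_code: str
    ordered_qty: int       # kept from the PO tab; not used in this sketch
    agreed_price: Decimal


def match_bill(
    bill_lines: list[BillLine],
    po_lines: Optional[list[PoLine]],
    received: dict[str, int],          # item code -> quantity the dock received
    pct: Decimal = Decimal("0.02"),
    abs_tol: Decimal = Decimal("5"),
) -> tuple[str, list[str]]:
    """Return one of the four outcomes plus the exact gap on each flagged line."""
    if not po_lines:
        return "no_po", ["no purchase order found for this bill"]
    po_by_item = {p.item_code: p for p in po_lines}
    qty_flags: list[str] = []
    price_flags: list[str] = []
    for n, line in enumerate(bill_lines, start=1):
        po = po_by_item.get(line.item_code)
        if po is None:
            # Assumption: an item missing from the PO has nothing to match against.
            return "no_po", [f"Line {n}, item {line.item_code}: not on the purchase order"]
        got = received.get(line.item_code, 0)
        if line.quantity != got:
            qty_flags.append(
                f"Line {n}, item {line.item_code}: "
                f"billed {line.quantity} units, dock received {got}"
            )
        agreed_total = po.agreed_price * line.quantity
        over = (line.unit_price - po.agreed_price) * line.quantity
        if over > max(agreed_total * pct, abs_tol):
            price_flags.append(
                f"Line {n}, item {line.item_code}: billed ${line.unit_price:.2f}/unit, "
                f"PO agreed ${po.agreed_price:.2f}/unit, +${over:.2f} on the line"
            )
    if qty_flags:
        return "quantity_variance", qty_flags + price_flags
    if price_flags:
        return "price_variance", price_flags
    return "matched", []


# The short delivery from the approval-desk example (the unit price is made up).
outcome, flags = match_bill(
    [BillLine("BOLT-12", 100, Decimal("0.50"))],
    [PoLine("BOLT-12", 100, Decimal("0.50"))],
    {"BOLT-12": 84},
)
print(outcome, flags)
```

The function only classifies and describes; deciding what to do with the outcome stays with the approval desk and, ultimately, a person.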
In plain words
A bill arrives from your packaging supplier for $4,200: 1,000 boxes at $4.20 each. The matcher finds purchase order PO-1182, which ordered 1,000 boxes at $3.80 each, and the goods-received note that says 1,000 boxes actually arrived. The quantity is fine. The price is not — the PO agreed $3.80 and the bill charges $4.20, a $400 overcharge across the line. The matcher marks the bill price variance and emails the buyer, Dan: “Packaging Co bill #5567 — price variance on boxes: billed $4.20/unit, PO agreed $3.80/unit, +$400 on the line. [bill PDF] [PO-1182]” with Approve, Query, and Reject. Dan taps Query; the matcher sends the supplier a templated note asking them to confirm the agreed price or reissue. The supplier reissues at $3.80. The corrected bill comes back in, matches clean, and Dan approves it with one tap. It joins the next payment run. The overcharge never got paid.
The cost of running this is about $3.20 a month at SMB volume. The cost of not running it is the steady drip of small overcharges nobody catches, the occasional duplicate paid twice, and the bill for goods that never fully arrived — each one small, all of them adding up.
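The arithmetic behind the packaging example is small enough to spell out. The sketch below recomputes the gap on the boxes line and builds a notification line close to the one Dan receives; `line_gap` and `variance_note` are hypothetical helper names, and the message wording only approximates the email text quoted above.

```python
from decimal import Decimal


def line_gap(qty: int, billed_unit: Decimal, agreed_unit: Decimal) -> Decimal:
    """Overcharge on one line: billed total minus agreed total."""
    return billed_unit * qty - agreed_unit * qty


def variance_note(supplier: str, bill_no: str, item_name: str, qty: int,
                  billed_unit: Decimal, agreed_unit: Decimal) -> str:
    """One-line summary of a price variance for the approver's email."""
    gap = line_gap(qty, billed_unit, agreed_unit)
    return (
        f"{supplier} bill #{bill_no}: price variance on {item_name}, "
        f"billed ${billed_unit:.2f}/unit, PO agreed ${agreed_unit:.2f}/unit, "
        f"+${gap:,.2f} on the line"
    )


# 1,000 boxes billed at $4.20 against PO-1182's agreed $3.80.
gap = line_gap(1000, Decimal("4.20"), Decimal("3.80"))
note = variance_note("Packaging Co", "5567", "boxes", 1000,
                     Decimal("4.20"), Decimal("3.80"))
print(gap)
print(note)
```

Using `Decimal` instead of floats keeps money amounts exact, so a gap of exactly $400 stays exactly $400.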
Design rules that shaped every decision
- Every bill gets a three-way match: the bill itself, the purchase order, and what the dock actually received. Never one-sided.
- Four outcomes, always. Matched, price variance, quantity variance, no PO. There is no fifth.
- The match is plain Python against tolerances in a doc. No model decides whether a bill is right.
- It never pays a bill on its own. A human approves every payment, and the bill carries the exact gap.
- Tolerances and approvers live in a doc. Changing a price tolerance or a sign-off doesn’t need a deploy.
- Every decision is logged. Audit a payment next year and you can see who approved what, and why.
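To make the last rule concrete, here is one possible shape for a logged decision. It is a sketch under assumptions: the field names, the `record_decision` helper, and the choice of `bill_id` plus timestamp as the key are hypothetical, not the series' actual schema. Amounts are kept as `Decimal` because boto3's DynamoDB resource interface rejects Python floats.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional

ACTIONS = {"approve", "query", "reject"}


@dataclass(frozen=True)
class Decision:
    bill_id: str          # assumed partition key
    decided_at: str       # ISO 8601 UTC timestamp, assumed sort key
    approver: str
    action: str           # one of ACTIONS
    outcome: str          # the matcher's outcome when the decision was made
    bill_total: Decimal
    reason: str           # the exact gap, or a free-text note


def record_decision(bill_id: str, approver: str, action: str, outcome: str,
                    bill_total: Decimal, reason: str,
                    now: Optional[datetime] = None) -> dict:
    """Build the audit row for one button click. Nothing here pays a bill."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    when = now or datetime.now(timezone.utc)
    return asdict(Decision(bill_id, when.isoformat(), approver, action,
                           outcome, bill_total, reason))


item = record_decision(
    "5567", "dan@example.com", "query", "price_variance",
    Decimal("4200.00"), "billed $4.20/unit, PO agreed $3.80/unit",
)
print(item)
```

Writing the row would then be a single `put_item` call on a DynamoDB table; the table name and key design are left open here.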
Why this shape
Most small teams check supplier bills in one of three ways: somebody eyeballs the total and pays it, somebody compares the bill to the purchase order but not to what arrived, or nobody checks at all and the bookkeeper pays whatever lands. Eyeballing the total misses a wrong unit price that still produces a plausible-looking number. Checking only the PO misses the 16 boxes that never showed up. And paying whatever lands is how a supplier’s “oops, our system double-billed you” becomes your money, permanently.
The setup above keeps the two documents you already produce — the purchase order and the goods-received note — as the source of truth, and adds a small system that reads every incoming bill and lines it up against both. Clean bills clear in one tap, so the matcher saves time on the 90% that are fine. The few that are off get caught with the exact line and the exact gap, while there’s still time to query the supplier. And nothing is ever paid automatically — the matcher proposes, a human decides.
The next four posts walk through each piece in turn: how a supplier bill gets read, how a bill gets matched three ways, how a mismatch reaches the right person, and how a bill gets approved for payment. One diagram per post. A cost breakdown and a final engineering reference at the end.