How an expense claim gets checked against policy

Key takeaways

The checker runs the moment a claim is submitted — it’s event-driven, not batched.
Per-category limits live in the policy doc — meals $40/day, taxis $30/trip, software needs sign-off.
Four outcomes per claim: clear, confirm, review, reject candidate.
DynamoDB holds each claimant’s running daily total per category, so a stack of small meals can’t slip under the cap.
The checker never calls a model on the limit comparison. That part is entirely deterministic.

The decision flow, per claim

Fig 3. The checker’s decision tree, per claim. Five steps decide which of four outcomes applies. The policy doc holds every limit; the checker only enforces them.

$40 meals isn’t magic, it’s in the doc

The policy doc has one short section per category. Each section names the rule in plain prose: “Meals: up to $40 per person per day, receipt required. Taxis and rideshare: up to $30 per trip, receipt required. Software and subscriptions: any amount, but needs a manager’s sign-off. Client entertainment: up to $150, needs finance sign-off above $80. Anything over $250 in any category goes to finance.” The numbers are the caps. The phrase “needs sign-off” is what tips a claim from a quick confirm into a full review.
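One way to picture those rules once they are loaded is as a small table of data. The sketch below is illustrative only: the post does not say how the doc is parsed, so the CategoryRule type, the category keys, and the sign_off_reason helper are all hypothetical names. The dollar figures are the ones quoted above.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class CategoryRule:
    cap: Optional[Decimal]                    # None means "any amount"
    receipt_required: bool
    manager_sign_off: bool = False            # sign-off needed at any amount
    finance_sign_off_above: Optional[Decimal] = None


# Figures come from the quoted policy text; key names are hypothetical.
RULES = {
    "meals": CategoryRule(cap=Decimal("40"), receipt_required=True),
    "taxi": CategoryRule(cap=Decimal("30"), receipt_required=True),
    # The quoted text does not mention receipts for the next two categories,
    # so receipt_required=False is an assumption.
    "software": CategoryRule(cap=None, receipt_required=False,
                             manager_sign_off=True),
    "client_entertainment": CategoryRule(cap=Decimal("150"),
                                         receipt_required=False,
                                         finance_sign_off_above=Decimal("80")),
}

# "Anything over $250 in any category goes to finance."
FINANCE_CEILING = Decimal("250")


def sign_off_reason(category: str, amount: Decimal) -> Optional[str]:
    """Return why a claim needs a sign-off, or None if it does not."""
    rule = RULES[category]
    if amount > FINANCE_CEILING:
        return "finance: over the $250 ceiling"
    if rule.manager_sign_off:
        return "manager sign-off required"
    if (rule.finance_sign_off_above is not None
            and amount > rule.finance_sign_off_above):
        return "finance sign-off required"
    return None
```

With the rules held as data, changing a limit means editing the doc, not the checker.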

The limits exist for a reason. The $40 meal cap is roughly a fair lunch and keeps the everyday claims flowing without a meeting. The software sign-off catches the seat that quietly renews at $400 a year. The $250 ceiling is the line above which a second set of eyes is always worth it. Different categories carry different risk; the limits reflect that.

Per-claimant overrides exist too. The policy doc can name a person or a role with a different cap — a field sales rep whose meals cap is $60 because they’re always on the road, say. The checker reads the override first and falls back to the category default. This is the right escape hatch for the people whose normal spend genuinely differs from the team’s.
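The override-then-default lookup could be sketched like this. The table shapes, the "person:" and "role:" key prefixes, and the effective_cap name are assumptions for illustration, and checking a named person before a role is also an assumption, since the post only says the override is read first.

```python
from decimal import Decimal

# Category defaults from the policy doc.
CATEGORY_CAPS = {"meals": Decimal("40"), "taxi": Decimal("30")}

# Hypothetical override table keyed by (who, category).
OVERRIDES = {("role:field-sales", "meals"): Decimal("60")}


def effective_cap(claimant_id: str, roles: list, category: str) -> Decimal:
    """Return the cap that applies to this claimant for this category."""
    # Assumed precedence: a named person first, then any of their roles.
    keys = [(f"person:{claimant_id}", category)]
    keys += [(f"role:{role}", category) for role in roles]
    for key in keys:
        if key in OVERRIDES:
            return OVERRIDES[key]
    # No override matched, so fall back to the category default.
    return CATEGORY_CAPS[category]
```

A field sales rep gets effective_cap("u1", ["field-sales"], "meals") of 60, while the same rep's taxi cap stays at the default 30.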

Four outcomes, always

Every claim lands in exactly one of four buckets. The names are simple on purpose.

Clear. In policy, comfortably under the limit, receipt present. The manager gets a one-tap confirm card — they can approve in a tap, or open it if they want a closer look. Most everyday claims are clears.
Confirm. In policy, but worth showing the detail — close to the cap, or a category that always wants a glance. Same one-tap card, but the amount and receipt are front and centre so the approver isn’t approving blind.
Review. Over a limit, or a category that requires a sign-off. The full approval card goes to the right person — often finance rather than the line manager — with the reason spelled out: “$60 over the meals cap” or “software, sign-off required.”
Reject candidate. Clearly outside policy — a banned category, or a receipt missing where one is required. The system proposes a reject with the reason and a draft note back to the claimant. A human still confirms it; the system never rejects a claim on its own.
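As a rough sketch, the four buckets map onto a short function. The order of the checks, the parameter names, and the 95% "close to the cap" threshold are all assumptions here; the post does not give the real threshold or spell out the five steps in Fig 3.

```python
from decimal import Decimal
from enum import Enum
from typing import Optional


class Outcome(Enum):
    CLEAR = "clear"
    CONFIRM = "confirm"
    REVIEW = "review"
    REJECT_CANDIDATE = "reject candidate"


# Assumed: "close to the cap" means at or above 95% of it.
NEAR_CAP_RATIO = Decimal("0.95")


def classify(amount: Decimal, cap: Optional[Decimal], *,
             receipt_present: bool, receipt_required: bool,
             banned: bool = False, needs_sign_off: bool = False,
             always_glance: bool = False) -> Outcome:
    """Pick exactly one outcome for a claim. cap=None means no cap."""
    # Clearly outside policy: propose a reject. A human still confirms it.
    if banned or (receipt_required and not receipt_present):
        return Outcome.REJECT_CANDIDATE
    # Over a limit, or a category that requires a sign-off.
    if needs_sign_off or (cap is not None and amount > cap):
        return Outcome.REVIEW
    # In policy but worth showing the detail.
    if always_glance or (cap is not None and amount >= cap * NEAR_CAP_RATIO):
        return Outcome.CONFIRM
    return Outcome.CLEAR
```

Every branch returns, so a claim can only ever land in one bucket. No branch sends anything on its own authority; the function only names the outcome.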

State that makes the decision deterministic

The checker reads one DynamoDB table as it works: ea-claims, which holds every claim and its running status. For the daily-total check it queries the claimant’s already-approved spend in the same category today, so three $15 lunches in one day are seen as $45 against a $40 cap, not three separate fine claims. With that one lookup, the decision logic is a few dozen lines of plain Python and zero magic. A given claim, with a given amount, category, and day’s history, always produces the same outcome, so re-running the checker never changes the result.
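The daily-total arithmetic can be sketched over rows that have already been fetched. This is an assumption-heavy illustration: the attribute names (claimant_id, category, day, status, amount) are hypothetical, and the actual ea-claims query is not shown. Amounts are Decimal because that is how boto3 returns DynamoDB numbers.

```python
from decimal import Decimal


def approved_total(rows, claimant_id, category, day):
    """Sum the claimant's already-approved spend in one category on one day."""
    return sum(
        (row["amount"] for row in rows
         if row["claimant_id"] == claimant_id
         and row["category"] == category
         and row["day"] == day
         and row["status"] == "approved"),
        Decimal("0"),
    )


def exceeds_daily_cap(rows, claim, cap):
    """True if the new claim pushes the day's running total over the cap."""
    prior = approved_total(rows, claim["claimant_id"],
                           claim["category"], claim["day"])
    return prior + claim["amount"] > cap


# Two $15 lunches already approved today; a third one arrives.
history = [
    {"claimant_id": "u1", "category": "meals", "day": "2024-05-01",
     "status": "approved", "amount": Decimal("15")},
    {"claimant_id": "u1", "category": "meals", "day": "2024-05-01",
     "status": "approved", "amount": Decimal("15")},
]
third_lunch = {"claimant_id": "u1", "category": "meals",
               "day": "2024-05-01", "amount": Decimal("15")}
```

Here exceeds_daily_cap(history, third_lunch, Decimal("40")) is True, because 15 + 15 + 15 is 45. Calling it again with the same inputs gives the same answer, which is the determinism the section describes.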

The category sort from Part 2 is the one place a model touches the claim, and that happens before the checker runs. By the time the checker compares an amount to a limit, the category is already settled and the math is pure arithmetic. Part 4 covers how the chosen outcome turns into a request to the right person.

Why the limit check uses no model

The checker could ask a model “does this seem reasonable?” It doesn’t. Two reasons. First, the policy check is the one part of the system that has to be utterly predictable — if the doc says meals cap at $40 and the claim is $36, it clears; if it’s $46, it goes to review. A model in that loop introduces variance nobody can reason about, and an expense decision the team can’t explain is a decision they won’t trust. Second, this runs on every single claim, and arithmetic is free while a model call is not.

Bedrock fires elsewhere — sorting the category in Part 2, and writing the monthly summary in Part 6. Not on the limit check. The checker is plain Python that reads a doc and writes an outcome.

Next post: how the chosen outcome finds the right approver, how quiet hours are honored, and what the approval card actually carries.
