How a churn score gets calculated

Key takeaways

The scorer runs once a week via EventBridge Scheduler at 8am local time.
Each signal becomes points using weights in the rules doc — order gap, pace drop, sour mood, login gap.
Every customer lands in exactly one of four bands on every run: steady, watch, at-risk, churning.
DynamoDB tracks last week’s score and any recent contact so the same name isn’t flagged twice in a row.
The scorer never calls a model. The math is fully deterministic and shown on every flag.

The scoring flow, per customer

Fig 3. The scorer’s flow, per customer, per weekly run. Five steps decide which of four bands applies. The rules doc holds every weight and cut-off; the scorer only adds them up.

Signals into points: the weights are in the doc

The rules doc has one short section per signal. Each names the rule in plain prose: “Order gap: zero points until they pass their usual order interval, then one point per extra day, up to forty. Order-pace drop: twenty points if their pace has halved versus their own history. Support mood: fifteen points for sour, five for flat, zero for happy. Login gap: ten points if no login in thirty days.” The points add to a single total on a hundred-point scale; with the example weights quoted here, the highest reachable total is 40 + 20 + 15 + 10 = 85. The cut-offs that separate the bands (steady, watch, at-risk, churning) are also in the doc, as plain numbers like “watch at 30, at-risk at 50, churning at 75.”
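As a rough illustration, the four quoted rules could be written as the following sketch. The `Signals` class, its field names, and the `score` function are hypothetical names chosen for this example, and two readings are assumptions: "halved" is taken as a pace ratio of 0.5 or lower, and "no login in thirty days" as thirty or more days since the last login.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """One customer's inputs for a run (field names are illustrative)."""
    days_since_last_order: int
    usual_order_interval_days: int
    pace_ratio: float           # recent order pace divided by their own historical pace
    support_mood: str           # "sour", "flat", or "happy"
    days_since_last_login: int


# Weights as quoted from the rules doc.
MOOD_POINTS = {"sour": 15, "flat": 5, "happy": 0}


def score(signals):
    """Return (total, reasons), where reasons maps each signal to its points."""
    extra_days = signals.days_since_last_order - signals.usual_order_interval_days
    reasons = {
        # Zero until the usual interval is passed, then one point per extra day, capped at 40.
        "order_gap": max(0, min(40, extra_days)),
        # Assumption: "halved" means the ratio is at or below 0.5.
        "pace_drop": 20 if signals.pace_ratio <= 0.5 else 0,
        "support_mood": MOOD_POINTS[signals.support_mood],
        # Assumption: "no login in thirty days" means 30 or more days.
        "login_gap": 10 if signals.days_since_last_login >= 30 else 0,
    }
    return sum(reasons.values()), reasons


total, reasons = score(Signals(24, 14, 0.4, "sour", 35))
print(total, reasons)
```

For a customer ten days past their usual interval, ordering at 40% of their own pace, with a sour last ticket and 35 days since login, the sums are 10 + 20 + 15 + 10 = 55. Returning the per-signal points alongside the total is what lets the reason be shown next to each flag.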

The weights exist for a reason, and they’re tuned to your business. A coffee roaster whose customers order weekly should weight the order gap heavily. A software tool whose customers pay monthly but use it daily should weight the login gap more. The doc is where you say which signals matter for your customers, in numbers anyone can read.
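To make the tuning idea concrete, here is a sketch with two weight profiles applied to the same customer. Every number in both profiles is invented for illustration, as are the names `ROASTER`, `SOFTWARE`, and `total_points`; it also assumes the doc's numbers have already been parsed into a dictionary, which the series does not specify.

```python
# Hypothetical weight profiles; none of these numbers come from the rules doc.
ROASTER = {
    "order_gap_cap": 50,                 # weekly orderers: the gap matters most
    "pace_drop": 20,
    "mood": {"sour": 15, "flat": 5},
    "login_gap": 0,
}
SOFTWARE = {
    "order_gap_cap": 15,
    "pace_drop": 15,
    "mood": {"sour": 15, "flat": 5},
    "login_gap": 40,                     # daily users: a silent login matters most
}


def total_points(weights, extra_days, pace_halved, mood, login_lapsed):
    """Add up points for one customer under a given weight profile."""
    points = min(max(extra_days, 0), weights["order_gap_cap"])
    points += weights["pace_drop"] if pace_halved else 0
    points += weights["mood"].get(mood, 0)   # moods without an entry score zero
    points += weights["login_gap"] if login_lapsed else 0
    return points


# Same customer, two businesses: 30 days late, flat mood, no recent login.
print(total_points(ROASTER, 30, False, "flat", True))
print(total_points(SOFTWARE, 30, False, "flat", True))
```

Under these made-up profiles the same signals total 35 for the roaster and 60 for the software tool, so the same customer could sit in a different band depending on which business is scoring them.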

Per-customer overrides exist too. The list sheet has an optional column called never_flag. Set it for a customer and the scorer still computes their score for the record, but never surfaces them — the right escape hatch for the marquee account whose owner insists on watching it personally.
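A minimal sketch of that override, assuming each list-sheet row arrives as a dictionary and that a cell such as "yes" or "TRUE" counts as set. The accepted values and the function names are assumptions for this example; only the never_flag column name comes from the sheet.

```python
# Assumption: which cell values count as "set" is not specified, so this list is a guess.
TRUTHY = {"true", "yes", "y", "1", "x"}


def is_never_flag(row):
    """True when the optional never_flag column is set for this customer."""
    return str(row.get("never_flag", "")).strip().lower() in TRUTHY


def split_for_surfacing(scored_rows):
    """Keep every score for the record; only unflagged rows may be surfaced."""
    recorded = list(scored_rows)
    surfaced = [row for row in recorded if not is_never_flag(row)]
    return recorded, surfaced


rows = [
    {"customer_id": "c-001", "score": 80, "never_flag": "yes"},
    {"customer_id": "c-002", "score": 62, "never_flag": ""},
    {"customer_id": "c-003", "score": 55},          # column left out entirely
]
recorded, surfaced = split_for_surfacing(rows)
print(len(recorded), [row["customer_id"] for row in surfaced])
```

The marquee account still gets a score in the record, so its history stays complete if the override is ever removed.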

Four bands, always

Every customer, every run, lands in exactly one of four bands. The names are simple on purpose.

Steady. The total is below the watch cut-off, or the customer was contacted within the pause window. Nothing to do. Most customers, most weeks, are steady.
Watch. The total crossed the watch cut-off but not the at-risk line. The score and reason are recorded to the cp-state DynamoDB table, but the customer is not put on this week’s list. Watch is the “keep an eye on this” band — it shows up in the monthly summary, not the weekly nudge.
At-risk. The total crossed the at-risk line. The customer is a candidate for this week’s list, ranked by score. The reason — the specific points that pushed them over — is recorded so the hand-off can show it.
Churning. The total is above the churning cut-off — already drifting hard. These go to the top of the weekly list. A churning customer with high monthly value is exactly who an owner wants to see first thing Monday.
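The band decision and the weekly ranking can be sketched in a few lines. This assumes the example cut-offs quoted from the doc (30, 50, 75) and reads "crossed" as "at or above", which the post does not spell out; `band_for` and `weekly_list` are illustrative names.

```python
# Example cut-offs from the rules doc, highest first.
CUTOFFS = (("churning", 75), ("at-risk", 50), ("watch", 30))


def band_for(total, contacted_in_pause_window=False):
    """Map a total to exactly one band; a recent contact forces steady."""
    if contacted_in_pause_window:
        return "steady"
    for name, cutoff in CUTOFFS:
        if total >= cutoff:      # assumption: reaching the cut-off counts as crossing it
            return name
    return "steady"


def weekly_list(scored):
    """scored: iterable of (customer_id, total, contacted_in_pause_window).

    Returns at-risk and churning customers, highest score first, so churning
    names land at the top. Ties break on customer_id to keep the order stable.
    """
    rows = [(cid, total, band_for(total, contacted)) for cid, total, contacted in scored]
    listed = [row for row in rows if row[2] in ("at-risk", "churning")]
    return sorted(listed, key=lambda row: (-row[1], row[0]))


print(weekly_list([("a", 55, False), ("b", 80, False), ("c", 35, False), ("d", 90, True)]))
```

In this run customer c is only on watch and customer d was contacted recently, so the list holds b (churning) followed by a (at-risk).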

State that makes the score fair

The scorer reads one DynamoDB table every run. cp-state records each customer’s latest run: (customer_id, score, band, reason, surfaced_date, last_contact). With that one table, the band decision and the “was this customer just contacted?” check are a few dozen lines of Python and zero magic. A given customer with given signals and a given contact history always produces the same band. And because the table remembers who was surfaced and contacted, the same name doesn’t crowd the list two weeks running while the owner is mid-conversation.
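A sketch of the record shape and the "just contacted?" check, using the six fields listed above. The field types, the 14-day pause length, and the names `CpState` and `in_pause_window` are assumptions for illustration; the DynamoDB read itself is left out so the logic can be run on its own.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

PAUSE_DAYS = 14  # hypothetical pause window; the real value would live in the rules doc


@dataclass(frozen=True)
class CpState:
    """One cp-state item: a customer's latest run (types are assumed)."""
    customer_id: str
    score: int
    band: str
    reason: str
    surfaced_date: Optional[date]
    last_contact: Optional[date]


def in_pause_window(state, run_date, pause_days=PAUSE_DAYS):
    """True when the customer was contacted within pause_days before run_date."""
    if state is None or state.last_contact is None:
        return False
    elapsed = run_date - state.last_contact
    return timedelta(0) <= elapsed < timedelta(days=pause_days)


state = CpState("c-002", 62, "at-risk", "order gap 22, pace drop 20, sour 15, flat login",
                surfaced_date=date(2024, 6, 3), last_contact=date(2024, 6, 5))
print(in_pause_window(state, date(2024, 6, 10)))   # five days after contact
print(in_pause_window(state, date(2024, 6, 24)))   # nineteen days after contact
```

Because the check depends only on the stored record and the run date, rerunning the scorer for the same week gives the same answer.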

When an owner records an outcome — reached out, won back, lost — that writes back to cp-state and resets the relevant fields. Part 5 covers the outcome flow in detail.

Why the score uses no model

The scorer could call a model to produce a cleverer risk number. It doesn’t. Two reasons. First, the score has to be something a human can check and argue with — if the doc says a sour ticket is worth fifteen points, then a sour ticket is worth fifteen points, and the owner can see exactly why a name is on the list. A model in that loop turns a number people trust into a number people ignore. Second, the whole list of customers gets scored every week, and most are steady, so a model call per customer would be money spent to reproduce arithmetic.

Bedrock fires elsewhere — on the support-mood lane in Part 2, where it reads a ticket’s mood into a single number, and on the weekly reason write-up and the monthly summary in Part 6. Not on the score. The scorer itself is plain Python that reads a doc and does sums you could redo on paper.

Next post: how the at-risk list reaches the right owner, how the weekly cap and quiet timing are honored, and what a plain reason looks like next to each name.
