How a DM gets labeled and deduped

Key takeaways

Every message gets a fingerprint from its sender, platform, and a hash of the text.
A short-lived DynamoDB table remembers recent fingerprints; a repeat is dropped and linked, not re-opened.
One Bedrock Haiku 4.5 call returns topic, urgency, language, and a one-line summary — nothing else.
The model is asked for labels only. It never drafts or sends a reply.
Labeled, deduped messages move on to routing; duplicates just attach to the existing thread.

The decision flow, per message

Fig 3. The labeler’s flow, per message off the queue. Fingerprint first, so a duplicate never opens a second thread; then one model call for labels; then sort by urgency and hand off to routing. The model describes the message — it never answers it.

Dedupe first, because a copy is worse than nothing

Duplicates are common, and they come from two everyday situations. The first is the platform re-sending the same webhook: if a connector is slow to reply for a moment, the platform assumes the call failed and tries again, so the exact same message arrives twice. The second is a customer who, getting no instant answer, messages you again on a different channel: the same person, the same question, on Instagram and on Facebook five minutes apart. Both should land as one thread, not two, because two threads mean two teammates might both reply, or each might assume the other will.

So before anything else, the labeler builds a fingerprint: a short code computed from the sender, the platform, and a hash of the message text. (A hash is just a fixed-length stamp of the text — same text in, same stamp out.) It checks that fingerprint against a small DynamoDB table that remembers recent fingerprints for a short window — long enough to catch re-sends and same-day cross-channel repeats, short enough that a genuinely new question next month is never mistaken for an old one. If the fingerprint is already there, the message is a duplicate: it’s attached to the existing thread as “also messaged on Facebook” and goes no further. If it’s new, the fingerprint is recorded and the message continues.
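The fingerprint-and-check step can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the names `fingerprint` and `RecentFingerprints`, the 24-hour window, and the lowercase-and-collapse-whitespace normalization are all assumptions, and an in-memory dictionary stands in for the DynamoDB table so the example runs on its own. In a real table, the check-and-record would more likely be a single conditional write (a put that only succeeds if the key does not already exist), with expiry handled by the table's TTL setting. Note also that a fingerprint that includes the platform only matches repeats on that same platform; matching the cross-channel case would need some platform-independent customer identity, which is not specified here and which this sketch does not attempt.

```python
import hashlib
import time
from typing import Dict, Optional

WINDOW_SECONDS = 24 * 60 * 60  # assumed value; the post only says "a short window"


def fingerprint(sender: str, platform: str, text: str) -> str:
    """Short code built from sender, platform, and a hash of the text."""
    # Normalizing case and whitespace is an assumption, not something the post states.
    normalized = " ".join(text.lower().split())
    text_hash = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    raw = f"{platform}|{sender}|{text_hash}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]


class RecentFingerprints:
    """In-memory stand-in for the short-lived table (fingerprint -> expiry time)."""

    def __init__(self, window_seconds: int = WINDOW_SECONDS) -> None:
        self.window = window_seconds
        self._expiry: Dict[str, float] = {}

    def seen_before(self, fp: str, now: Optional[float] = None) -> bool:
        """Return True for a duplicate; otherwise record fp and return False."""
        now = time.time() if now is None else now
        expires_at = self._expiry.get(fp)
        # Compare the expiry on read, so a stale entry never counts as a match.
        if expires_at is not None and expires_at > now:
            return True
        self._expiry[fp] = now + self.window
        return False
```

A caller would compute the fingerprint for each incoming message and, when `seen_before` returns True, attach the message to the existing thread instead of continuing to the labeling step.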

One model call, for labels only

A new message then gets exactly one Bedrock Haiku 4.5 call. The prompt is short and strict: “Read this message. Return JSON only with four fields — topic (one of the configured topics), urgency (urgent, normal, or low), language, and a one-line summary. Do not write a reply. Do not suggest a reply.” A few hundred input tokens, a few dozen out. That’s the entire AI footprint of the system on the hot path.
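One way to keep the "labels only" contract honest is to validate the model's output before anything downstream sees it. The sketch below is an assumption-heavy illustration: the topic list, the field name `summary`, and the helper names are hypothetical, and `call_model` is a placeholder for whatever function wraps the actual Bedrock client, injected so the example runs without any cloud dependency.

```python
import json
from typing import Callable, Dict

TOPICS = ("refund", "shipping", "other")  # hypothetical configured topics
URGENCIES = ("urgent", "normal", "low")
FIELDS = {"topic", "urgency", "language", "summary"}  # "summary" is an assumed name

PROMPT_TEMPLATE = (
    "Read this message. Return JSON only with four fields: "
    "topic (one of: {topics}), urgency (urgent, normal, or low), "
    "language, and a one-line summary. "
    "Do not write a reply. Do not suggest a reply.\n\n"
    "Message:\n{message}"
)


def build_prompt(message: str) -> str:
    return PROMPT_TEMPLATE.format(topics=", ".join(TOPICS), message=message)


def parse_labels(raw: str) -> Dict[str, str]:
    """Validate the model output; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model did not return JSON") from exc
    # Exactly the four label fields: an extra "reply" field is rejected too.
    if not isinstance(data, dict) or set(data) != FIELDS:
        raise ValueError("expected exactly the four label fields")
    if not all(isinstance(value, str) for value in data.values()):
        raise ValueError("every label must be a string")
    if data["topic"] not in TOPICS:
        raise ValueError("unknown topic")
    if data["urgency"] not in URGENCIES:
        raise ValueError("unknown urgency")
    return data


def label_message(message: str, call_model: Callable[[str], str]) -> Dict[str, str]:
    """Exactly one model call per message, followed by strict validation."""
    return parse_labels(call_model(build_prompt(message)))
```

Rejecting any output with extra fields means that even if the model did volunteer a reply, it would be discarded rather than passed along.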

The labels do real work downstream. topic drives routing — a refund question goes to billing, a shipping question goes to ops. urgency drives the sort order, so the message with a deadline rises above the one that just says “thanks.” language lets the inbox flag a Spanish message for a teammate who handles Spanish, or show a translation hint. The one-line summary is what a teammate sees in the queue list before opening the thread, so they can pick the right one to work first.
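As a rough illustration of how two of those labels might be consumed, the sketch below maps a topic to a team and sorts a queue by urgency. The topic-to-team table, the `received_at` field, the fallback team, and the oldest-first tie-break are all assumptions made for the example.

```python
from typing import Dict, List

URGENCY_RANK = {"urgent": 0, "normal": 1, "low": 2}
TOPIC_TO_TEAM = {"refund": "billing", "shipping": "ops"}  # hypothetical mapping


def team_for(labels: Dict[str, str], default: str = "general") -> str:
    """Pick the team for a message from its topic label."""
    return TOPIC_TO_TEAM.get(labels["topic"], default)


def sort_queue(messages: List[dict]) -> List[dict]:
    """Most urgent first; within the same urgency, oldest first (an assumed tie-break)."""
    return sorted(
        messages,
        key=lambda m: (URGENCY_RANK[m["urgency"]], m["received_at"]),
    )
```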

Why the model only labels — and never answers

This is the line the system never crosses. The model is allowed to describe a message; it is never allowed to respond to one. There are two reasons, and they reinforce each other. The first is trust: a customer-facing reply in your brand’s voice is something a person should own, word for word. A wrong auto-reply — a refund promised that policy doesn’t allow, a tone that misreads an upset customer — is far more damaging than a message that waited ten extra minutes for a human. The second is cost and simplicity: labeling is a tiny, predictable call, while drafting and checking replies would be a much larger, riskier loop. Keeping the model on labels makes the whole system cheap, fast, and easy to reason about.

What if a label is wrong

Labels are a starting point, not a verdict. The shared inbox shows the topic and urgency as editable tags. If the model labeled a message “normal” and a teammate can see it is clearly urgent, they bump it with one click, and the thread re-sorts and can re-route. Those corrections are logged. Over time they’re a useful signal about which topics the labeling tends to misjudge, which is the kind of thing you tune in the prompt or the topic list, not by adding more model calls. The point is that a human is always in a position to overrule the machine, on every message.
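Reading that signal out of the correction log could be as simple as a tally. The sketch below assumes a hypothetical log format in which each entry records the topic the model assigned, the field that was edited, and the old and new values; none of these names come from the system itself.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def misjudged_topics(corrections: Iterable[dict]) -> List[Tuple[str, int]]:
    """Count real corrections per originally assigned topic, most corrected first."""
    counts = Counter(
        entry["topic"]
        for entry in corrections
        if entry["old"] != entry["new"]  # ignore edits that changed nothing
    )
    return counts.most_common()
```

A topic that keeps rising to the top of this list is a candidate for a clearer description in the prompt or a split in the topic list.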

Next post: how a labeled, deduped message finds the right teammate — routing by topic, working hours, fair load-sharing, and a backup for when the right person is off.
