A social inbox unifier on AWS for a few dollars a month
A small business gets messaged in more places than anyone can keep open at once. The Instagram DM asking if you ship overseas. The Facebook message about a refund. The WhatsApp from a regular who wants to book again. The comment that turned into a private message. Five apps, five sets of notifications, and a real chance that the message that mattered most got buried under the one that didn’t. This post walks through the design of a small system that pulls all of it into one shared queue, labels each message so the urgent ones rise to the top, drops duplicates, and hands each one to the right teammate — while a human still writes every reply.
Key takeaways
- Every platform’s DMs flow into one shared queue through small per-platform connectors.
- One cheap model call labels each message by topic, urgency, and language — nothing more.
- Duplicates are dropped by fingerprint, so a re-sent webhook never opens a second thread.
- Each message is routed to the right teammate by topic, working hours, and fair load-sharing.
- A human writes every reply. The system gathers, labels, and routes — it never answers.
- Designed on AWS for about $3/month at typical small-business volume.
The whole system on one page
Before any code, here’s the shape of what we’re designing.
What you set up once (the outside)
- Platforms. Connect each channel you use — an Instagram account, a Facebook Page, a WhatsApp business number — so each one sends new messages to your system. Most of these platforms do this the same way: you give them a web address (a “webhook” — just a URL they call whenever something new happens), and from then on every incoming DM gets pushed to that address within seconds. You connect each platform once and forget it. Adding a new channel later is one more connector, covered in Part 2.
- Rules and routing. A small set of config the team can edit without a deploy. It holds the routing rules (“refund questions go to billing; shipping questions go to ops”), the team roster with each person’s topics and working hours, the urgency targets (how fast each level should be answered), and the message templates a teammate can pick from to start a reply. None of these write the reply for you — they just save a person from typing the same opening line for the fiftieth time.
- Team. The people who answer. Each teammate has a seat in the shared inbox and a short profile: which topics they handle, their working hours, and a backup teammate for when they’re off. Threads land in their queue already labeled and sorted, so they open the most urgent one first instead of scrolling five apps to find it.
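As a rough illustration of the "rules and routing" config described above, here is a minimal sketch in Python. The post does not specify a format, so every name here (`Teammate`, `ROUTING_RULES`, `URGENCY_TARGET_MINUTES`, `TEMPLATES`, `config_problems`), the teammate "Alex", and all the values are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Teammate:
    name: str
    team: str                 # which queue this person works, e.g. "ops"
    topics: tuple[str, ...]   # topics this person handles
    shift_start: time         # working hours (assumed local time)
    shift_end: time
    backup: str               # who covers when this person is off


# Topic -> team, e.g. "refund questions go to billing".
ROUTING_RULES = {"refund": "billing", "shipping": "ops"}

# How fast each urgency level should be answered (made-up targets).
URGENCY_TARGET_MINUTES = {"urgent": 30, "normal": 240, "low": 1440}

# Opening lines a teammate can pick from; a person still writes the reply.
TEMPLATES = {"shipping_update": "Hi! Thanks for reaching out about your order."}

ROSTER = [
    Teammate("Sam", "ops", ("shipping",), time(9, 0), time(17, 0), backup="Alex"),
    Teammate("Alex", "billing", ("refund",), time(9, 0), time(17, 0), backup="Sam"),
]


def config_problems() -> list[str]:
    """Return human-readable problems so a bad edit is caught early."""
    problems = []
    names = {t.name for t in ROSTER}
    for t in ROSTER:
        if t.backup not in names:
            problems.append(f"{t.name}: unknown backup {t.backup!r}")
    for topic, team in ROUTING_RULES.items():
        # Every rule should point at a team with someone who handles the topic.
        if not any(t.team == team and topic in t.topics for t in ROSTER):
            problems.append(f"nobody on {team!r} handles {topic!r}")
    return problems
```

Because the team edits this config without a deploy, a small check like `config_problems()` is one way to catch a typo in a name or topic before it silently misroutes messages.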
What runs on every message (the inside)
- The connectors. One small piece of code per platform. When a platform calls the webhook, the connector first checks the call is genuine (each platform signs its calls; the connector verifies the signature so a stranger can’t inject fake messages). Then it turns that platform’s particular format into one common shape — sender, platform, text, timestamp, conversation id — and drops it on a single work queue. After that point, the rest of the system doesn’t care which app the message came from.
- The labeler. Pulls each message off the queue. First it builds a fingerprint and checks whether this exact message already arrived; if so, it drops the duplicate and links it to the existing thread. If it’s new, one call to Claude Haiku 4.5 on Amazon Bedrock reads the text and returns a small label set: topic, urgency (urgent, normal, or low), language, and a one-line summary. That’s the only model call in the whole system, and it never writes a reply; it only describes the message so the routing and sorting can be smart.
- The shared inbox. Takes the labeled message and decides whose queue it belongs in, then shows it as one clean thread. A teammate opens the thread, reads the full back-and-forth, and types the reply. On send, the reply goes back out through the same platform the customer used. The thread can be reassigned or closed. A daily digest summarizes what came in, what’s still open, and anything that sat too long.
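The connector step described above (verify the signature, then convert to one common shape) can be sketched as follows. This is an illustration under stated assumptions, not any platform's actual contract: the `sha256=<hex>` header format and the payload keys (`from`, `body`, `sent_at`, `thread`) are hypothetical, and each platform's webhook documentation defines the real scheme. Only the five fields of the common shape come from the post.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    """The common shape every connector produces."""
    sender: str
    platform: str
    text: str
    timestamp: int
    conversation_id: str


def signature_is_valid(secret: bytes, raw_body: bytes, header: str) -> bool:
    """Check an HMAC-SHA256 signature header of the assumed form 'sha256=<hex>'."""
    digest = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode("utf-8")
    # compare_digest runs in constant time, so the comparison leaks nothing.
    return hmac.compare_digest(expected, header.encode("utf-8"))


def normalize_example(raw_body: bytes) -> Message:
    """Map a made-up payload layout onto the common shape."""
    data = json.loads(raw_body)
    return Message(
        sender=data["from"],
        platform="example",
        text=data["body"],
        timestamp=int(data["sent_at"]),
        conversation_id=data["thread"],
    )


def handle_webhook(secret: bytes, raw_body: bytes, header: str) -> Optional[Message]:
    """Return a normalized message, or None if the call is not genuine."""
    if not signature_is_valid(secret, raw_body, header):
        return None
    # In the real system, this is where the message would go onto the work queue.
    return normalize_example(raw_body)
```

Note that the signature is checked against the raw request bytes, before any JSON parsing, since re-serializing the payload could change the bytes and break the comparison.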
In plain words
A customer DMs your Instagram at 9:14am: “Hi, my order #4821 still hasn’t shipped and I leave for a trip Friday — can you check?” The connector verifies the call and drops it on the queue. The labeler reads it: topic shipping, urgency urgent (there’s a deadline), language English. The routing sends it to Sam in ops, who’s on shift, with the thread already open and the order number highlighted. Sam checks the system, sees the order shipped overnight, and types a reply with the tracking link. It goes back out as an Instagram DM under your brand. The same customer also messaged your Facebook Page with the same question ten minutes later — the fingerprint check spots the duplicate and links it to Sam’s thread instead of opening a second one, so nobody answers the same person twice.
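The duplicate check in that walkthrough can be sketched as below. The post does not say what goes into the fingerprint, so this sketch assumes a hash over a customer key plus the message text with case and whitespace normalized. It also assumes some earlier step has already matched the Instagram and Facebook senders to the same customer (the hypothetical `customer_key`), and it uses an in-memory dict in place of a real store.

```python
import hashlib


def fingerprint(customer_key: str, text: str) -> str:
    """Hash the customer plus the text, ignoring case and extra whitespace."""
    normalized = " ".join(text.lower().split())
    payload = f"{customer_key}\n{normalized}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ThreadIndex:
    """Maps fingerprints to thread ids (in-memory stand-in for a database)."""

    def __init__(self) -> None:
        self._threads: dict[str, str] = {}
        self._next_id = 1

    def thread_for(self, customer_key: str, text: str) -> tuple[str, bool]:
        """Return (thread_id, is_duplicate) for an incoming message."""
        fp = fingerprint(customer_key, text)
        if fp in self._threads:
            # Seen before: link to the existing thread instead of opening one.
            return self._threads[fp], True
        thread_id = f"thread-{self._next_id}"
        self._next_id += 1
        self._threads[fp] = thread_id
        return thread_id, False
```

A real version would likely also expire fingerprints after some time window, so that a customer who sends the same short message weeks later (for example "hi") still gets a fresh thread.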
The cost of running this is about $3 a month at typical small-business volume. The cost of not running it is the DM that sat unread for two days because it landed in the one app nobody had open, and the customer who quietly went somewhere else.
Design rules that shaped every decision
- A human writes every reply. The system gathers, labels, and routes — it never auto-answers.
- One common message shape. Past the connector, no piece of the system cares which app it came from.
- One cheap model call per message, for labels only. Routing and dedupe are plain code.
- Duplicates are dropped by fingerprint, so one conversation is only ever one thread.
- The rules live in config. Changing routing, hours, or a template doesn’t need a deploy.
- Every action is logged. You can see who answered what, and when, months later.
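One way to enforce the "labels only" rule in code is to accept nothing from the model except the fixed label set. The sketch below assumes the model is asked to return a JSON object with exactly the four fields named earlier (topic, urgency, language, summary). The JSON format, the `Labels` class, and `parse_labels` are assumptions for illustration, and the Bedrock call itself is left out.

```python
import json
from dataclasses import dataclass

URGENCIES = {"urgent", "normal", "low"}
FIELDS = {"topic", "urgency", "language", "summary"}


@dataclass(frozen=True)
class Labels:
    topic: str
    urgency: str
    language: str
    summary: str


def parse_labels(model_output: str) -> Labels:
    """Parse the model's JSON output, rejecting anything outside the label set."""
    data = json.loads(model_output)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, dict) or set(data) != FIELDS:
        raise ValueError("expected exactly the fields: " + ", ".join(sorted(FIELDS)))
    if data["urgency"] not in URGENCIES:
        raise ValueError(f"unknown urgency: {data['urgency']!r}")
    # Collapse any line breaks so the summary stays a single line.
    summary = " ".join(str(data["summary"]).split())
    return Labels(str(data["topic"]), data["urgency"], str(data["language"]), summary)
```

Because extra fields are rejected rather than ignored, a response that tries to include a drafted reply fails validation instead of reaching the inbox.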
Why this shape
Most small teams handle social messages one of three ways: each person watches one or two apps, somebody checks all the apps a few times a day, or a shared phone gets passed around. The first way means messages get missed whenever the watcher is busy. The second is slow by design — an urgent question waits hours for the next sweep. The third is a coordination mess: two people answer the same DM, or each assumes the other did, and nobody does.
The setup above leaves the customer-facing channels exactly where they are (people keep messaging your Instagram and your Facebook like always) but adds a small system that watches every channel at once and puts one queue in front of the team. Each message arrives labeled, so the urgent one rises to the top. Each is routed, so the right person sees it without anyone triaging by hand. Duplicates collapse into one thread. And critically, the system stops there: it never writes or sends a reply. A person reads, decides, and answers in their own words. The machine just makes sure the message reached them.
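Routing by topic, working hours, and fair load-sharing is plain code, and a minimal sketch might look like this. The post does not define "fair", so picking the on-shift teammate with the fewest open threads is an assumption made here, as are the names `Teammate`, `on_shift`, and `route` and the teammate "Alex".

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class Teammate:
    name: str
    topics: set[str]
    shift_start: time
    shift_end: time
    open_threads: int = 0


def on_shift(teammate: Teammate, now: time) -> bool:
    # Assumes a shift starts and ends on the same day (no overnight shifts).
    return teammate.shift_start <= now < teammate.shift_end


def route(topic: str, now: time, roster: list[Teammate]) -> Optional[Teammate]:
    """Pick the on-shift teammate for this topic with the fewest open threads."""
    candidates = [t for t in roster if topic in t.topics and on_shift(t, now)]
    if not candidates:
        return None  # caller decides: hold in a shared queue, or use a backup
    # Ties break by name so the choice is deterministic.
    chosen = min(candidates, key=lambda t: (t.open_threads, t.name))
    chosen.open_threads += 1
    return chosen
```

With Sam and a second teammate both covering shipping, a message arriving at 9:14am goes to whichever of them currently has fewer open threads, and a message arriving after hours returns `None` so it can wait in a shared queue rather than land with someone who is off.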
The next four posts walk through each piece in turn: how a DM reaches the unified inbox, how it gets labeled and deduped, how it finds the right teammate, and how it gets answered by a human. One diagram per post. A cost breakdown and a final engineering reference at the end.