How replies work without making things up
The bot answers from the client’s docs only — or escalates to a human. Citation required, no exceptions.
AI reply bots have a bad reputation, and they’ve earned it. They quote prices that don’t exist, promise features that aren’t real, give medical or financial advice they shouldn’t. The fix isn’t to make the model bigger — it’s to take away its freedom to make things up.
Why webhook then queue
Facebook’s webhook expects a quick “got it” within a couple of seconds, or it’ll keep retrying the same message. So the receiver does the bare minimum — checks the message is real, drops it into a queue, and answers Facebook right away. The slow work (looking things up, calling the AI) happens after, off the critical path. If the AI happens to be slow that day, Facebook still gets its “got it” on time. And if a message fails partway through, it lands in a separate “something went wrong” queue you can look at later, instead of vanishing.
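The receiver's job can be sketched in a few lines. This is a minimal illustration, not the actual handler: the `APP_SECRET` value, the `enqueue` callable, and the handler shape are all assumptions, though the HMAC check mirrors how Facebook's `X-Hub-Signature-256` header actually works.

```python
import hashlib
import hmac
import json

APP_SECRET = b"app-secret"  # hypothetical; loaded from config in practice


def verify_signature(raw_body: bytes, signature_header: str) -> bool:
    """Check Facebook's X-Hub-Signature-256 header against the raw body."""
    expected = "sha256=" + hmac.new(APP_SECRET, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)


def handle_webhook(raw_body: bytes, signature_header: str, enqueue) -> dict:
    """Do the bare minimum: verify, enqueue, answer right away.

    The slow work (retrieval, the model call) happens later in a worker
    that reads from the queue, off the critical path.
    """
    if not verify_signature(raw_body, signature_header):
        return {"statusCode": 403}
    enqueue(json.loads(raw_body))   # hand off to the queue (e.g. SQS)
    return {"statusCode": 200}      # Facebook gets its "got it" on time
```

Nothing past the enqueue call runs before the 200 goes back, so a slow model can never cause Facebook to retry the message.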
Why a confidence gate
The vector search returns the top relevant chunks ranked by similarity. If the top match is weak — below 0.6, say — that’s the system telling you “I don’t actually know this.” Forcing the model to answer anyway is exactly how hallucinations happen.
So the confidence gate is brutal: low score, the message lands in your inbox as an alert and gets logged in an “unanswered” list. That list becomes the next batch of FAQ entries the client should add to their Drive doc. The bot improves over time without anyone training a model.
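The gate itself is a few lines of branching. A minimal sketch, assuming the vector search hands back `(score, chunk)` pairs sorted best-first; the `gate` function name, the list-based "unanswered" log, and the exact 0.6 cutoff are illustrative.

```python
THRESHOLD = 0.6  # below this, the system admits it doesn't know

UNANSWERED: list[str] = []  # becomes the next batch of FAQ entries


def gate(question: str, chunks: list[tuple[float, str]]):
    """Decide whether retrieval is strong enough to answer at all.

    chunks: (similarity_score, chunk_text) pairs, best match first.
    Returns ("answer", chunks) when confident, ("escalate", []) otherwise.
    """
    if not chunks or chunks[0][0] < THRESHOLD:
        UNANSWERED.append(question)  # log it; an alert goes to the inbox
        return ("escalate", [])
    return ("answer", chunks)
```

The point of returning an explicit `("escalate", [])` rather than answering with weak context is exactly the one in the text: forcing the model to answer from a poor match is how hallucinations happen.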
Why a citation guardrail
Even when given the right context, models sometimes freestyle anyway. So the instructions to the model are blunt:
- Answer using ONLY the provided context.
- Return both the reply text and the IDs of the chunks you actually used.
The Reply Lambda checks the response. No cited chunk = no send. The reply is logged but never reaches Facebook. This is the structural guarantee that the bot can’t invent things — if the model didn’t ground its answer in actual content from the client’s docs, the reply doesn’t go out.
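The check in the Reply Lambda reduces to a set comparison. A sketch under assumptions: the `safe_to_send` name and the `cited_chunk_ids` response field are hypothetical, but the rule is the one above, stated as code.

```python
def safe_to_send(model_response: dict, provided_chunk_ids: set[str]) -> bool:
    """No cited chunk = no send.

    The reply only goes out if the model cited at least one chunk,
    and every cited ID is one of the chunks it was actually given.
    """
    cited = set(model_response.get("cited_chunk_ids", []))
    return bool(cited) and cited <= provided_chunk_ids
```

An empty citation list fails, and so does a citation of a chunk ID that was never in the provided context; either way the reply is logged but never reaches Facebook.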
Same denylist as posts
The reply pipeline reuses the page’s denylist from the posting pipeline. A reply on the SFC page can’t mention “XAUUSD,” even by accident. A reply on DailyScalper can’t talk about theology. The same per-page rules govern both directions of the conversation.
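Reusing the denylist is a lookup plus a word-boundary scan. Illustrative only: the `PAGE_DENYLIST` table, page keys, and terms below are stand-ins for whatever the posting pipeline already stores per page.

```python
import re

PAGE_DENYLIST = {  # hypothetical; shared with the posting pipeline
    "sfc": ["xauusd", "forex"],
    "dailyscalper": ["theology", "scripture"],
}


def violates_denylist(page_id: str, reply: str) -> bool:
    """True if the reply mentions any term denied for this page."""
    text = reply.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text)
        for term in PAGE_DENYLIST.get(page_id, [])
    )
```

Because both pipelines read the same table, a term denied for posts is automatically denied for replies; there is no second list to drift out of sync.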
Start in draft mode
For every new client, replies start in draft mode by default: instead of going to Facebook, they go to your inbox. You review them in batches for a week or two. Once you trust the system on a category — say, pricing questions — you flip that category to auto-send. Broader categories like general FAQ come later.
The cost is a slower rollout. The benefit is that no theological or financial mishap reaches a real user before you’ve seen what the bot would have said.
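The draft-mode switch is per-category routing with a safe default. A minimal sketch, assuming replies are already classified into categories like "pricing"; the `AUTO_SEND` set and `route` function are hypothetical names.

```python
AUTO_SEND: set[str] = {"pricing"}  # categories you've flipped to auto-send


def route(category: str, reply: str) -> tuple[str, str]:
    """Default to draft: unknown or untrusted categories go to the inbox.

    Only categories explicitly added to AUTO_SEND reach Facebook directly.
    """
    if category in AUTO_SEND:
        return ("send", reply)
    return ("draft", reply)
```

The default-to-draft direction matters: a new category you forgot to configure lands in your inbox, not in front of a real user.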