Part 3 of 7 · Review responder series

How the responder reads a review

A review is anywhere from two words to a paragraph of free text, plus a star rating that may or may not match it. Before the responder can pick a move, it has to read the review three different ways: the score, the themes, and the specifics. Three small extractors run in parallel, each returning a confidence score the move-picker reads before it decides anything.

Key takeaways

  • Three extractors run in parallel against every review: rating with sentiment cross-check, themes, and specifics.
  • The rating extractor flags mismatches — a 5-star with strongly negative text, or a 1-star whose body is a typo of “love it”.
  • The themes extractor tags against a small business-specific list (food, service, cleanliness, value, etc.) plus an open-form bucket for anything novel.
  • The specifics extractor pulls names, dishes or products, dates, dollar amounts, order numbers, and which branch.
  • Each extractor returns a 0–1 confidence score; high confidence on all three is the gate to auto-reply, and anything borderline becomes a reason to draft.

Three extractors in parallel

[Figure: the normalized review object from intake (source, review ID, platform score, text, author, timestamp) fans out to three parallel extractors: rating with sentiment cross-check, themes against the business-specific list plus an open-form bucket, and specifics (names, dishes or products, dates and times, dollar amounts, order numbers, branch). Each returns its piece with a 0–1 confidence; the three results merge into one structured review and flow to the move-picker with confidences attached. High confidence on all three means safe to auto; anything borderline becomes a reason to draft instead.]
Fig 3. Three extractors run in parallel against the same review. Each one returns its piece with a confidence score; the move-picker reads all three before it decides anything.
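If it helps to see the fan-out as code, here's a minimal sketch. It assumes each extractor is an ordinary function that takes the normalized review and returns a (payload, confidence) pair; the names and field layout are illustrative, not the responder's actual API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical shape: an extractor takes the normalized review and
# returns (payload, confidence). The three real extractors are
# sketched in the sections below.
Extractor = Callable[[dict], tuple[dict, float]]

def read_review(review: dict, extractors: dict[str, Extractor]) -> dict:
    """Fan the review out to every extractor in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(extractors)) as pool:
        futures = {name: pool.submit(fn, review) for name, fn in extractors.items()}
        results = {name: f.result() for name, f in futures.items()}
    # Keep the confidences separate from the payloads so the
    # move-picker can read them without digging.
    return {
        "review": review,
        "fields": {name: payload for name, (payload, _) in results.items()},
        "confidences": {name: conf for name, (_, conf) in results.items()},
    }
```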

Why sentiment, not just stars

The platform-given score is a useful signal but not a complete one. The clearest counterexample is a 5-star review whose body says “food was cold, server was rude, will not be back — the only good thing was the door wasn’t locked.” That happens. So does the inverse: a 1-star review whose body reads “Five stars, sorry hit the wrong button.” Auto-replying to either based on the score alone is a small disaster — a chirpy “thanks for the kind words!” under a complaint, or a confused apology under a happy customer.

The cross-check is cheap. Read the score, run the text through a small sentiment classifier, and if they don’t agree, raise a mismatch flag. The flag doesn’t decide anything by itself; it’s a signal the move-picker uses to push the review out of the auto-reply lane. A draft for your eyes is the right move when the rating and the text disagree, no matter which direction the disagreement runs.
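In code, the cross-check is a few lines. This sketch assumes the classifier is passed in as a function that returns a sentiment label and its confidence; the exact score bands that count as a mismatch are a policy choice, not gospel.

```python
def rating_with_sentiment(review: dict, classify) -> tuple[dict, float]:
    """classify(text) -> (sentiment, confidence), where sentiment is
    'positive', 'mixed', or 'negative'. Any small classifier works."""
    score = review["score"]  # platform stars, 1-5
    sentiment, conf = classify(review["text"])
    # Flag only the clear disagreements; 3-star reviews and 'mixed'
    # sentiment never mismatch under these bands.
    mismatch = (
        (score >= 4 and sentiment == "negative")
        or (score <= 2 and sentiment == "positive")
    )
    # The flag decides nothing here; the move-picker reads it and
    # pushes flagged reviews out of the auto-reply lane.
    return {"score": score, "sentiment": sentiment, "mismatch": mismatch}, conf
```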

Themes that come back as tags

Themes are the categories a review touches: for a restaurant, that’s typically food quality, service speed, value, cleanliness, atmosphere, and staff friendliness. The list lives in the same Drive folder as the voice file — small, business-specific, and editable without a deploy. The extractor multi-tags: a single review about “great food, but we waited an hour” lights up food-quality (positive) and service-speed (negative), and the composer can address both in the reply.

An open-form bucket catches anything the pre-defined list doesn’t cover — “the parking situation”, “the new menu format”, “your dog-friendly patio”. Open-form themes don’t drive auto-reply (the responder won’t commit to addressing something it doesn’t have a policy for), but they do feed the themes log. After a quarter, the open-form bucket tells you what your customers are talking about that you haven’t named yet.
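The tagging itself is a model call, so the sketch below only pins down the part worth pinning down: the split between the known list and the open-form bucket. The theme names mirror the restaurant example above; in practice the known set is loaded from the file in Drive, not hard-coded.

```python
# Illustrative known-theme list; the real one lives in the editable
# Drive file next to the voice file.
KNOWN_THEMES = {
    "food_quality", "service_speed", "value",
    "cleanliness", "atmosphere", "staff_friendliness",
}

def split_theme_tags(raw_tags: dict[str, str], known: set[str] = KNOWN_THEMES) -> dict:
    """raw_tags maps theme -> polarity ('+' or '-') as the tagger
    returned them. Known themes keep their polarity; anything novel
    goes to the open-form bucket, which feeds the themes log but
    never drives auto-reply."""
    themes = {t: p for t, p in raw_tags.items() if t in known}
    open_form = sorted(t for t in raw_tags if t not in known)
    return {"themes": themes, "open_form": open_form}
```

So "great food, but we waited an hour, and the parking was a nightmare" comes back as `{"themes": {"food_quality": "+", "service_speed": "-"}, "open_form": ["parking"]}`: two tags the composer can address, one entry for the log.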

Specifics — names, dishes, dates, branches

The specifics extractor pulls the named bits a real human reply would acknowledge: a staff member’s first name (“Sarah was wonderful”, “Alex didn’t apologize”), a dish or product (“the soup”, “the new mug we ordered”), a number like an order number or a dollar figure (“they wouldn’t honour the $50 promo”), a date or time of visit, and — for a multi-location business — which branch the review was about.

These have two distinct uses. First, the composer reads them, so a draft can say “thanks for mentioning Sarah” instead of “thanks for mentioning our staff.” Second, the move-picker uses them as routing signals: a review that names a current staff member by first name is a draft (so a human reads it before it auto-posts under the business’s account), and a review that mentions a specific dollar amount or order number is also a draft (because compose-with-numbers is exactly where the AI most wants to make up the wrong figure).

Names are filtered against a roster — a small list of current staff first names in the policies file. “Sarah” mentioned but not on the roster is treated as a regular noun; “Alex” on the roster is flagged. The roster lives in Drive next to the voice file; updating it when someone joins or leaves is one edit, no deploy.
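Put together, the routing side of the specifics extractor is small. This sketch assumes illustrative field names in the specifics payload (`names`, `dollar_amounts`, `order_numbers`) and takes the roster as the set of first names from the policies file.

```python
def specifics_flags(specifics: dict, roster: set[str]) -> list[str]:
    """Flags the move-picker reads; any flag forces a draft."""
    flags = []
    # A name only counts if it matches the roster; "Sarah" off the
    # roster is treated as a regular noun.
    roster_cf = {r.casefold() for r in roster}
    if any(n.casefold() in roster_cf for n in specifics.get("names", [])):
        flags.append("names_staff_member")
    # Numbers are exactly where a composed reply most wants to invent
    # the wrong figure, so they force a draft too.
    if specifics.get("dollar_amounts") or specifics.get("order_numbers"):
        flags.append("mentions_numbers")
    return flags
```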

Confidence is the gate

Each extractor returns a confidence score between zero and one. The move-picker won’t auto-reply unless all three are above a configured threshold — a single low score is enough to push the review into the draft lane.

That’s the entire role of confidence in this system: a vote on whether the responder is sure enough to act without you reading first. It is not a measure of correctness, and the responder doesn’t pretend it is. It’s a humility threshold, deliberately tuned so the cost of being too cautious (you glance at one more draft) is much lower than the cost of being too confident (a tone-deaf reply posted publicly under your business’s name).
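The gate itself is deliberately boring. A sketch, with an illustrative threshold (the real one is configured, and tuning it is the point of the paragraph above):

```python
def clears_auto_gate(confidences: dict[str, float], threshold: float = 0.85) -> bool:
    """True only if every extractor cleared the bar; one low score
    is enough to push the review into the draft lane."""
    return all(c >= threshold for c in confidences.values())

# With the numbers from Fig 3 (rating 0.91, themes 0.86,
# specifics 0.78), the specifics score misses a 0.85 bar,
# so the review becomes a draft.
```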

What this hands to the next post

The move-picker now has a structured review with a score, a sentiment, a list of themes, a small bag of specifics, and a confidence on each piece. The next post is what it does with that — how it picks one of four moves, and why the boundaries between them are deliberately conservative.
