Part 3 of 7 · Website chat assistant series · ~5 min read

How the assistant answers

A visitor types a question. The assistant has a few seconds to do the right thing — and the “right thing” is rarely just “reply with words.” Sometimes it’s a clarifying question. Sometimes it’s a quiet handoff. Sometimes it’s “I don’t know — and I won’t guess.” Here’s how that decision gets made on every turn, and how a citation makes the difference between a real answer and a confident hallucination.

Four tools, one pick per turn

Every visitor message ends in one of four outcomes. The assistant picks exactly one. There is no “maybe both” or “answer and also escalate” — ambiguity in the routing is what makes assistants feel chaotic.

A visitor turn flows through search and into one of four tools. Top: a single box, "Visitor turn (with last few turns from scratchpad)." An arrow flows down to a "Search your knowledge" box, which fans out to "Top passages with confidence scores." That feeds into a central decision diamond, "AI picks one tool." Four arrows come out of the diamond, each leading to a tool box arranged across the bottom. Tool one: "Answer" — labelled "high confidence, citation present." Tool two: "Clarify" — labelled "ambiguous question, ask one short follow-up." Tool three: "Hand off" — labelled "out of remit or low confidence on a high-stakes topic." Tool four: "Decline" — labelled "off-topic, unsafe, or hostile." Below all four tools, a single rule banner: "No citation, no auto-answer. Confidence below threshold routes to clarify or hand off, never to answer."
Fig 3. One tool per turn. The decision is the first thing the AI does — the words come second.

Search before generation

The first thing that happens on every turn is search, not generation. The current message and the last few turns from the scratchpad are turned into a query, which runs against your knowledge folder — the help docs, FAQ, policies, and pricing pages you maintain in Drive. The result is a small list of passages that look most relevant to what the visitor just asked, each with a confidence score.

This order is deliberate. If the AI generated first and searched second, it would “know” things from training that aren’t in your docs — old prices, generic shipping windows, made-up return policies. By making the AI read from your passages first, the only ground truth it has is the ground truth you wrote.

The search itself is small. It runs against a managed knowledge base that watches your Drive folder; an edit to a help doc shows up in search results within minutes, no deploy. You don’t maintain an index, you don’t run a vector store, you don’t schedule re-indexing — the managed piece does that. What you do is keep your help docs accurate, the way you would for new staff.
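In code, the retrieval step is small too. Here's a minimal sketch of it — the query built from the current message plus recent scratchpad turns, then scored against passages. Everything here is illustrative: `Passage`, `build_query`, and `search` are hypothetical names, the in-memory `PASSAGES` list stands in for the managed knowledge base, and the token-overlap scoring is a crude stand-in for whatever the managed service actually does.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    source: str
    text: str
    score: float = 0.0

# Tiny in-memory stand-in for the managed knowledge base.
# In production this is a call to the managed search service
# watching your Drive folder — you never maintain this list by hand.
PASSAGES = [
    Passage("shipping policy", "Standard shipping takes 3-5 business days."),
    Passage("returns policy", "Items can be returned within 30 days of delivery."),
]

def build_query(message: str, scratchpad: list[str], max_turns: int = 3) -> str:
    """Fold the last few turns into the query so follow-ups keep context."""
    return " ".join(scratchpad[-max_turns:] + [message])

def search(query: str, k: int = 3) -> list[Passage]:
    """Crude token-overlap scoring; the real service scores properly."""
    q_tokens = set(query.lower().split())
    for p in PASSAGES:
        p_tokens = set(p.text.lower().split())
        p.score = len(q_tokens & p_tokens) / max(len(q_tokens), 1)
    return sorted(PASSAGES, key=lambda p: p.score, reverse=True)[:k]
```

The shape is what matters: the turn becomes a query, the query becomes scored passages, and those scores are what the routing step reads next.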

The four tools

Answer is the happy path. The retrieved passages cover the visitor’s question with high confidence. The AI is asked to write a short reply that cites the source — one or two sentences from the passage, stitched into a sentence that reads naturally. The reply streams back over the websocket so words appear within a second. A small “from: shipping policy” tag goes underneath, so the visitor can click through if they want the full version.

Clarify handles the case where the question is real but ambiguous. “Do you have any in stock?” without a product. “What time?” without a service. The AI is allowed to ask exactly one short follow-up — not three, not a form. After the visitor replies, the next turn flows through the same four tools again.

Hand off covers everything that’s on-topic but beyond what the docs can answer confidently. Refund disputes, custom orders, account-specific questions, anything where being wrong is more expensive than being slow. The AI tells the visitor a human will pick this up shortly; the next post is about how that handoff actually lands.

Decline is for the rest. Off-topic small talk, attempts to make the assistant write code or essays, hostile or unsafe asks. The reply is brief and polite (“I’m here to help with questions about [your business] — want me to grab a human if you have a different question?”) and the turn is logged for review.

Citation as a hard gate

The single most important rule, the one that separates a useful chat assistant from a confident liar, is this: no citation, no answer. If the AI cannot point to a passage from your docs that supports what it’s about to say, the “answer” tool is not allowed. The route falls through to clarify (if the question is genuine but the docs are sparse) or hand off (if the docs simply don’t cover this topic).

This is enforced at two levels. The AI is instructed to only emit the answer tool when it has identified a supporting passage; and the runtime checks that the cited passage exists in the retrieved set before sending the reply. If the AI tries to cite a passage it didn’t actually retrieve, the system catches it and routes to hand off instead. Belt and braces — because “please don’t make things up” is good guidance, but unenforced guidance fails when it matters.
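The runtime half of that belt-and-braces check is a few lines. A minimal sketch, assuming a hypothetical reply shape of `{"tool": ..., "citation": ..., "text": ...}` — the point is only that the cited source must be a member of this turn's retrieved set, or the answer is rerouted before it reaches the visitor.

```python
def enforce_citation_gate(reply: dict, retrieved_sources: set[str]) -> dict:
    """Runtime check after the model responds: an answer citing a source
    outside this turn's retrieved set is rerouted to hand-off, not sent."""
    if reply.get("tool") != "answer":
        return reply  # only answers need a citation
    citation = reply.get("citation")
    if not citation or citation not in retrieved_sources:
        # The model cited nothing, or cited a passage it never retrieved.
        return {"tool": "hand_off", "reason": "uncited or fabricated citation"}
    return reply
```

The instruction to the model handles the common case; this check handles the case where the instruction fails — which is exactly when it matters.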

Confidence and routing

Each retrieved passage comes with a score. Above a threshold, the answer tool is on the table. Below it, only clarify, hand off, and decline are. This means the assistant is structurally biased toward asking or escalating when the docs are weak — which is exactly the bias you want. A visitor who gets a clarifying question feels heard. A visitor who gets a wrong answer feels lied to.

Tuning that threshold is a slider, not a science. Set it conservative on day one (more handoffs, fewer answers) and loosen it as your gaps log fills up — the next post in this series talks about both, and the post after that is about turning weekly gap reviews into better answers.
