Part 3 of 7 · Website chat assistant series · ~5 min read

How the assistant answers

A visitor types a question. The assistant has a few seconds to do the right thing — and the “right thing” is rarely just “reply with words.” Sometimes it’s a clarifying question. Sometimes it’s a quiet handoff. Sometimes it’s “I don’t know — and I won’t guess.” Here’s how that decision gets made on every turn, and how a citation makes the difference between a real answer and a confident hallucination.

Four tools, one pick per turn

Every visitor message ends in one of four outcomes. The assistant picks exactly one. There is no “maybe both” or “answer and also escalate” — ambiguity in the routing is what makes assistants feel chaotic.

A visitor turn flows through search and into one of four tools. Top: a single box, "Visitor turn (with last few turns from scratchpad)." An arrow flows down to a "Search your knowledge" box, which fans out to "Top passages with confidence scores." That feeds into a central decision diamond, "AI picks one tool." Four arrows come out of the diamond, each leading to a tool box arranged across the bottom. Tool one: "Answer" — labelled "high confidence, citation present." Tool two: "Clarify" — labelled "ambiguous question, ask one short follow-up." Tool three: "Hand off" — labelled "out of remit or low confidence on a high-stakes topic." Tool four: "Decline" — labelled "off-topic, unsafe, or hostile." Below all four tools, a single rule banner: "No citation, no auto-answer. Confidence below threshold routes to clarify or hand off, never to answer."
Fig 3. One tool per turn. The decision is the first thing the AI does — the words come second.

Search before generation

The first thing that happens on every turn is search, not generation. The current message and the last few turns from the scratchpad are turned into a query, which runs against your knowledge folder — the help docs, FAQ, policies, and pricing pages you maintain in Drive. The result is a small list of passages that look most relevant to what the visitor just asked, each with a confidence score.

This order is deliberate. If the AI generated first and searched second, it would “know” things from training that aren’t in your docs — old prices, generic shipping windows, made-up return policies. By making the AI read from your passages first, the only ground truth it has is the ground truth you wrote.

The search itself is small. It runs against a managed knowledge base that watches your Drive folder; an edit to a help doc shows up in search results within minutes, no deploy. You don’t maintain an index, you don’t run a vector store, you don’t schedule re-indexing — the managed piece does that. What you do is keep your help docs accurate, the way you would for new staff.
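In code, the retrieval step is small too. Here's a minimal sketch of it — the query built from the current message plus recent scratchpad turns, then scored against passages. Everything here is illustrative: `Passage`, `build_query`, and `search` are hypothetical names, the in-memory `PASSAGES` list stands in for the managed knowledge base, and the token-overlap scoring is a crude stand-in for whatever the managed service actually does.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    source: str
    text: str
    score: float = 0.0

# Tiny in-memory stand-in for the managed knowledge base.
# In production this is a call to the managed search service
# watching your Drive folder — you never maintain this list by hand.
PASSAGES = [
    Passage("shipping policy", "Standard shipping takes 3-5 business days."),
    Passage("returns policy", "Items can be returned within 30 days of delivery."),
]

def build_query(message: str, scratchpad: list[str], max_turns: int = 3) -> str:
    """Fold the last few turns into the query so follow-ups keep context."""
    return " ".join(scratchpad[-max_turns:] + [message])

def search(query: str, k: int = 3) -> list[Passage]:
    """Crude token-overlap scoring; the real service scores properly."""
    q_tokens = set(query.lower().split())
    for p in PASSAGES:
        p_tokens = set(p.text.lower().split())
        p.score = len(q_tokens & p_tokens) / max(len(q_tokens), 1)
    return sorted(PASSAGES, key=lambda p: p.score, reverse=True)[:k]
```

The shape is what matters: the turn becomes a query, the query becomes scored passages, and those scores are what the routing step reads next.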

The four tools

Answer is the happy path. The retrieved passages cover the visitor’s question with high confidence. The AI is asked to write a short reply that cites the source — one or two sentences from the passage, stitched into a sentence that reads naturally. The reply streams back over the websocket so words appear within a second. A small “from: shipping policy” tag goes underneath, so the visitor can click through if they want the full version.

Clarify handles the case where the question is real but ambiguous. “Do you have any in stock?” without a product. “What time?” without a service. The AI is allowed to ask exactly one short follow-up — not three, not a form. After the visitor replies, the next turn flows through the same four tools again.

Hand off covers everything that’s on-topic but beyond what the docs can answer confidently. Refund disputes, custom orders, account-specific questions, anything where being wrong is more expensive than being slow. The AI tells the visitor a human will pick this up shortly; the next post is about how that handoff actually lands.

Decline is for the rest. Off-topic small talk, attempts to make the assistant write code or essays, hostile or unsafe asks. The reply is brief and polite (“I’m here to help with questions about [your business] — want me to grab a human if you have a different question?”) and the turn is logged for review.

Citation as a hard gate

The single most important rule, the one that separates a useful chat assistant from a confident liar, is this: no citation, no answer. If the AI cannot point to a passage from your docs that supports what it’s about to say, the “answer” tool is not allowed. The route falls through to clarify (if the question is genuine but the docs are sparse) or hand off (if the docs simply don’t cover this topic).

This is enforced at two levels. The AI is instructed to only emit the answer tool when it has identified a supporting passage; and the runtime checks that the cited passage exists in the retrieved set before sending the reply. If the AI tries to cite a passage it didn’t actually retrieve, the system catches it and routes to hand off instead. Belt and braces — because “please don’t make things up” is good guidance, but unenforced guidance fails when it matters.
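The runtime half of that belt-and-braces check is a few lines. A minimal sketch, assuming a hypothetical reply shape of `{"tool": ..., "citation": ..., "text": ...}` — the point is only that the cited source must be a member of this turn's retrieved set, or the answer is rerouted before it reaches the visitor.

```python
def enforce_citation_gate(reply: dict, retrieved_sources: set[str]) -> dict:
    """Runtime check after the model responds: an answer citing a source
    outside this turn's retrieved set is rerouted to hand-off, not sent."""
    if reply.get("tool") != "answer":
        return reply  # only answers need a citation
    citation = reply.get("citation")
    if not citation or citation not in retrieved_sources:
        # The model cited nothing, or cited a passage it never retrieved.
        return {"tool": "hand_off", "reason": "uncited or fabricated citation"}
    return reply
```

The instruction to the model handles the common case; this check handles the case where the instruction fails — which is exactly when it matters.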

Confidence and routing

Each retrieved passage comes with a score. Above a threshold, the answer tool is on the table. Below it, only clarify, hand off, and decline are. This means the assistant is structurally biased toward asking or escalating when the docs are weak — which is exactly the bias you want. A visitor who gets a clarifying question feels heard. A visitor who gets a wrong answer feels lied to.

Tuning that threshold is a slider, not a science. Set it conservative on day one (more handoffs, fewer answers) and loosen it as your gaps log fills up — the next post in this series talks about both, and the post after that is about turning weekly gap reviews into better answers.
