How the assistant answers
A visitor types a question. The assistant has a few seconds to do the right thing — and the “right thing” is rarely just “reply with words.” Sometimes it’s a clarifying question. Sometimes it’s a quiet handoff. Sometimes it’s “I don’t know — and I won’t guess.” Here’s how that decision gets made on every turn, and how a citation makes the difference between a real answer and a confident hallucination.
Key takeaways
- Search runs first, generation second. The current turn plus the scratchpad become a query against your Drive-backed knowledge base; only the retrieved passages are in scope for the reply.
- Strict tool_use, four tools per turn: answer (citation required), clarify (one short follow-up), hand_off (out of remit), decline (off-topic or unsafe).
- Citation is enforced twice: prompt forbids the answer tool without a supporting passage, and the runtime verifies the cited id was in the retrieved set before flushing.
- Below the confidence line, only clarify, hand off, and decline are allowed — the assistant leans toward asking or escalating when docs are thin, never inventing.
- If the model emits an answer with a citation outside the retrieved set, the runtime downgrades to hand_off, the safer-by-default failure mode.
Four tools, one pick per turn
Every visitor message ends in one of four outcomes. The assistant picks exactly one. There is no “maybe both” or “answer and also escalate” — ambiguity in the routing is what makes assistants feel chaotic.
Search before generation
The first thing that happens on every turn is search, not generation. The current message and the last few turns from the scratchpad get turned into a query. The query goes against your knowledge folder — the help docs, FAQ, policies, and pricing pages you maintain in Drive. The result is a small list of passages that look most relevant to what the visitor just asked, each with a confidence score.
This order is deliberate. If the AI wrote first and searched second, it would lean on things from its training that aren’t true about your business — old prices, generic shipping times, made-up return rules. By making the AI read your passages first, the only facts it has are the facts you wrote.
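To make the search-first order concrete, here is a minimal sketch of the step before generation. It assumes the scratchpad is a list of earlier turns as strings and that the managed knowledge base hands back scored passages. The `Passage` fields, `build_query`, `top_passages`, and the default limits are illustrative names and values, not part of the actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    passage_id: str  # hypothetical identifier for a retrieved chunk
    source: str      # doc title, e.g. "shipping policy"
    text: str
    score: float     # retrieval confidence (assumed: higher is better)


def build_query(current_message: str, scratchpad: list[str], max_turns: int = 3) -> str:
    """Combine the last few scratchpad turns with the current message."""
    recent = scratchpad[-max_turns:] if max_turns > 0 else []
    return "\n".join([*recent, current_message])


def top_passages(passages: list[Passage], limit: int = 4) -> list[Passage]:
    """Keep the highest-scoring passages; only these are in scope for the reply."""
    return sorted(passages, key=lambda p: p.score, reverse=True)[:limit]
```

The search call itself is left out on purpose, since the managed knowledge base handles it. The point of the sketch is the ordering: the query is built and the passages are selected before any reply text exists.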
The search itself is small. It runs against a managed knowledge base that follows your Drive folder; an edit to a help doc shows up in search results within minutes, no deploy. You don’t maintain an index, you don’t run a search database, you don’t schedule re-indexing — the managed piece handles all of that. What you do is keep your help docs accurate, the way you would for a new hire.
The four tools
Answer is the happy path. The retrieved passages cover the visitor’s question with high confidence. The AI is asked to write a short reply that cites the source: one or two sentences drawn from the passage and stitched together so they read naturally. The reply streams back over the websocket so words appear within a second. A small “from: shipping policy” tag goes underneath, so the visitor can click through if they want the full version.
Clarify handles the case where the question is real but ambiguous. “Do you have any in stock?” without a product. “What time?” without a service. The AI is allowed to ask exactly one short follow-up — not three, not a form. After the visitor replies, the next turn flows through the same four tools again.
Hand off covers everything that’s on-topic but beyond what the docs can answer confidently. Refund disputes, custom orders, account-specific questions, anything where being wrong is more expensive than being slow. The AI tells the visitor a human will pick this up shortly; the next post is about how that handoff actually lands.
Decline is for the rest. Off-topic small talk, attempts to make the assistant write code or essays, hostile or unsafe asks. The reply is brief and polite (“I’m here to help with questions about [your business] — want me to grab a human if you have a different question?”) and the turn is logged for review.
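The four tools can be sketched as plain tool definitions. The `name` / `description` / `input_schema` layout below is a common JSON Schema shape for tool definitions, used here as an assumption. The field names (`reply`, `citation_id`, `question`, `reason`) and the `validate_tool_call` helper are hypothetical; the real schemas may look different.

```python
def _tool(name: str, description: str, fields: list[str]) -> dict:
    """Build one tool definition whose fields are all required strings."""
    return {
        "name": name,
        "description": description,
        "input_schema": {
            "type": "object",
            "properties": {field: {"type": "string"} for field in fields},
            "required": list(fields),
        },
    }


# One entry per outcome; the model must pick exactly one per turn.
TOOLS = [
    _tool("answer", "Reply from a retrieved passage. Citation required.",
          ["reply", "citation_id"]),
    _tool("clarify", "Ask exactly one short follow-up question.", ["question"]),
    _tool("hand_off", "On-topic but out of remit; route to a human.", ["reason"]),
    _tool("decline", "Off-topic or unsafe; reply briefly and politely.", ["reply"]),
]

TOOLS_BY_NAME = {tool["name"]: tool for tool in TOOLS}


def validate_tool_call(name: str, args: dict) -> bool:
    """True when the call names a known tool and fills every required field."""
    tool = TOOLS_BY_NAME.get(name)
    if tool is None:
        return False
    required = tool["input_schema"]["required"]
    return all(
        isinstance(args.get(field), str) and bool(args[field].strip())
        for field in required
    )
```

Making `citation_id` a required field of `answer` means a reply without a citation is malformed at the schema level, before any runtime check looks at which passage was cited.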
Citation as a hard gate
The single most important rule — the one that separates a useful chat assistant from a confident liar — is this: no citation, no answer. If the AI can’t point to a passage from your docs that backs up what it’s about to say, the “answer” tool is off the table. The route falls through to clarify (if the question is real but your docs are thin) or hand off (if your docs don’t cover the topic at all).
The rule is enforced in two places. First, the AI is told to only use the answer tool when it has a supporting passage. Second, the system double-checks that the cited passage really was in the search results before sending the reply. If the AI tries to cite something it didn’t actually retrieve, the system catches it and switches to hand off instead. Belt and braces — because “please don’t make things up” is fine advice, but advice that isn’t enforced fails when it matters.
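The second check, the one done by the system rather than the prompt, might look something like the sketch below. It assumes a tool call arrives as a dict with `name` and `input` keys and that the cited passage is carried in a `citation_id` field. Those names, and the `reason` string, are illustrative assumptions.

```python
def enforce_citation(tool_call: dict, retrieved_ids: set[str]) -> dict:
    """Let an answer through only if its citation was actually retrieved.

    Any other tool passes unchanged. An answer with a missing or unknown
    citation is downgraded to hand_off instead of being sent.
    """
    if tool_call.get("name") != "answer":
        return tool_call

    cited = tool_call.get("input", {}).get("citation_id")
    if cited in retrieved_ids:
        return tool_call

    # Assumed shape for the downgraded call; the reason text is illustrative.
    return {"name": "hand_off", "input": {"reason": "citation not in retrieved set"}}


retrieved = {"shipping-policy#2", "returns#1"}
good = {"name": "answer", "input": {"reply": "We ship in 3 days.", "citation_id": "shipping-policy#2"}}
bad = {"name": "answer", "input": {"reply": "We ship in 1 day.", "citation_id": "made-up#9"}}

print(enforce_citation(good, retrieved)["name"])  # answer
print(enforce_citation(bad, retrieved)["name"])   # hand_off
```

The check runs before the reply is flushed to the visitor, so a fabricated citation never reaches the screen.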
Confidence and routing
Each retrieved passage comes with a score. Above a chosen line, the answer tool is allowed. Below it, only clarify, hand off, and decline are. This means the assistant leans toward asking or handing off when the docs are thin — which is exactly the lean you want. A visitor who gets a clarifying question feels heard. A visitor who gets a wrong answer feels lied to.
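As a rough sketch, the confidence line can be a single comparison that decides which tools are offered on a turn. Two assumptions here are not stated above: the best passage score is what gets compared, and a score exactly on the line counts as passing. The threshold value and the `allowed_tools` name are placeholders.

```python
ANSWER_THRESHOLD = 0.75  # placeholder value; the real line is tuned per deployment


def allowed_tools(scores: list[float], threshold: float = ANSWER_THRESHOLD) -> list[str]:
    """Return the tools the model may pick from on this turn.

    The answer tool is offered only when the best retrieved passage
    reaches the threshold (assumption: the top score is what matters).
    """
    fallback = ["clarify", "hand_off", "decline"]
    if scores and max(scores) >= threshold:
        return ["answer", *fallback]
    return fallback


print(allowed_tools([0.91, 0.40]))  # ['answer', 'clarify', 'hand_off', 'decline']
print(allowed_tools([0.52, 0.31]))  # ['clarify', 'hand_off', 'decline']
print(allowed_tools([]))            # ['clarify', 'hand_off', 'decline']
```

Raising `threshold` makes the assistant more cautious (more clarifying questions and handoffs); lowering it lets more answers through.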
Picking that line is a slider, not a science. Set it cautious on day one (more handoffs, fewer answers) and loosen it as your gaps log fills up — the next post is about handoffs, and the one after that is about turning weekly gap reviews into better answers.