How the FAQ answer gets drafted

Key takeaways

The drafter retrieves the help-doc passages closest to the question before any model runs.
The model answers using only those passages — it is told not to use outside knowledge.
Every claim must cite the source passage it came from; the citation rides along with the answer.
If nothing grounds the answer, the drafter returns “not covered” instead of guessing.
The answer is formatted to the voice doc — short, plain, in your house style — then proposed.

Four gates on every draft

Fig 4. Four gates between a candidate cluster and the proposed entry. Retrieve the passages. Draft using only them. Check the citation or refuse. Format to your voice. Then write the proposal to the review queue — or, if nothing grounded it, ask a human to write the answer.

Gate 1: retrieve the passages

Before the model sees anything, the drafter goes and finds the parts of your help docs that bear on the question. Your help docs were chunked into short passages and embedded once, up front, and they live in S3 Vectors alongside the question vectors. The drafter embeds the cluster’s representative question and queries for the closest passages — usually the top three to five. Each comes back with its text, the doc it’s from, and the section heading, so a citation can point at exactly where the answer came from.

This is the step that makes the whole thing grounded. The model never gets to roam your entire site or its own training; it gets a handful of passages from your docs and is asked to work within them. If your docs don’t cover the question, the retrieval comes back thin — and that’s a signal, not a problem to paper over.

Gate 2: draft using only those passages

Now the model runs. The drafter calls Claude Haiku 4.5 with the question and the retrieved passages, and a system prompt that’s blunt about the rules: “Answer the question using only the passages below. Do not use any outside knowledge. Keep it to two or three sentences. For each claim, name the passage it came from. If the passages don’t answer the question, reply exactly ‘NOT COVERED’.” Haiku is the cheap path and it’s plenty for short, grounded answers; the heavier model isn’t needed here because the thinking is “restate what the passage says, plainly,” not open-ended reasoning.

Keeping the answer short isn’t just style. A short answer is easier for a reviewer to check against the source, and a FAQ entry that runs three paragraphs usually means the question should have been split into two.

Gate 3: cite the source, or refuse

The model can still go off-script — claim something the passages don’t support, or cite a passage that doesn’t really say what the answer says. Gate 3 checks. It confirms the citation points at a passage that was actually retrieved, and that the cited passage plausibly supports the claim. If the model returned “NOT COVERED,” or the citation doesn’t map to a real passage, the cluster doesn’t become a proposal at all. Instead it goes to a human-write queue: “people keep asking this and the docs don’t answer it — someone should write the answer (and probably update the docs).”

That refusal path is the most important part of the whole system. A FAQ that quietly invents an answer for a gap in your docs is worse than no FAQ, because customers trust it. Sending the gap to a human keeps the published FAQ honest and surfaces the holes in your help docs as a useful byproduct.

Gate 4: format to your voice, then propose

A grounded, cited answer still has to sound like you. Gate 4 applies the voice doc: the tone (“friendly and direct”), the length cap, the formatting (a lead sentence then a short explanation), and any banned phrases (“simply,” “just,” anything that talks down to the reader). It then assembles the proposed entry — the question as a customer would phrase it, the answer, and a link to the source passage — and writes it to the review queue and the fb-proposals table. The source link stays attached all the way through, so the reviewer in Part 5 can click straight to the doc the answer came from.

For a cluster the grouper flagged as a refresh (an existing entry that keeps getting asked), the drafter does the same work but presents it as a diff: here’s the live answer, here’s the redrafted one, here’s what changed and why. The reviewer decides whether the update is an improvement.

Why grounding is the whole game

None of these gates are exotic. They’re the discipline a careful person would apply if they were writing the FAQ by hand: look up what your docs actually say, write only what you can back up, cite where it came from, and flag the questions your docs don’t answer instead of bluffing. Putting that discipline in code — retrieve, draft-from-passages, cite-or-refuse, format — makes it a property of the system, not something you’re hoping a model remembers to do on a given Tuesday.

Next post: how a proposed entry gets approved — the review queue, the three actions a reviewer can take, and how an approved entry reaches the live FAQ doc.

All posts