Part 4 of 7 · FAQ builder series ~5 min read

How the FAQ answer gets drafted

The grouper handed over a candidate cluster — a question people keep asking, not yet answered. Now the drafter has to write a clear answer. The temptation is to just ask a model the question and publish whatever comes back. That’s how a FAQ ends up confidently wrong. Instead, the drafter pulls the relevant passages out of your own help docs, asks the model to answer using only those passages, makes it cite the source, and accepts “not covered” as a valid answer. Four small gates sit between the cluster and the proposed entry.

Key takeaways

  • The drafter retrieves the help-doc passages closest to the question before any model runs.
  • The model answers using only those passages — it is told not to use outside knowledge.
  • Every claim must cite the source passage it came from; the citation rides along with the answer.
  • If nothing grounds the answer, the drafter returns “not covered” instead of guessing.
  • The answer is formatted to the voice doc — short, plain, in your house style — then proposed.

Four gates on every draft

[Figure: horizontal flow diagram. A candidate cluster (representative question plus the number of times it was asked) passes through four gates. Gate 1, Retrieve passages: query S3 Vectors over the help-doc chunks and keep the top few with their source doc and section. Gate 2, Draft grounded: Claude Haiku 4.5 answers from those passages only, with no outside knowledge, and keeps it short. Gate 3, Cite or refuse: if the claim does not map to a real passage, the cluster goes to a human writer. Gate 4, Format to voice: apply tone and length from the voice doc, then attach the question, answer, and source link. Output: a proposed entry written to the review queue and the fb-proposals table. A draft with no source never becomes a proposal; it becomes a request for a human to write one.]
Fig 4. Four gates between a candidate cluster and the proposed entry. Retrieve the passages. Draft using only them. Check the citation or refuse. Format to your voice. Then write the proposal to the review queue — or, if nothing grounded it, ask a human to write the answer.

Gate 1: retrieve the passages

Before the model sees anything, the drafter goes and finds the parts of your help docs that bear on the question. Your help docs were chunked into short passages and embedded once, up front, and they live in S3 Vectors alongside the question vectors. The drafter embeds the cluster’s representative question and queries for the closest passages — usually the top three to five. Each comes back with its text, the doc it’s from, and the section heading, so a citation can point at exactly where the answer came from.

This is the step that makes the whole thing grounded. The model never gets to roam your entire site or its own training; it gets a handful of passages from your docs and is asked to work within them. If your docs don’t cover the question, the retrieval comes back thin — and that’s a signal, not a problem to paper over.
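To make the retrieval step concrete, here is a minimal sketch of "embed the question, return the closest passages with their source." It swaps in an in-memory cosine search for the S3 Vectors query so it runs on its own. The Passage type, the toy three-number vectors, and the min_score cutoff are all assumptions for illustration, not part of the drafter described here.

```python
import math
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    doc: str              # source help doc
    section: str          # section heading inside that doc
    vector: list[float]   # embedding computed up front (toy values here)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(question_vec: list[float], passages: list[Passage],
             top_k: int = 5, min_score: float = 0.3) -> list[tuple[float, Passage]]:
    """Return up to top_k (score, passage) pairs, best first.

    min_score is a hypothetical cutoff: passages below it are dropped,
    so a question the docs do not cover comes back thin or empty.
    """
    scored = [(cosine(question_vec, p.vector), p) for p in passages]
    scored = [sp for sp in scored if sp[0] >= min_score]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return scored[:top_k]


docs = [
    Passage("Refunds are issued within 5 business days.",
            "billing.md", "Refunds", [0.9, 0.1, 0.0]),
    Passage("Reset your password from the login page.",
            "account.md", "Passwords", [0.0, 0.2, 0.9]),
]

hits = retrieve([1.0, 0.0, 0.0], docs)
for score, p in hits:
    print(f"{score:.2f}  {p.doc} > {p.section}")
```

An empty or very short result is the "thin retrieval" signal: the caller can treat it as a sign that the docs do not cover the question rather than passing weak passages on to the model.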

Gate 2: draft using only those passages

Now the model runs. The drafter calls Claude Haiku 4.5 with the question and the retrieved passages, and a system prompt that’s blunt about the rules: “Answer the question using only the passages below. Do not use any outside knowledge. Keep it to two or three sentences. For each claim, name the passage it came from. If the passages don’t answer the question, reply exactly ‘NOT COVERED’.” Haiku is the cheap path and it’s plenty for short, grounded answers; the heavier model isn’t needed here because the thinking is “restate what the passage says, plainly,” not open-ended reasoning.
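One way the drafting request could be assembled is sketched below. The system prompt is the one quoted above; everything else is an assumption: the build_draft_request helper, the dictionary shape of a passage, and the [P1], [P2] labels, which are a hypothetical convention that gives the model something short to cite. The sketch stops at building the request and does not call any model API.

```python
SYSTEM_PROMPT = (
    "Answer the question using only the passages below. "
    "Do not use any outside knowledge. "
    "Keep it to two or three sentences. "
    "For each claim, name the passage it came from. "
    "If the passages don't answer the question, reply exactly 'NOT COVERED'."
)


def build_draft_request(question: str, passages: list[dict]) -> dict:
    """Assemble the system prompt and user message for the drafting call.

    Each passage is a dict with 'text', 'doc', and 'section' keys
    (an assumed shape, matching what retrieval is said to return).
    """
    blocks = []
    for i, p in enumerate(passages, start=1):
        # [P1], [P2], ... are assumed labels so citations can be parsed later.
        blocks.append(f"[P{i}] ({p['doc']} > {p['section']})\n{p['text']}")
    user = "Passages:\n\n" + "\n\n".join(blocks) + f"\n\nQuestion: {question}"
    return {"system": SYSTEM_PROMPT, "user": user}


req = build_draft_request(
    "How long do refunds take?",
    [{"text": "Refunds are issued within 5 business days.",
      "doc": "billing.md", "section": "Refunds"}],
)
print(req["user"])
```

Keeping the passages and the question in one message, with the rules in the system prompt, means the only material the model is shown is what retrieval returned.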

Keeping the answer short isn’t just style. A short answer is easier for a reviewer to check against the source, and a FAQ entry that runs three paragraphs usually means the question should have been split into two.

Gate 3: cite the source, or refuse

The model can still go off-script: claim something the passages don’t support, or cite a passage that doesn’t really say what the answer says. Gate 3 checks. It confirms the citation points at a passage that was actually retrieved, and that the cited passage plausibly supports the claim. If the model returned “NOT COVERED,” the citation doesn’t map to a real passage, or the cited passage doesn’t support the claim, the cluster doesn’t become a proposal at all. Instead it goes to a human-write queue: “people keep asking this and the docs don’t answer it; someone should write the answer (and probably update the docs).”
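A rough sketch of the cite-or-refuse decision follows. It assumes the draft cites passages with labels like [P1], which is a hypothetical convention. It also judges "plausibly supports" by simple word overlap, because the post does not say how that check is done. The overlap test, the stopword list, and the min_overlap threshold are stand-ins, and the sketch compares the whole draft to its cited passages rather than checking claim by claim.

```python
import re

CITATION = re.compile(r"\[P(\d+)\]")
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "in",
             "and", "your", "you", "for", "from", "it"}


def content_words(text: str) -> set[str]:
    """Lowercased words and numbers, minus a few common stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}


def check_draft(draft: str, passages: list[str],
                min_overlap: float = 0.5) -> str:
    """Return 'propose' or 'human_write' for one drafted answer."""
    if draft.strip().upper() == "NOT COVERED":
        return "human_write"
    cited = [int(n) for n in CITATION.findall(draft)]
    if not cited:
        return "human_write"  # no citation at all
    if any(n < 1 or n > len(passages) for n in cited):
        return "human_write"  # citation does not map to a retrieved passage
    # Crude support check (an assumption): what share of the draft's
    # content words also appear in the passages it cites?
    claim = content_words(CITATION.sub("", draft))
    source = set().union(*(content_words(passages[n - 1]) for n in cited))
    overlap = len(claim & source) / len(claim) if claim else 0.0
    return "propose" if overlap >= min_overlap else "human_write"


retrieved = ["Refunds are issued within 5 business days."]
print(check_draft("Refunds are issued within 5 business days [P1].", retrieved))
print(check_draft("NOT COVERED", retrieved))
print(check_draft("Refunds take 5 days [P3].", retrieved))
```

Every failure mode returns the same 'human_write' outcome, so a missing citation, a made-up citation, and an unsupported claim all end up in the human-write queue rather than the review queue.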

That refusal path is the most important part of the whole system. A FAQ that quietly invents an answer for a gap in your docs is worse than no FAQ, because customers trust it. Sending the gap to a human keeps the published FAQ honest and surfaces the holes in your help docs as a useful byproduct.

Gate 4: format to your voice, then propose

A grounded, cited answer still has to sound like you. Gate 4 applies the voice doc: the tone (“friendly and direct”), the length cap, the formatting (a lead sentence then a short explanation), and any banned phrases (“simply,” “just,” anything that talks down to the reader). It then assembles the proposed entry — the question as a customer would phrase it, the answer, and a link to the source passage — and writes it to the review queue and the fb-proposals table. The source link stays attached all the way through, so the reviewer in Part 5 can click straight to the doc the answer came from.
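As an illustration of the mechanical part of this gate, the sketch below checks a draft against a length cap and a banned-phrase list, then assembles the entry. It only flags problems. How the real gate rewrites a draft to match tone is not described here, so that part is left out. VoiceDoc, ProposedEntry, the 60-word cap, and the example source link are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class VoiceDoc:
    max_words: int = 60                          # assumed length cap
    banned: tuple[str, ...] = ("simply", "just")


@dataclass
class ProposedEntry:
    question: str
    answer: str
    source_link: str
    violations: list[str] = field(default_factory=list)


def format_entry(question: str, answer: str, source_link: str,
                 voice: VoiceDoc) -> ProposedEntry:
    """Check the answer against the voice doc and assemble the entry."""
    violations = []
    for word in voice.banned:
        # Whole-word, case-insensitive match so "justify" is not flagged.
        if re.search(rf"\b{re.escape(word)}\b", answer, flags=re.IGNORECASE):
            violations.append(f"banned phrase: {word}")
    if len(answer.split()) > voice.max_words:
        violations.append(f"over {voice.max_words} words")
    tidy = " ".join(answer.split())  # collapse stray whitespace
    return ProposedEntry(question, tidy, source_link, violations)


entry = format_entry(
    "How long do refunds take?",
    "Just wait.  Refunds arrive within 5 business days.",
    "billing.md#refunds",
    VoiceDoc(),
)
print(entry.violations)
```

The source link travels inside the entry itself, so whatever happens to the wording, the reviewer still has the pointer back to the passage.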

For a cluster the grouper flagged as a refresh (an existing entry that keeps getting asked), the drafter does the same work but presents it as a diff: here’s the live answer, here’s the redrafted one, here’s what changed and why. The reviewer decides whether the update is an improvement.
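The "here's what changed" half of a refresh can be sketched with Python's standard difflib module. The answer_diff helper is hypothetical and the example answers are made up. The "why" behind a change would have to come from the drafter, which this sketch does not attempt.

```python
import difflib


def answer_diff(live: str, redraft: str) -> str:
    """Line-level unified diff between the live answer and the redraft."""
    lines = difflib.unified_diff(
        live.splitlines(),
        redraft.splitlines(),
        fromfile="live",
        tofile="redraft",
        lineterm="",
    )
    return "\n".join(lines)


print(answer_diff(
    "Refunds take 10 days.",
    "Refunds take 5 business days.",
))
```

If the redraft is identical to the live answer, the diff is empty, which is a cheap way to avoid putting a no-op refresh in front of a reviewer.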

Why grounding is the whole game

None of these gates are exotic. They’re the discipline a careful person would apply if they were writing the FAQ by hand: look up what your docs actually say, write only what you can back up, cite where it came from, and flag the questions your docs don’t answer instead of bluffing. Putting that discipline in code — retrieve, draft-from-passages, cite-or-refuse, format — makes it a property of the system, not something you’re hoping a model remembers to do on a given Tuesday.

Next post: how a proposed entry gets approved — the review queue, the three actions a reviewer can take, and how an approved entry reaches the live FAQ doc.
