How a policy answer stays honest

Key takeaways

Off-limits topics (pay disputes, terminations, grievances, legal or safety questions) are handed straight to a human.
The citation check makes sure the answer’s claims actually trace back to a pulled section.
The hedge check turns a thin answer into “ask HR” rather than a confident guess.
Every grounded answer ships with the exact section link, so the reader can verify it.
“I’m not sure, ask HR” is a first-class outcome, not a failure.

Four guardrails on every answer

Fig 4. Four guardrails between the draft and the staff member. Screen off-limits topics. Check every claim traces to a section. Catch thin answers and hedge to “ask HR.” Compose with the section link. Then reply, and log it.

Gate 1: the topic check

Some questions should never get an automated answer, no matter how good the docs are. “Can they fire me for this?” “I think my paycheck is wrong.” “How do I file a harassment complaint?” These are not handbook lookups; they’re moments where a person needs a person. The rules doc has an off-limits list: pay disputes, terminations, grievances, and anything that reads as a legal or safety question. Gate 1 checks the question against that list first. If it matches, the assistant doesn’t answer at all. It replies warmly and routes the person to the named human for that topic: “This is something to talk through with Priya in HR directly. Here’s how to reach her.”

This gate runs first on purpose. There’s no value in a clever answer to a question the system shouldn’t be touching, and a confident wrong answer about a termination is the kind of mistake that ends up in a lawyer’s email. When in doubt, the topic check sends it to a human.
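Here is a minimal sketch of what a topic check could look like. It assumes the off-limits list is stored as keyword patterns per topic; the patterns, the `OFF_LIMITS` table, and the `topic_check` name are made up for illustration, and a real build might use a classifier instead of keywords.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical off-limits list: topic -> (keyword pattern, named human).
# The real rules doc may be structured quite differently.
OFF_LIMITS = {
    "pay_dispute": (r"\b(paycheck|pay ?slip|underpaid|wages?)\b", "Priya in HR"),
    "termination": (r"\b(fired?|firing|terminat\w*|laid off|dismiss\w*)\b", "Priya in HR"),
    "grievance": (r"\b(harass\w*|grievance|complaint|discriminat\w*)\b", "Priya in HR"),
    "legal_or_safety": (r"\b(lawyer|legal|sue|lawsuit|unsafe|injur\w*)\b", "Priya in HR"),
}


@dataclass
class Handoff:
    topic: str
    message: str


def topic_check(question: str) -> Optional[Handoff]:
    """Return a Handoff if the question touches an off-limits topic, else None."""
    text = question.lower()
    for topic, (pattern, contact) in OFF_LIMITS.items():
        if re.search(pattern, text):
            message = f"This is something to talk through with {contact} directly."
            return Handoff(topic, message)
    return None  # safe to continue to the next gate
```

Keyword matching over-triggers (a question about fire drills would match “fire”), which suits a gate whose rule is “when in doubt, send it to a human.”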

Gate 2: the citation check

The model was told to answer only from the pulled sections, but “told to” isn’t “guaranteed to.” Gate 2 verifies it. The answerer asks the model to return its answer with the supporting section id attached to each claim, then checks that each cited section is actually one of the ones that were pulled. If the draft makes a claim that doesn’t trace back to a pulled section — a sign the model leaned on its own general knowledge — the draft is rejected and the answer becomes “ask HR.”

This is the gate that turns the grounding from Part 3 into a promise rather than a hope. The search narrowed the model to a few sections; the citation check confirms the answer actually stayed inside them. An answer that can’t point to a section it came from doesn’t ship.
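The check itself can be small. The sketch below assumes the model's draft has already been parsed into claims, each carrying the section ID it cites; the `Claim` type and function names are hypothetical, not the actual answerer code.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

ASK_HR = "I'm not certain this is covered. Please check with HR."


@dataclass
class Claim:
    text: str
    section_id: Optional[str]  # None when the model cited nothing


def citation_check(claims: List[Claim], pulled_ids: Set[str]) -> bool:
    """True only if every claim cites a section that was actually pulled."""
    if not claims:
        return False  # an empty draft has nothing to stand on
    return all(c.section_id in pulled_ids for c in claims)


def gate_2(claims: List[Claim], pulled_ids: Set[str]) -> str:
    """Pass the draft through, or replace it with the ask-HR fallback."""
    if citation_check(claims, pulled_ids):
        return " ".join(c.text for c in claims)
    return ASK_HR
```

Note what this does and doesn't verify: it confirms each cited ID is among the pulled sections, as described above. It does not confirm that the section's text supports the claim; that would need a stricter check.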

Gate 3: the hedge check

Sometimes the right sections came back, the model stayed grounded, but the answer is still shaky — the search confidence was middling, or the model itself wrote something like “it appears that…” or “this may depend on…”. Gate 3 reads those signals. A thin or hedged answer about company policy is worse than no answer, because the reader can’t tell the difference between “the handbook clearly says X” and “the assistant is guessing X.” So when the signals are weak, the gate downgrades the reply to an honest “I’m not certain this is covered — please check with HR,” optionally pointing at the closest section so the person has a head start.

The bar is set high on purpose: the assistant should sound certain only when the docs make it certain. Everywhere else, it should sound like a careful colleague who says “let me not guess at that.”
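As a rough sketch, the two signals (search confidence and hedging language in the draft) can be combined like this. The 0.75 threshold, the phrase list, and the function names are assumptions chosen for illustration; the right values depend on your search scores and on how your model tends to hedge.

```python
import re
from typing import Optional

# Hypothetical hedge phrases and threshold; tune both against real drafts.
HEDGE_PHRASES = [
    r"\bit appears\b",
    r"\bmay depend on\b",
    r"\bprobably\b",
    r"\bmight\b",
    r"\bi think\b",
]
MIN_SEARCH_CONFIDENCE = 0.75

ASK_HR = "I'm not certain this is covered. Please check with HR."


def is_hedged(draft: str) -> bool:
    """True if the draft contains any of the hedge phrases."""
    text = draft.lower()
    return any(re.search(pattern, text) for pattern in HEDGE_PHRASES)


def hedge_check(
    draft: str,
    search_confidence: float,
    closest_section: Optional[str] = None,
) -> str:
    """Return the draft if both signals are strong, else the ask-HR reply."""
    if search_confidence >= MIN_SEARCH_CONFIDENCE and not is_hedged(draft):
        return draft
    if closest_section:
        return f"{ASK_HR} The closest section I found is {closest_section}."
    return ASK_HR
```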

Gate 4: compose with the section link, then ship

The answer survived the first three gates. Gate 4 dresses it for delivery. The voice doc sets the tone — short, kind, plain words, no legalese — and the gate formats the answer to match. Then it attaches the exact section link: a deep link back to the handbook section the answer came from, so the reader can open the source and confirm it in one click. Below that goes a small standing footer: “Not what you needed? Ask HR.” with the right contact. Even a perfect grounded answer offers the human path, because the assistant’s job is to help, not to be the last word.
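A composer along these lines would produce that shape. This is a sketch only: the `Section` fields, the layout of the reply, and the example URL and contact in the usage note are invented for illustration, and tone rewriting from the voice doc is left out.

```python
from dataclasses import dataclass


@dataclass
class Section:
    section_id: str
    title: str
    url: str  # deep link to the exact handbook section


def compose_reply(answer: str, section: Section, hr_contact: str) -> str:
    """Assemble the final reply: answer, source link, standing footer."""
    lines = [
        answer.strip(),
        "",
        f"Source: {section.title} ({section.url})",
        "",
        f"Not what you needed? Ask HR: {hr_contact}",
    ]
    return "\n".join(lines)
```

For example, `compose_reply("You can carry over up to five days.", Section("leave-3.1", "Annual leave", "https://handbook.example.com/leave#3-1"), "hr@example.com")` returns the answer, then the source line, then the footer, separated by blank lines.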

Then it ships — back into the Slack thread or out as an email reply, whichever lane the question arrived on — and the whole exchange is written to a log. That log is quietly valuable: it’s the list of what staff actually ask, which questions hit “ask HR” most often (a sign the handbook has a gap), and which sections answer the most questions. Part 5 uses that log to keep the handbook sharp.
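To make the log's value concrete, here is one way those three readings could be pulled out of it. The `LogEntry` fields and outcome labels are assumptions; the real log schema isn't specified here.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogEntry:
    question: str
    lane: str     # "slack" or "email"
    outcome: str  # hypothetical labels: "answered", "ask_hr", "handoff"
    section_id: Optional[str] = None


def summarize(entries: List[LogEntry]) -> dict:
    """Count outcomes, rank the most-used sections, list the ask-HR questions."""
    outcomes = Counter(e.outcome for e in entries)
    sections = Counter(
        e.section_id
        for e in entries
        if e.outcome == "answered" and e.section_id
    )
    gaps = [e.question for e in entries if e.outcome == "ask_hr"]
    return {
        "outcomes": dict(outcomes),
        "top_sections": sections.most_common(3),
        "ask_hr_questions": gaps,  # candidates for handbook gaps
    }
```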

Why the guardrails exist

None of these gates are clever. They’re the care a thoughtful HR person would take if they were answering each message themselves — don’t touch the question that needs a real conversation, don’t state a rule you can’t point to, don’t pretend to be sure when you’re not, and always show your work. Writing them as four small sequential gates makes that care part of the system instead of something you hope the model remembers. The result is an assistant that’s genuinely useful on the easy 80% of questions and genuinely safe on the hard 20%, because the hard ones get a human.
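The “four small sequential gates” shape can be sketched as a short-circuiting loop: each gate either lets the draft through or returns the reply that replaces it. Everything below (the context keys, the stub gates, the 0.75 threshold) is a simplified stand-in for illustration, not the actual answerer.

```python
from typing import Any, Callable, Dict, List, Optional

Context = Dict[str, Any]
# A gate returns None to pass, or the reply that should be sent instead.
Gate = Callable[[Context], Optional[str]]

ASK_HR = "I'm not certain this is covered. Please check with HR."


def run_gates(ctx: Context, gates: List[Gate], compose: Callable[[Context], str]) -> str:
    """Run gates in order; the first one that objects decides the reply."""
    for gate in gates:
        fallback = gate(ctx)
        if fallback is not None:
            return fallback
    return compose(ctx)


# Stub gates, deliberately simplified stand-ins for Gates 1 to 3.
def topic_gate(ctx: Context) -> Optional[str]:
    off_limits = "fire" in ctx["question"].lower()
    return "This is something to talk through with HR directly." if off_limits else None


def citation_gate(ctx: Context) -> Optional[str]:
    cited, pulled = set(ctx["cited_ids"]), set(ctx["pulled_ids"])
    return None if cited and cited <= pulled else ASK_HR


def hedge_gate(ctx: Context) -> Optional[str]:
    return None if ctx["confidence"] >= 0.75 else ASK_HR


def compose(ctx: Context) -> str:
    return f"{ctx['draft']}\nSource: {ctx['link']}"


GATES = [topic_gate, citation_gate, hedge_gate]
```

Because the loop stops at the first objection, an off-limits question never reaches the citation or hedge checks, which matches the reason Gate 1 runs first.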

Next post: how the handbook stays current — how a policy edit in Drive flows into the index within minutes, so the answerer never quotes a rule that changed last week.
