How a handoff to a human works
The handoff is the most important part of a chat assistant, and the part most assistants get wrong. The bad version makes the visitor repeat themselves to a human; the worse version drops the transcript on the floor; the worst version pretends to escalate and never does. Here’s how to design a clean one.
Key takeaways
- Four steps in fixed order: tell the visitor with a realistic window, package the transcript, deliver to one destination, hold the session for follow-up turns.
- The visitor is told before the backend work starts — honest window (“within business hours,” not “within an hour” at 9pm Friday) and human language, not “a ticket has been created.”
- The package always carries five things: full transcript (no summary substitute), one-line AI-written summary, page URL, contact, and the reason for handoff (out of remit, low confidence, explicit request, or unsafe).
- One destination per business: inbox, Slack, or shared queue, never more than one. Two destinations means two people see it, both assume the other will reply, and neither does.
- The websocket stays open for a couple more turns to catch “actually never mind, I figured it out” — appended as a follow-up note so the human knows not to send a wasted reply.
What a handoff has to do, in order
From the moment the assistant picks the “hand off” tool, four things have to happen, in this order, before the visitor closes the tab:
Step 1: tell the visitor first
The very first thing that happens, before any backend work, is that the assistant streams a short, plain line back to the visitor: “Got it — I’ll have a human follow up. We usually reply within business hours.” Two things matter here. The window is honest (not “within an hour” if it’s 9pm Friday) and the language is human (not “a ticket has been created”).
If the assistant doesn’t already have a way to reach the visitor — no email captured earlier, no phone — this is the moment to ask, in the same chat. One question: “What’s the best email to follow up on?” Not a form, not a popup. Just one line in the chat the visitor can reply to. The point is to make “getting in touch” feel like part of the conversation, not a fee for getting an answer.
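As a rough illustration of the "honest window" idea, here is a minimal sketch that picks the wording from the visitor's local time and appends the one-line contact question only when no contact has been captured. The business hours, the function names, and the exact phrasing are assumptions made for the example, not part of any particular product.

```python
from datetime import datetime, time

# Assumed business hours for illustration: Monday to Friday, 09:00 to 17:00 local time.
OPEN, CLOSE = time(9, 0), time(17, 0)


def reply_window(now: datetime) -> str:
    """Return an honest reply window for the given local time."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    if is_weekday and OPEN <= now.time() < CLOSE:
        return "within business hours today"
    return "on the next business day"


def handoff_message(now: datetime, has_contact: bool) -> str:
    """Build the plain line streamed to the visitor before any backend work."""
    msg = f"Got it. I'll have a human follow up. We usually reply {reply_window(now)}."
    if not has_contact:
        # One question in the chat itself, not a form or a popup.
        msg += " What's the best email to follow up on?"
    return msg
```

Calling `handoff_message` at 9pm on a Friday yields the "next business day" wording rather than promising a reply within the hour.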
Step 2: package the transcript
Once the visitor knows a human is coming, the cloud assembles a small payload. The contents are unglamorous but specific:
- The full transcript — every visitor turn and every assistant turn from this session. Not a summary, not the “important parts.” The whole thing, so the human can read it the way the visitor experienced it.
- A one-line summary — the AI’s last useful job. What did the visitor want? “Visitor asked about a refund on order #12345 placed two weeks ago, after we couldn’t find a matching policy.” This is what the human reads in the notification preview.
- Where they were — the page URL the chat opened on. “On the pricing page” is meaningfully different from “on the contact page.”
- How to reach them — the email or phone they shared, or a session reference if they’re still on the page.
- Why the handoff happened — one of: out of remit, low confidence, explicit request from the visitor, or hostile/unsafe message. The team uses this to spot patterns.
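The five items above map naturally onto a small record type. The sketch below is one possible shape, assuming a transcript stored as (speaker, text) pairs; the class and field names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class HandoffReason(Enum):
    OUT_OF_REMIT = "out_of_remit"
    LOW_CONFIDENCE = "low_confidence"
    EXPLICIT_REQUEST = "explicit_request"
    UNSAFE = "unsafe"


@dataclass
class HandoffPackage:
    transcript: List[Tuple[str, str]]  # (speaker, text) for every turn, in order
    summary: str                       # one line, shown in the notification preview
    page_url: str                      # the page the chat opened on
    contact: str                       # email, phone, or a session reference
    reason: HandoffReason

    def __post_init__(self) -> None:
        # The summary supplements the transcript; it never replaces it.
        if not self.transcript:
            raise ValueError("the full transcript is required")
        summary = self.summary.strip()
        if not summary or "\n" in summary:
            raise ValueError("summary must be a single non-empty line")
        if not self.contact:
            raise ValueError("a contact or session reference is required")
```

Validating at construction time means an incomplete package fails loudly before it reaches the team, instead of arriving as a notification nobody can act on.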
Step 3: deliver to one place
The package goes to exactly one destination. The team’s existing inbox, or Slack channel, or shared queue — whichever they’re already watching. Not the inbox and Slack: in practice that means two people see it, both assume the other will reply, and neither does. Pick one place per business and route everything there.
The format matches the destination. To Slack, the summary becomes a one-line preview with a button to expand the full transcript. To email, the summary is the subject line and the transcript is the body. To a queue, it’s a structured row. Same payload underneath, three packagings.
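"Same payload, three packagings" can be sketched as a single formatting function. The dictionary keys below (`preview`, `expandable`, `subject`, `body`) are illustrative placeholders, not the field names of any real Slack or email API, and the payload is assumed to be a plain dict with the five fields from step 2.

```python
def format_for(destination: str, pkg: dict) -> dict:
    """Shape one handoff payload for the single configured destination."""
    transcript_text = "\n".join(
        f"{speaker}: {text}" for speaker, text in pkg["transcript"]
    )
    if destination == "slack":
        # One-line preview, with the full transcript behind an expand button.
        return {"preview": pkg["summary"], "expandable": transcript_text}
    if destination == "email":
        # Summary as the subject line, transcript as the body.
        return {"subject": pkg["summary"], "body": transcript_text}
    if destination == "queue":
        # A structured row: every field kept as its own column.
        return {
            "summary": pkg["summary"],
            "transcript": transcript_text,
            "page_url": pkg["page_url"],
            "contact": pkg["contact"],
            "reason": pkg["reason"],
        }
    raise ValueError(f"unknown destination: {destination}")
```

Taking a single `destination` string, rather than a list, keeps the one-place rule visible in the function signature.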
Step 4: hold the session for a moment
After step three, the websocket doesn’t close immediately. The visitor often types one more thing — the order number they forgot to include, the email they meant to send, “actually never mind, I figured it out.” The session stays open for a couple of turns; any new turns get appended to the transcript and forwarded as a follow-up note to the same destination. After that, the websocket closes and the session ends.
The “actually never mind” case is the underrated one. The system should let the visitor cancel cleanly — the assistant acknowledges, the team gets a small “visitor self-resolved” note appended to the original handoff, and the human knows not to send a wasted follow-up.
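The hold-open behavior can be modeled as a small state machine. This is a sketch under stated assumptions: "a couple of turns" is taken to mean two, self-resolution is detected with a naive keyword match, and the class name and note format are hypothetical. A real system would likely let the assistant itself classify the message, and would close the websocket where this sketch sets `closed`.

```python
from typing import List, Optional

# Assumed phrases suggesting the visitor no longer needs a reply.
RESOLVED_HINTS = ("never mind", "nevermind", "figured it out")


class HandoffSession:
    """Tracks the short window after delivery in which new turns are still forwarded."""

    def __init__(self, transcript: List[str], max_follow_ups: int = 2) -> None:
        self.transcript = transcript
        self.remaining = max_follow_ups
        self.closed = max_follow_ups <= 0

    def on_visitor_turn(self, text: str) -> Optional[dict]:
        """Return the follow-up note to forward, or None once the session has ended."""
        if self.closed:
            return None
        self.transcript.append(text)
        self.remaining -= 1
        resolved = any(hint in text.lower() for hint in RESOLVED_HINTS)
        if resolved or self.remaining <= 0:
            self.closed = True  # the point where the websocket would close
        kind = "visitor_self_resolved" if resolved else "follow_up"
        return {"kind": kind, "text": text}
```

Each returned note would be appended to the original handoff at the same destination, so the human sees the forgotten order number or the "never mind" in the same thread.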
What this is not
This is deliberately not a full ticketing system. There’s no priority queue, no reply-time clock, no auto-assignment, no agent dashboard. For a small business, those are too much on day one and a chore every day after. The handoff is a thin pipe from the chat into your existing inbox; the inbox you already use becomes the “dashboard.” If you outgrow that, you can plug in a real ticketing tool later, but most businesses never need to.