Part 4 of 7 · Website chat assistant series ~4 min read

How a handoff to a human works

The handoff is the most important part of a chat assistant, and the part most assistants get wrong. The bad version makes the visitor repeat themselves to a human; the worse version drops the transcript on the floor; the worst version pretends to escalate and never does. Here’s how to design a clean one.

Key takeaways

  • Four steps in fixed order: tell the visitor with a realistic window, package the transcript, deliver to one destination, hold the session for follow-up turns.
  • The visitor is told before the backend work starts — honest window (“within business hours,” not “within an hour” at 9pm Friday) and human language, not “a ticket has been created.”
  • The package always carries five things: full transcript (no summary substitute), one-line AI-written summary, page URL, contact, and the reason for handoff (out of remit, low confidence, explicit request, or unsafe).
  • One destination per business: inbox, Slack, or shared queue, never more than one. Two destinations means two people see it, both assume the other will reply, and neither does.
  • The websocket stays open for a couple more turns to catch “actually never mind, I figured it out” — appended as a follow-up note so the human knows not to send a wasted reply.

What a handoff has to do, in order

From the moment the assistant picks the “hand off” tool, four things have to happen, in this order, before the visitor closes the tab:

Figure: Four moves on every handoff. A vertical flow of four numbered steps: (1) Tell the visitor, (2) Package the transcript, (3) Deliver to one place, (4) Hold the rest of the session. A banner below reads: “The visitor never repeats themselves to the human. The transcript is the ticket.”
Fig 4. Four steps. The visitor experience is built around “tell first, deliver second” — the message lands before the queue does.
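
The fixed order can be sketched as a small orchestration function. This is a minimal illustration, not the series' actual implementation: `run_handoff` and its four callables are hypothetical names, and each callable stands in for whatever the real widget and cloud do at that step.

```python
from typing import Callable, List


def run_handoff(
    tell_visitor: Callable[[], None],
    package_transcript: Callable[[], dict],
    deliver: Callable[[dict], None],
    hold_session: Callable[[], None],
) -> List[str]:
    """Run the four handoff steps in fixed order and return the order run."""
    completed: List[str] = []

    tell_visitor()  # Step 1: the visitor hears first, before any backend work
    completed.append("tell")

    payload = package_transcript()  # Step 2: assemble the payload
    completed.append("package")

    deliver(payload)  # Step 3: send to the single destination
    completed.append("deliver")

    hold_session()  # Step 4: keep the session open for a few more turns
    completed.append("hold")

    return completed
```

Keeping the sequence in one function makes it hard to reorder by accident: the message to the visitor is always sent before packaging or delivery begins.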

Step 1: tell the visitor first

The very first thing that happens, before any backend work, is that the assistant streams a short, plain line back to the visitor: “Got it — I’ll have a human follow up. We usually reply within business hours.” Two things matter here. The window is honest (not “within an hour” if it’s 9pm Friday) and the language is human (not “a ticket has been created”).

If the assistant doesn’t already have a way to reach the visitor — no email captured earlier, no phone — this is the moment to ask, in the same chat. One question: “What’s the best email to follow up on?” Not a form, not a popup. Just one line in the chat the visitor can reply to. The point is to make “getting in touch” feel like part of the conversation, not a fee for getting an answer.
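
As a rough sketch of step 1, the function below picks an honest window from the current time and adds the single contact question only when nothing was captured. The business hours (Monday to Friday, 09:00 to 17:00), the out-of-hours wording, and the names `is_business_hours` and `acknowledgement_lines` are all assumptions made for illustration.

```python
from datetime import datetime
from typing import List, Optional

# Hypothetical business hours: Monday to Friday, 09:00 to 17:00 local time.
OPEN_HOUR, CLOSE_HOUR = 9, 17


def is_business_hours(now: datetime) -> bool:
    """True on weekdays between the opening and closing hour."""
    return now.weekday() < 5 and OPEN_HOUR <= now.hour < CLOSE_HOUR


def acknowledgement_lines(now: datetime, contact: Optional[str]) -> List[str]:
    """Lines the assistant streams before any backend work starts."""
    if is_business_hours(now):
        window = "We usually reply within business hours."
    else:
        # Assumed wording: avoid promising a fast reply when nobody is in.
        window = "We're closed right now, so expect a reply on the next business day."
    lines = [f"Got it, I'll have a human follow up. {window}"]
    if contact is None:
        # One question in the chat itself, not a form or a popup.
        lines.append("What's the best email to follow up on?")
    return lines
```

At 9pm on a Friday with no contact on file, this yields two lines: the acknowledgement with the next-business-day window, then the single email question.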

Step 2: package the transcript

Once the visitor knows a human is coming, the cloud assembles a small payload. The contents are unglamorous but specific:

  • The full transcript — every visitor turn and every assistant turn from this session. Not a summary, not the “important parts.” The whole thing, so the human can read it the way the visitor experienced it.
  • A one-line summary — the AI’s last useful job. What did the visitor want? “Visitor asked about a refund on order #12345 placed two weeks ago, after we couldn’t find a matching policy.” This is what the human reads in the notification preview.
  • Where they were — the page URL the chat opened on. “On the pricing page” is meaningfully different from “on the contact page.”
  • How to reach them — the email or phone they shared, or a session reference if they’re still on the page.
  • Why the handoff happened — one of: out of remit, low confidence, explicit request from the visitor, or hostile/unsafe message. The team uses this to spot patterns.
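
The five items above map naturally onto a small data structure. The sketch below is one possible shape, not the actual schema: the class and field names (`HandoffPayload`, `Turn`, `HandoffReason`) and the validation rules are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class HandoffReason(str, Enum):
    """The four reasons listed above; the string values are assumed."""
    OUT_OF_REMIT = "out_of_remit"
    LOW_CONFIDENCE = "low_confidence"
    EXPLICIT_REQUEST = "explicit_request"
    UNSAFE = "unsafe"


@dataclass
class Turn:
    role: str  # "visitor" or "assistant"
    text: str


@dataclass
class HandoffPayload:
    transcript: List[Turn]  # every turn from the session, never a summary
    summary: str            # one line, shown in the notification preview
    page_url: str           # the page the chat opened on
    contact: str            # email, phone, or a session reference
    reason: HandoffReason   # why the handoff happened

    def __post_init__(self) -> None:
        # Reject payloads that break the two rules stated in the text.
        if not self.transcript:
            raise ValueError("transcript must contain the full conversation")
        if "\n" in self.summary.strip():
            raise ValueError("summary must be a single line")
```

Making the reason an enumeration rather than free text is what lets the team count handoffs per reason and spot patterns later.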

Step 3: deliver to one place

The package goes to exactly one destination. The team’s existing inbox, or Slack channel, or shared queue — whichever they’re already watching. Not the inbox and Slack: in practice that means two people see it, both assume the other will reply, and neither does. Pick one place per business and route everything there.
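
One way to keep this rule from eroding over time is to enforce it when the configuration is loaded. The sketch below assumes a hypothetical config dictionary keyed by destination kind; `pick_destination` and the key names are illustrative, not part of any real product.

```python
from typing import Dict, Optional, Tuple

# Hypothetical destination kinds, matching the three options in the text.
ALLOWED = ("inbox", "slack", "queue")


def pick_destination(config: Dict[str, Optional[str]]) -> Tuple[str, str]:
    """Return (kind, target), insisting that exactly one destination is set."""
    configured = [(k, v) for k, v in config.items() if k in ALLOWED and v]
    if len(configured) != 1:
        raise ValueError(
            f"expected exactly one destination, got {len(configured)}"
        )
    return configured[0]
```

With this check, a config that sets both an inbox and a Slack channel fails at startup instead of quietly producing two notifications that nobody owns.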

The format matches the destination. To Slack, the summary becomes a one-line preview with a button to expand the full transcript. To email, the summary is the subject line and the transcript is the body. To a queue, it’s a structured row. Same payload underneath, three packagings.
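
The "same payload, three packagings" idea might look like the sketch below. It deliberately returns plain dictionaries rather than calling any real Slack or email API; the `render` function and every output key are assumptions chosen for illustration.

```python
from typing import Any, Dict, List


def transcript_text(turns: List[Dict[str, str]]) -> str:
    """Flatten the full transcript, one turn per line, in original order."""
    return "\n".join(f"{t['role']}: {t['text']}" for t in turns)


def render(kind: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Package one payload for the single configured destination."""
    body = transcript_text(payload["transcript"])
    if kind == "slack":
        # Summary as the preview; the transcript sits behind an expand button.
        return {
            "preview": payload["summary"],
            "expand_button": "View transcript",
            "transcript": body,
        }
    if kind == "inbox":
        # Summary as the subject line, transcript as the email body.
        return {"subject": payload["summary"], "body": body}
    if kind == "queue":
        # A structured row with one column per payload field.
        row = {k: payload[k] for k in ("summary", "page_url", "contact", "reason")}
        row["transcript"] = body
        return row
    raise ValueError(f"unknown destination kind: {kind}")
```

Because the three branches read from the same payload, switching a business from email to Slack changes only the `kind` argument, not what gets collected.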

Step 4: hold the session for a moment

After step three, the websocket doesn’t close immediately. The visitor often types one more thing — the order number they forgot to include, the email they meant to send, “actually never mind, I figured it out.” The session stays open for a couple of turns; any new turns get appended to the transcript and forwarded as a follow-up note to the same destination. After that, the websocket closes and the session ends.

The “actually never mind” case is the underrated one. The system should let the visitor cancel cleanly — the assistant acknowledges, the team gets a small “visitor self-resolved” note appended to the original handoff, and the human knows not to send a wasted follow-up.
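
The hold window can be sketched as a small state object that counts late turns. Everything here is an assumption for illustration: the `HeldSession` name, the limit of two extra turns (the text only says "a couple"), and the `self_resolved` flag, which the caller sets however it detects a "never mind" message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HeldSession:
    transcript: List[str]
    max_extra_turns: int = 2  # "a couple of turns"; the exact number is assumed
    extra_turns: int = 0
    closed: bool = False
    # Follow-up notes forwarded to the same destination as the handoff.
    notes: List[str] = field(default_factory=list)

    def on_visitor_turn(self, text: str, self_resolved: bool = False) -> Optional[str]:
        """Record a late turn and return its follow-up note, or None if closed."""
        if self.closed:
            return None
        self.transcript.append(text)
        self.extra_turns += 1
        note = "visitor self-resolved" if self_resolved else f"follow-up: {text}"
        self.notes.append(note)
        if self.extra_turns >= self.max_extra_turns:
            self.closed = True  # the websocket would close after this turn
        return note
```

A forgotten order number arrives as a `follow-up:` note, while a "never mind" arrives as `visitor self-resolved`, so the human can tell at a glance whether a reply is still needed.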

What this is not

This is deliberately not a full ticketing system. There’s no priority queue, no reply-time clock, no auto-assignment, no agent dashboard. For a small business, those are too much on day one and a chore every day after. The handoff is a thin pipe from the chat into your existing inbox; the inbox you already use becomes the “dashboard.” If you outgrow that, you can plug in a real ticketing tool later, but most businesses never need to.
