Part 6 of 7 · Website chat assistant series

What the chat assistant costs

About the price of a coffee a month at typical SMB volume. The fixed cost is essentially zero: if the widget is quiet for a week, so is the bill for that week. The variable cost is dominated by Bedrock tokens for the answerer and by managed knowledge-base queries; everything else rounds to pennies.

[Figure: Cost structure, three tiers]

Tier 1, "Always-free or near it": Lambda invocations (~ $0.00), DynamoDB pay-per-request for sessions and scratchpads (~ $0.05), S3 storage for transcripts at small volume (~ $0.05), CloudWatch log retention at 7 days (~ $0.10), EventBridge schedules for the weekly grouping job (~ $0.00), the SNS topic for handoff notifications, and the API Gateway WebSocket idle minutes.

Tier 2, "Per-conversation pennies": API Gateway WebSocket messages (~ $1 per million), Bedrock Haiku tokens for the answerer (fractions of a cent per turn), Bedrock managed knowledge-base queries (a small fixed cost per query), and SES outbound for handoff emails ($0.10 per 1000).

Tier 3, "Optional": SMS notifications via SNS or a third-party SMS API (cents per message, off by default), a custom domain for the WebSocket endpoint (your registrar), SES domain identity (free), and Workspace OAuth (free).

Bottom note: ~ 500 conversations a month → under five dollars total, often under three.
Fig 6. Three tiers of cost. The bill scales with how often the widget gets opened; the floor is nearly zero.

The fixed cost is essentially nothing

There is no per-seat licence and no minimum monthly fee. If your site has a quiet week, the AWS bill matches. Lambda, DynamoDB pay-per-request, S3, CloudWatch, EventBridge, and SNS all sit in or near the always-free tier at the volumes a small business chat sees. The biggest single line item in tier one is usually CloudWatch log storage, and even that you control by retaining seven days instead of forever.
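The seven-day retention mentioned above can be set with a single CloudWatch Logs call. The following is a minimal sketch, assuming boto3 is the SDK in use; the helper name `cap_log_retention` and the log group name in the usage comment are hypothetical, not taken from this series.

```python
# A subset of the retention periods (in days) that CloudWatch Logs accepts.
ALLOWED_RETENTION_DAYS = {1, 3, 5, 7, 14, 30, 60, 90}


def cap_log_retention(logs_client, log_group_name, days=7):
    """Expire log events after `days` so log storage stops growing."""
    if days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"unsupported retention period: {days}")
    # put_retention_policy is the CloudWatch Logs API call; the client is
    # passed in so the helper can be exercised without AWS credentials.
    logs_client.put_retention_policy(
        logGroupName=log_group_name,
        retentionInDays=days,
    )
    return days


# Usage (needs boto3 and AWS credentials; the log group name is hypothetical):
#   import boto3
#   cap_log_retention(boto3.client("logs"), "/aws/lambda/chat-answerer")
```

Passing the client in, rather than creating it inside the helper, keeps the sketch easy to try against a stub before pointing it at a real account.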

The API Gateway WebSocket has a tiny per-connection-minute charge while a session is open. At a few minutes per conversation and a few hundred conversations a month, it rounds to under fifty cents.

The variable cost is per-conversation pennies

Three things scale with traffic:

  • WebSocket messages: about $1 per million. A conversation is typically tens of messages (each visitor turn plus a streamed reply), so a hundred conversations is a fraction of a cent.
  • Bedrock Haiku tokens: the answerer reads a short prompt and writes a short reply on each turn. At small-model rates, this lands at a fraction of a cent per turn. A 5-turn conversation is still pennies.
  • Knowledge-base queries: one query per turn. The managed retrieval has a small fixed cost per query, plus a small token cost for the retrieved passages that get added to the prompt. Together: still cents at hundreds of conversations.

Add it up: most small businesses end up between $1 and $5 a month total at a few hundred conversations. The widget pays for itself the first weekend it answers “do you ship to Canada?” without a human on call.
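The addition can be sketched as a small estimator. The WebSocket and SES rates and the tier-one amounts come from this post (the tier-one figures are read as monthly amounts, which is an assumption). The per-turn answerer cost and the per-query knowledge-base cost are placeholders chosen only to illustrate "fractions of a cent"; they are not published prices, so substitute current AWS rates before relying on the result. `Rates` and `monthly_estimate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    # Figures quoted in the post.
    ws_per_million_messages: float = 1.00
    ses_per_thousand_emails: float = 0.10
    # DynamoDB + S3 + CloudWatch amounts from Fig 6, read as monthly (assumption).
    tier_one_floor: float = 0.20
    # Placeholders, NOT published prices: replace with current AWS rates.
    answerer_per_turn: float = 0.0005
    kb_per_query: float = 0.0005


def monthly_estimate(conversations, turns_per_conversation=5,
                     messages_per_conversation=20, handoff_emails=0,
                     rates=Rates()):
    """Return a per-line breakdown of the monthly bill in dollars."""
    turns = conversations * turns_per_conversation
    messages = conversations * messages_per_conversation
    lines = {
        "tier one floor": rates.tier_one_floor,
        "websocket messages": messages * rates.ws_per_million_messages / 1_000_000,
        "answerer tokens": turns * rates.answerer_per_turn,
        # One knowledge-base query per turn, as described above.
        "knowledge-base queries": turns * rates.kb_per_query,
        "handoff emails": handoff_emails * rates.ses_per_thousand_emails / 1_000,
    }
    lines["total"] = sum(lines.values())
    return lines


if __name__ == "__main__":
    for name, dollars in monthly_estimate(500).items():
        print(f"{name:>24}: ${dollars:.2f}")
```

With these placeholder rates, 500 five-turn conversations come to about $2.71, and the two per-turn lines make up nearly all of it. That matches the shape of the claim above: the total moves with turns, not with message count.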

Three traps you’re avoiding

  • Per-seat live-chat tools. Most off-the-shelf chat-widget products charge $20–$80 per agent per month, regardless of volume. You’re trading a flat per-seat bill for pay-per-use that mostly comes in pennies.
  • Running a vector store yourself. The managed knowledge base avoids the “tiny database that costs $40 a month and you have to babysit” trap. You write Drive docs; the index follows.
  • Long scratchpads. Memory that grows unbounded multiplies your token bill turn after turn. Trimming to the last few turns keeps cost flat.
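The scratchpad point is easy to see with numbers. The sketch below assumes each turn adds a fixed number of tokens (100, purely illustrative) and that every request resends the kept history plus the new turn. `trim_scratchpad` and `prompt_tokens_sent` are hypothetical helpers, not code from this series.

```python
def trim_scratchpad(turns, keep_last=3):
    """Keep only the most recent `keep_last` turns of conversation memory."""
    if keep_last <= 0:
        return []
    return turns[-keep_last:]


def prompt_tokens_sent(n_turns, tokens_per_turn=100, keep_last=None):
    """Total prompt tokens sent over a conversation of `n_turns` turns.

    Each request carries the kept history plus the current turn.
    keep_last=None means the history is never trimmed.
    """
    total = 0
    for turn in range(1, n_turns + 1):
        history = turn - 1
        if keep_last is not None:
            history = min(history, keep_last)
        total += (history + 1) * tokens_per_turn
    return total


if __name__ == "__main__":
    print(trim_scratchpad(["t1", "t2", "t3", "t4", "t5"]))  # ['t3', 't4', 't5']
    print(prompt_tokens_sent(10))               # 5500, grows with every turn
    print(prompt_tokens_sent(10, keep_last=3))  # 3400, flat per turn after turn 4
```

Untrimmed, the total grows quadratically with conversation length; with a fixed window, each turn past the window costs the same as the one before it.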

When this stops being cheap

The math changes if conversation volume goes up by an order of magnitude or two: thousands to tens of thousands of conversations a month, with deep multi-turn threads. At that point Bedrock token spend becomes the headline number, not a footnote. There are levers: tighter scratchpad windows, smaller retrieved-passage budgets, an inline cache for repeated questions. Most SMBs never need them.
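As one illustration of the cache lever, here is a minimal in-memory sketch. It assumes the answerer can be wrapped as a plain function from question to answer (`answer_fn` is hypothetical) and that a repeated question deserves the same reply regardless of session, which only holds for context-free questions such as shipping or opening hours. The normalization is deliberately crude and ASCII-only.

```python
import re


def normalize(question):
    """Lowercase, drop punctuation, and collapse whitespace (ASCII-only)."""
    lowered = question.lower()
    no_punct = re.sub(r"[^a-z0-9\s]", "", lowered)
    return " ".join(no_punct.split())


class AnswerCache:
    """Reuse answers for repeated questions instead of paying for tokens again."""

    def __init__(self, answer_fn, max_entries=256):
        self._answer_fn = answer_fn  # hypothetical: calls the model + retrieval
        self._max_entries = max_entries
        self._store = {}

    def ask(self, question):
        key = normalize(question)
        if key in self._store:
            return self._store[key]
        answer = self._answer_fn(question)
        if len(self._store) >= self._max_entries:
            # Evict the oldest entry; dicts preserve insertion order.
            self._store.pop(next(iter(self._store)))
        self._store[key] = answer
        return answer
```

A cache like this lives only as long as the process that holds it, so in a Lambda setting it would help within a warm container and not across cold starts; a shared store would be a separate design decision.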

For everyone below that, and that’s most small businesses, a $10 monthly alert in AWS Budgets catches anything strange before it becomes a surprise on the credit card.
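A budget alert like this can be created from code as well as from the console. The sketch below builds the request for the AWS Budgets `create_budget` call via boto3; the budget name, account ID, and email address are hypothetical, and the alert is set to fire when actual spend passes 100% of the $10 limit, which is one reasonable reading of the setup described here.

```python
def budget_request(account_id, email, limit_usd="10", threshold_pct=100.0):
    """Build keyword arguments for the AWS Budgets create_budget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "chat-assistant-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": limit_usd, "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": threshold_pct,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": email},
                ],
            }
        ],
    }


def create_budget_alarm(account_id, email):
    """Send the request; needs boto3 and credentials allowed to manage budgets."""
    import boto3  # imported here so budget_request works without boto3 installed

    boto3.client("budgets").create_budget(**budget_request(account_id, email))


# Usage (hypothetical account ID and address):
#   create_budget_alarm("123456789012", "owner@example.com")
```

Lowering `threshold_pct` gives an earlier warning, and a second notification with `NotificationType` set to `FORECASTED` would warn on projected rather than actual spend.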

In plain words

The fixed bill is nearly zero. The variable bill is cents per conversation, dominated by tokens. A typical small-business setup runs at coffee-money for the whole month. Set a budget alarm that fits your expected volume and the bill can’t surprise you.
