What the chat assistant costs
A coffee a month at typical SMB volume. The fixed cost is essentially zero — if the widget is quiet for a week, the bill for that week is too. The variable cost is dominated by Bedrock tokens for the answerer and the managed knowledge-base queries; everything else rounds to pennies.
Key takeaways
- Three cost tiers: always-free (Lambda, DynamoDB on-demand, S3, CloudWatch 7-day, EventBridge, SNS), per-conversation pennies (WebSocket messages, Bedrock Haiku tokens, KB queries, SES), and optional (SMS, custom WS domain).
- API Gateway WebSocket messages run about $1 per million; a hundred conversations is a fraction of a cent.
- Bedrock Claude Haiku 4.5 tokens dominate the variable bill — fractions of a cent per turn at small-model rates; a 5-turn conversation is still pennies.
- Vector store is Amazon S3 Vectors (managed) with no idle minimum — quiet sites pay nothing for retrieval, unlike OpenSearch Serverless or Aurora pgvector.
- A typical SMB at ~500 conversations a month lands under five dollars total, often under three. A $10 monthly AWS Budgets alarm catches anything strange.
The fixed cost is essentially nothing
There is no per-seat licence and no minimum monthly fee. If your site has a quiet week, the AWS bill matches. Lambda, DynamoDB pay-per-request, S3, CloudWatch, EventBridge, and SNS all sit in or near the always-free tier at the volumes a small business chat sees. The biggest single line item in tier one is usually CloudWatch log storage, and even that you control by retaining seven days instead of forever.
The API Gateway WebSocket has a tiny per-connection-minute charge while a session is open. At a few minutes per conversation and a few hundred conversations a month, it rounds to under fifty cents.
The variable cost is per-conversation pennies
Three things scale with traffic:
- WebSocket messages — about $1 per million. A conversation is typically tens of messages (each visitor turn plus a streamed reply), so a hundred conversations is a fraction of a cent.
- Bedrock Haiku tokens — the answerer reads a short prompt and writes a short reply on each turn. At small-model rates, this lands at a fraction of a cent per turn. A 5-turn conversation is still pennies.
- Knowledge-base queries — one query per turn. The managed retrieval (S3 Vectors as the default vector store, since mid-2025) has a small per-query cost and a tiny token cost for the embedded passages. S3 Vectors has no idle minimum — you only pay when the widget actually runs a query — so a quiet site costs nothing. Together: still cents at hundreds of conversations.
Add it up: most small businesses end up between $1 and $5 a month total at a few hundred conversations. The widget pays for itself the first weekend it answers “do you ship to Canada?” without a human on call.
Three traps you’re avoiding
- Per-seat live-chat tools. Most off-the-shelf chat-widget products charge $20–$80 per agent per month, regardless of volume. You’re trading a flat per-seat bill for pay-per-use that mostly comes in pennies.
- Running a vector store yourself. The managed knowledge base avoids the “tiny database that costs $40 a month and you have to babysit” trap. With S3 Vectors as the default, even the managed option has no idle floor — quiet sites pay nothing for retrieval. You write Drive docs; the index follows.
- Long scratchpads. Memory that grows unbounded multiplies your token bill turn after turn. Trimming to the last few turns keeps cost flat.
When this stops being cheap
The math changes if traffic grows ten or a hundred times — thousands of conversations a month with long multi-turn threads. At that point Bedrock token spend becomes the headline number, not a footnote. There are levers to pull: shorter scratchpads, fewer retrieved passages per turn, a small cache for repeat questions. Most small businesses never need them.
For everyone below that — and that’s most small businesses — a $10 monthly AWS Budget alarm catches anything strange before it becomes a surprise on the credit card.
In plain words
The fixed bill is nearly zero. The variable bill is cents per conversation, dominated by tokens. A typical small-business setup runs at coffee-money for the whole month. Set a budget alarm that fits your expected volume and the bill can’t surprise you.
All posts