What the FAQ builder costs
The builder is one of the cheaper systems in this whole series. It embeds each incoming question, runs a once-a-day grouping pass that calls no model, and only asks the drafting model to write an answer for the handful of clusters that earn one. At typical SMB volume the bill is a few dollars a month, and the fixed cost is essentially zero. The interesting twist is which line dominates: it’s the embeddings, not the drafting model.
Key takeaways
- Around $3/month at typical SMB volume (around 800 support messages a month).
- Fixed AWS cost is essentially zero. No always-on compute, no NAT Gateway, no API Gateway.
- Embeddings are the biggest variable line — one per incoming question.
- The drafting model is a small sliver — it fires only on candidate clusters, a few a week.
- At light volume the bill is around $1 a month. At heavy volume (5,000 messages a month) it’s around $12.
Cost at three volumes
Where the dollars actually go
Bedrock embeddings (the bulk). Every incoming question is embedded once with Titan Text Embeddings V2 so the grouper can find repeats. Titan embeddings are priced per token and a cleaned question is short, so each embed is a tiny fraction of a cent — but it’s the one cost that scales directly with how many questions you get. At 800 messages a month it’s the largest single line, and it’s still well under a couple of dollars. Your help docs are embedded too, but that’s a one-time cost paid up front and a small top-up whenever a doc changes.
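For readers who want to see what one embed looks like, here is a minimal sketch of the per-question call. The function name `embed_question` and the 512-dimension default are assumptions for illustration, not the builder's actual code. The `client` argument is expected to behave like `boto3.client("bedrock-runtime")`, and the request fields follow the Titan Text Embeddings V2 body format.

```python
import json

# Bedrock model ID for Titan Text Embeddings V2.
MODEL_ID = "amazon.titan-embed-text-v2:0"


def embed_question(client, text, dimensions=512):
    """Embed one cleaned question; return (vector, input_token_count).

    `client` should behave like boto3.client("bedrock-runtime").
    The 512-dimension default is an assumption, not the post's setting.
    """
    body = json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": True}
    )
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=body,
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream; read it once and parse the JSON.
    payload = json.loads(response["body"].read())
    return payload["embedding"], payload["inputTextTokenCount"]
```

The returned token count is what the per-token price applies to, so logging it per question is one way to check the embeddings line against your own traffic.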
Bedrock drafting. Claude Haiku 4.5 only runs on candidate clusters — questions that crossed the repeat threshold and aren’t already covered. In steady state that’s a few a week, not a few a day, because most questions either join a covered cluster or sit below the threshold. Each draft is a few thousand input tokens (the retrieved passages) and a few hundred output tokens (a short answer), so a fraction of a cent per draft. The drafting line stays a sliver at every volume.
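The "fraction of a cent per draft" figure can be checked with a little arithmetic. The sketch below assumes list prices of $1 per million input tokens and $5 per million output tokens for the drafting model, and takes 3,000 input tokens, 300 output tokens, and three drafts a week as stand-ins for "a few thousand", "a few hundred", and "a few a week". All of these numbers are assumptions; check current Bedrock pricing before relying on them.

```python
# Assumed list prices in USD per million tokens (verify against current pricing).
INPUT_PRICE_PER_MTOK = 1.00
OUTPUT_PRICE_PER_MTOK = 5.00


def draft_cost(input_tokens, output_tokens):
    """Return the model cost in USD for a single draft."""
    total = (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    )
    return total / 1_000_000


# Assumed sizes: 3,000 tokens of retrieved passages in, a 300-token answer out.
per_draft = draft_cost(3_000, 300)

# Assumed rate: three candidate clusters a week, averaged over a year.
drafts_per_month = 3 * 52 / 12
monthly = per_draft * drafts_per_month

print(f"per draft: ${per_draft:.4f}, per month: ${monthly:.2f}")
```

Under these assumptions each draft costs under half a cent and the monthly drafting line is a few cents, which is consistent with the "sliver" described above.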
S3 Vectors. The question vectors and the help-doc vectors live here. Storage is cheap and the per-query cost on the daily grouping pass and the drafter’s retrieval is small. A few cents a month at SMB volume.
Lambda runtime. The intake fires per incoming question, the grouper runs once a day, the drafter runs per candidate, the ack-handler runs per reviewer action, and the drive-sync Lambda runs every fifteen minutes. All short. The Lambda total lands under a dollar at all three volumes.
DynamoDB on-demand. Three small tables: fb-clusters, fb-proposals, fb-audit. Reads and writes are dominated by the daily pass and the approval actions. Pennies a month.
SES + S3. SES inbound for the support-inbox lane costs $0.10 per thousand received messages, so it stays under ten cents a month even if all 800 messages arrive by email. S3 holds the mirrored docs and raw MIME: a few hundred KB, effectively free.
What doesn’t cost money
- API Gateway. Replaced by Lambda Function URLs for the approve and edit endpoints.
- NAT Gateway. Nothing is in a VPC. No NAT, no $32/month minimum.
- Always-on compute. No EC2, no Fargate. The builder only runs when there’s a question to embed or a pass to make.
- A model on the grouping pass. Grouping is plain Python over the vectors. The model only drafts answers.
- A separate vector database. S3 Vectors holds the vectors; there’s no always-on vector store to pay for.
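To make "plain Python over the vectors" concrete, here is one way a model-free grouping pass could look: a greedy pass that assigns each question vector to the most similar existing cluster, or starts a new one. This is a sketch, not the builder's actual grouper. The function names, the 0.85 similarity threshold, and the three-question repeat threshold are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_questions(vectors, threshold=0.85):
    """Greedy grouping: join the most similar cluster at or above the
    threshold, otherwise start a new cluster."""
    clusters = []  # each cluster: {"members": [indices], "centroid": [floats]}
    for index, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"members": [index], "centroid": list(vec)})
        else:
            # Update the centroid as a running mean of the member vectors.
            n = len(best["members"])
            best["centroid"] = [
                (c * n + x) / (n + 1) for c, x in zip(best["centroid"], vec)
            ]
            best["members"].append(index)
    return clusters


def candidate_clusters(clusters, min_repeats=3):
    """Keep only clusters that crossed the (hypothetical) repeat threshold."""
    return [c for c in clusters if len(c["members"]) >= min_repeats]
```

A pass like this costs only Lambda time, which is why the grouping step never shows up as a model line on the bill.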
How the cost scales
Embeddings grow linearly with message count, because every question is embedded once. The drafting line grows much more slowly, because the number of new questions worth answering tapers off as the FAQ matures: the more you cover, the more incoming questions land on “already covered” and skip the drafter entirely. The net effect is a near-linear bill: around $22 at 10,000 messages a month and around $42 at 20,000, still embeddings-dominated. Past those volumes you’d batch the embedding calls and maybe sample rather than embed every duplicate, but those are optimizations for high-traffic support desks, not redesigns.
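The two high-volume figures are enough for a rough back-of-envelope estimator. The sketch below fits a straight line through the post's own numbers ($22 at 10,000 messages, $42 at 20,000). It is an interpolation of those quoted figures, not a calculation from AWS price lists, and the function name is hypothetical.

```python
def estimated_monthly_bill(messages):
    """Estimate the monthly bill in USD from a straight line through the
    post's two high-volume figures: $22 at 10,000 and $42 at 20,000."""
    slope = (42 - 22) / (20_000 - 10_000)  # dollars per message
    intercept = 22 - slope * 10_000        # dollars at zero messages
    return intercept + slope * messages


for volume in (800, 5_000, 10_000, 20_000):
    print(f"{volume:>6} messages: about ${estimated_monthly_bill(volume):.2f}")
```

The line works out to roughly $2 per thousand messages plus a couple of dollars. It reproduces the $12 quoted for 5,000 messages and lands slightly above the rounded "around $3" at 800, so treat it as a guide for the upper range rather than for light volume.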
Set an AWS Budgets alarm at $15/month so anything unusual pages you before the bill matters. The builder’s normal-volume bill stays well under that ceiling.
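One way to set that alarm from code is the AWS Budgets `create_budget` API. The helper below only builds the request, so it can be inspected without touching an account. The budget name, the 80% alert threshold, and the email subscriber are assumptions for illustration; the post specifies only the $15 ceiling.

```python
def budget_request(account_id, email, limit_usd=15, alert_percent=80.0):
    """Build keyword arguments for the AWS Budgets create_budget call.

    The budget name, alert percentage, and email subscriber are
    illustrative assumptions; only the $15 limit comes from the post.
    """
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "faq-builder-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": alert_percent,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": email}
                ],
            }
        ],
    }
```

With boto3 installed and credentials configured, the request could be sent with `boto3.client("budgets").create_budget(**budget_request("123456789012", "ops@example.com"))`, where the account ID and address are placeholders. A budget built this way covers the whole account unless you add cost filters, so a shared account may need a tag-scoped budget instead.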
Last post in the series: the engineering reference. Same system, drawn for engineers — service names, Lambda inventory, IAM scopes, DynamoDB schemas, the S3 Vectors config, the SES rule set, and the EventBridge Scheduler config.