What the staff policy answerer costs

Key takeaways

Around $2/month at a small team’s volume (roughly 150 questions a month).
Fixed AWS cost is essentially zero. No always-on compute, no NAT Gateway, no API Gateway, no search server.
The biggest variable cost is Bedrock — one small answer call per question.
Keeping the index fresh is cheap: only changed sections get re-embedded.
At ~600 questions/month the bill is around $5. At ~2,400 it’s around $14.

Cost at three question volumes

Fig 6. Monthly cost at three question volumes. Bedrock is the largest slice because the answer call fires once per question, and questions are what grow. Embeddings are smaller, and the fixed cost is a rounding error.

Where the dollars actually go

Bedrock (the biggest slice). Each question that clears the confidence floor triggers one Claude Haiku 4.5 call: a few hundred tokens for the prompt and the pulled sections in, a couple of hundred tokens of answer out. That’s a fraction of a cent per question. Questions that fail the confidence floor or hit an off-limits topic skip the model entirely and cost nothing. At 150 questions a month it’s well under a dollar; at 2,400 it’s the dominant line item but still single-digit dollars. Haiku is the right model here — reading a few short policy sections and writing a plain answer doesn’t need a heavier model, and the cheaper call keeps the per-question cost tiny.
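To make "a fraction of a cent" concrete, here is a rough back-of-envelope sketch. The per-token prices and the token counts are assumptions for illustration (about $1 per million input tokens, $5 per million output tokens, 800 tokens in, 200 out), not figures measured from the deployed system; check them against the current Bedrock pricing page before relying on the result.

```python
# Assumed prices in USD per million tokens. Verify against current Bedrock pricing.
INPUT_PRICE_PER_MTOK = 1.00   # assumption
OUTPUT_PRICE_PER_MTOK = 5.00  # assumption


def answer_call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single answer call."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


def monthly_bedrock_cost(
    questions: int,
    answered_fraction: float = 1.0,  # share of questions that reach the model
    input_tokens: int = 800,         # assumed prompt + pulled sections
    output_tokens: int = 200,        # assumed answer length
) -> float:
    """Monthly answer-call cost; questions that skip the model cost nothing."""
    answered = questions * answered_fraction
    return answered * answer_call_cost(input_tokens, output_tokens)


if __name__ == "__main__":
    for volume in (150, 600, 2400):
        print(volume, round(monthly_bedrock_cost(volume), 2))
```

With these assumed numbers one answer costs about $0.0018, so 150 questions come to roughly $0.27 and 2,400 to roughly $4.30, in line with the ranges above. Lowering answered_fraction models the questions that stop at the confidence floor or an off-limits topic.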

Embeddings (Titan). Two places. One small embedding per question (to turn it into a vector for search) — cents per thousand questions. And the index refresh: re-embedding only the handbook sections that changed. Since a typical handbook changes a few sections a week, the refresh embeddings are negligible. Embedding the whole handbook once on day one is a one-time cost of a few cents.
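One common way to re-embed only what changed is to keep a content hash per section and compare it on each refresh. The sketch below assumes that approach; this post does not say how the indexer detects changes, and the function names and section IDs are hypothetical.

```python
import hashlib


def section_digest(text: str) -> str:
    """Hash a section's text, ignoring whitespace-only differences."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def plan_refresh(
    current: dict[str, str], stored: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Decide which sections need work.

    current: section id -> current text from the handbook
    stored:  section id -> digest recorded at the last refresh
    Returns (ids to embed again, ids to delete from the index).
    """
    to_embed = sorted(
        sid for sid, text in current.items()
        if stored.get(sid) != section_digest(text)
    )
    to_delete = sorted(sid for sid in stored if sid not in current)
    return to_embed, to_delete


if __name__ == "__main__":
    stored = {"pto": section_digest("Twenty days."), "old": section_digest("x")}
    current = {"pto": "Twenty days.", "remote": "Two days a week."}
    print(plan_refresh(current, stored))
```

Only the IDs in the first list would be sent to the embedding model, which is why a week with a few edited sections costs almost nothing regardless of handbook size.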

S3 Vectors. Stores the section fingerprints and answers search queries. You pay for the stored vectors (a small handbook is a few hundred to a few thousand vectors — pennies) and per query (one query per question). No always-on search cluster to pay for, which is the whole reason to use it over a hosted vector database. Pennies a month at these volumes.

Lambda runtime. The intake Lambda, the answerer, the indexer sync, and the gap-report job. All event- or schedule-driven, all small. Even at 2,400 questions a month the Lambda total lands under a dollar.
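As a sanity check on "under a dollar", here is a rough sketch of the Lambda arithmetic. The unit prices are assumed on-demand x86 rates, and the invocation count, duration, and memory size are illustrative guesses, not measurements from this system. The free tier is ignored.

```python
# Assumed on-demand prices. Verify against the current Lambda pricing page.
REQUEST_PRICE = 0.20 / 1_000_000   # USD per invocation (assumption)
GB_SECOND_PRICE = 0.0000166667     # USD per GB-second (assumption)


def lambda_monthly_cost(invocations: int, avg_seconds: float, memory_mb: int) -> float:
    """Request charge plus duration charge for one month."""
    gb_seconds = invocations * avg_seconds * (memory_mb / 1024)
    return invocations * REQUEST_PRICE + gb_seconds * GB_SECOND_PRICE


if __name__ == "__main__":
    # Guess: intake + answerer run once each per question, 3 s at 512 MB.
    questions = 2400
    print(round(lambda_monthly_cost(questions * 2, 3.0, 512), 2))
```

Under those guesses the intake and answerer functions together cost about 12 cents at 2,400 questions. The indexer sync and gap-report job run far less often, so they add little on top.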

DynamoDB on-demand. Two small tables: the question-and-answer log and the refresh audit log. A handful of writes per question and per refresh. Pennies a month.

SES. Inbound for the email lane and outbound for email replies: $0.10 per thousand messages each way. Negligible unless your whole company is on the email lane.

What doesn’t cost money

API Gateway. Replaced by Lambda Function URLs for the Slack endpoints.
NAT Gateway. Nothing is in a VPC. No NAT, no $32/month minimum.
Always-on compute. No EC2, no Fargate. Nothing runs unless a question comes in or a doc changes.
A hosted vector database. S3 Vectors means no search cluster sitting idle and billing by the hour.
Re-reading the whole handbook. The index refresh touches only the sections that changed, so a big handbook isn’t a big bill.

How the cost scales

The bill tracks question volume, not handbook size and not headcount as such. Bedrock and the per-question embedding grow linearly with the number of questions asked; everything else stays small. So a company asking 5,000 questions a month lands around $28, and 10,000 around $55, which is still less than an hour of the HR time those questions would otherwise eat. A bigger handbook barely moves the needle: more sections means slightly more stored vectors and a slightly larger search, both cheap. The thing that grows your bill is your team getting more answers, which is the point.
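For volumes between the ones quoted in this post, a simple linear interpolation gives a ballpark. The sketch below uses only the rounded figures stated here, so treat its output as an estimate, not a quote; the function names are made up for illustration.

```python
# (questions per month, approximate monthly USD) as quoted in this post.
QUOTED_POINTS = [(150, 2.0), (600, 5.0), (2400, 14.0), (5000, 28.0), (10000, 55.0)]


def estimate_monthly_cost(questions: float) -> float:
    """Linear interpolation between quoted points.

    Outside the quoted range, the nearest end segment is extended.
    """
    pairs = list(zip(QUOTED_POINTS, QUOTED_POINTS[1:]))
    for (q0, c0), (q1, c1) in pairs:
        if questions <= q1:
            break  # this segment contains (or precedes) the volume
    slope = (c1 - c0) / (q1 - q0)
    return c0 + slope * (questions - q0)


def marginal_costs() -> list[float]:
    """Implied USD per extra question on each quoted segment."""
    return [
        (c1 - c0) / (q1 - q0)
        for (q0, c0), (q1, c1) in zip(QUOTED_POINTS, QUOTED_POINTS[1:])
    ]


if __name__ == "__main__":
    print(round(estimate_monthly_cost(1200), 2))
    print([round(m, 4) for m in marginal_costs()])
```

For example, 1,200 questions a month interpolates to about $8. The implied marginal cost sits between roughly half a cent and two-thirds of a cent per question on every segment, which is what "grows linearly" looks like in the quoted numbers.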

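Set an AWS Budgets alarm at $20/month so anything unusual (a runaway loop, a misconfigured retry) notifies you before the bill matters. At the volumes in Fig 6 the normal bill stays under that ceiling. If your team is closer to 5,000 questions a month, where the expected bill is already around $28, raise the threshold to match.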
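A minimal sketch of that alarm using the boto3 Budgets client follows. The budget name, account ID, and email address are placeholders, and the email subscriber is an assumption; routing the alert to a pager would typically mean subscribing an SNS topic instead.

```python
def budget_request(account_id: str, email: str, limit_usd: float = 20.0) -> dict:
    """Build the arguments for the Budgets create_budget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "policy-answerer-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": f"{limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": 100.0,  # percent of the limit
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": email}
                ],
            }
        ],
    }


def create_budget(account_id: str, email: str) -> None:
    """Create the budget. Requires boto3 and AWS credentials."""
    import boto3  # imported here so budget_request works without the SDK

    boto3.client("budgets").create_budget(**budget_request(account_id, email))
```

As written, the budget covers the whole account, not only this system. If the account hosts other workloads, scope the budget with a cost filter or tag, or run the answerer in its own account.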

Last post in the series: the engineering reference. Same system, drawn for engineers — service names, Lambda inventory, IAM scopes, the S3 Vectors index config, Bedrock model IDs, the Slack app config, and the DynamoDB schemas.
