Part 6 of 7 · Staff policy answerer series ~3 min read

What the staff policy answerer costs

This is a cheap system to run. There’s no always-on server, no database humming at 3am, and no model call unless somebody actually asks a question. The cost scales with how many questions staff ask, not with how big your handbook is, because the handbook is read once and only re-read when it changes. At typical small-business volume, the bill is a couple of dollars a month and the fixed cost is essentially zero.

Key takeaways

  • Around $2/month at a small team’s volume (roughly 150 questions a month).
  • Fixed AWS cost is essentially zero. No always-on compute, no NAT Gateway, no API Gateway, no search server.
  • The biggest variable cost is Bedrock — one small answer call per question.
  • Keeping the index fresh is cheap: only changed sections get re-embedded.
  • At ~600 questions/month the bill is around $5. At ~2,400 it’s around $14.

Cost at three question volumes

[Figure: vertical stacked-bar chart of monthly cost in US dollars (axis $0 to $20) at three question volumes: about 150/month (~$2), 600/month (~$5), and 2,400/month (~$14). Each bar is split into Bedrock (one answer call per question, Claude Haiku 4.5), Embeddings (Titan, one per question plus index refresh), Secrets Manager + Budgets (small fixed amounts), and everything else (Lambda runtime, DynamoDB on-demand, S3 Vectors storage and queries, SES). Cost tracks the number of questions asked; a bigger handbook costs almost nothing extra.]
Fig 6. Monthly cost at three question volumes. Bedrock is the largest slice because the answer call fires once per question, and questions are what grow. Embeddings are smaller, and the fixed cost is a rounding error.

Where the dollars actually go

Bedrock (the biggest slice). Each question that clears the confidence floor triggers one Claude Haiku 4.5 call: a few hundred tokens for the prompt and the pulled sections in, a couple of hundred tokens of answer out. That’s a fraction of a cent per question. Questions that fail the confidence floor or hit an off-limits topic skip the model entirely and cost nothing. At 150 questions a month it’s well under a dollar; at 2,400 it’s the dominant line item but still single-digit dollars. Haiku is the right model here — reading a few short policy sections and writing a plain answer doesn’t need a heavier model, and the cheaper call keeps the per-question cost tiny.
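The per-question arithmetic above can be sketched in a few lines. This is a rough estimate, not the series' actual cost model: the per-token prices are assumed list prices for Claude Haiku 4.5 (check the current Bedrock pricing page), and the token counts are hypothetical stand-ins for "a few hundred in, a couple of hundred out".

```python
# Assumed list prices in USD per million tokens; verify against the
# current Bedrock pricing page before relying on these numbers.
INPUT_PRICE_PER_MTOK = 1.00
OUTPUT_PRICE_PER_MTOK = 5.00


def cost_per_answer(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one answer call, given its token counts."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


def monthly_bedrock_cost(
    questions: int,
    answered_fraction: float = 1.0,
    input_tokens: int = 600,   # hypothetical: prompt plus pulled sections
    output_tokens: int = 200,  # hypothetical: a short plain-language answer
) -> float:
    """Monthly answer-call cost; questions that skip the model cost nothing."""
    answered = questions * answered_fraction
    return answered * cost_per_answer(input_tokens, output_tokens)


if __name__ == "__main__":
    for volume in (150, 600, 2400):
        print(f"{volume:>5} questions/month: ${monthly_bedrock_cost(volume):.2f}")
```

With these assumed numbers one answer costs about $0.0016, which works out to roughly $0.24 a month at 150 questions and $3.84 at 2,400. Lowering `answered_fraction` models the questions that fail the confidence floor or hit an off-limits topic and never reach the model.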

Embeddings (Titan). Two places. One small embedding per question (to turn it into a vector for search) — cents per thousand questions. And the index refresh: re-embedding only the handbook sections that changed. Since a typical handbook changes a few sections a week, the refresh embeddings are negligible. Embedding the whole handbook once on day one is a one-time cost of a few cents.
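One common way to re-embed only what changed is to keep a content hash per section and compare on each refresh. The sketch below assumes that approach; the post does not say how the indexer detects changes, and the function names and the `stored_hashes` mapping are hypothetical.

```python
import hashlib


def section_hash(text: str) -> str:
    """Stable fingerprint of a section's text, ignoring whitespace changes."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def plan_refresh(sections: dict[str, str], stored_hashes: dict[str, str]):
    """Return (to_embed, to_delete) section IDs for one refresh run.

    sections:      section ID -> current text from the handbook
    stored_hashes: section ID -> hash recorded at the last refresh
    """
    current = {sid: section_hash(text) for sid, text in sections.items()}
    # New or edited sections need a fresh embedding.
    to_embed = sorted(
        sid for sid, digest in current.items() if stored_hashes.get(sid) != digest
    )
    # Sections that vanished from the handbook should leave the index.
    to_delete = sorted(sid for sid in stored_hashes if sid not in current)
    return to_embed, to_delete
```

Only the IDs in `to_embed` would be sent to the embedding model, so the refresh cost follows the number of edited sections rather than the size of the handbook.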

S3 Vectors. Stores the section fingerprints and answers search queries. You pay for the stored vectors (a small handbook is a few hundred to a few thousand vectors — pennies) and per query (one query per question). No always-on search cluster to pay for, which is the whole reason to use it over a hosted vector database. Pennies a month at these volumes.

Lambda runtime. The intake Lambda, the answerer, the indexer sync, and the gap-report job. All event- or schedule-driven, all small. Even at 2,400 questions a month the Lambda total lands under a dollar.

DynamoDB on-demand. Two small tables: the question-and-answer log and the refresh audit log. A handful of writes per question and per refresh. Pennies a month.

SES. Inbound for the email lane and outbound for email replies: $0.10 per thousand messages each way. Negligible unless your whole company is on the email lane.

What doesn’t cost money

  • API Gateway. Replaced by Lambda Function URLs for the Slack endpoints.
  • NAT Gateway. Nothing is in a VPC. No NAT, no $32/month minimum.
  • Always-on compute. No EC2, no Fargate. Nothing runs unless a question comes in or a doc changes.
  • A hosted vector database. S3 Vectors means no search cluster sitting idle and billing by the hour.
  • Re-reading the whole handbook. The index refresh touches only the sections that changed, so a big handbook isn’t a big bill.

How the cost scales

The bill tracks questions, not handbook size or headcount directly. Bedrock and the per-question embedding grow linearly with how many questions get asked; everything else stays small. So a company asking 5,000 questions a month lands around $28, and 10,000 around $55 — still less than an hour of the HR time those questions would otherwise eat. A bigger handbook barely moves the needle: more sections means slightly more stored vectors and a slightly larger search, both cheap. The thing that grows your bill is your team getting more answers, which is the point.
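As a sanity check on the linear-scaling claim, a straight line can be fitted through the three volumes quoted in this post and extended to higher volumes. This is a back-of-envelope fit to rounded figures, not the pricing model behind the chart; the function names are illustrative.

```python
# The three (questions per month, approximate monthly USD) points quoted in this post.
QUOTED_POINTS = [(150, 2.0), (600, 5.0), (2400, 14.0)]


def fit_line(points):
    """Ordinary least-squares fit; returns (fixed_cost, cost_per_question)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_monthly_cost(questions, points=QUOTED_POINTS):
    """Estimated monthly bill in USD at a given question volume."""
    fixed, per_question = fit_line(points)
    return fixed + per_question * questions


if __name__ == "__main__":
    fixed, per_question = fit_line(QUOTED_POINTS)
    print(f"fixed ~${fixed:.2f}/month, ~${per_question:.4f} per question")
    for volume in (5000, 10000):
        print(f"{volume:>6} questions/month: ~${estimate_monthly_cost(volume):.0f}")
```

The fit comes out at roughly half a cent per question on top of a small fixed amount, and its estimates for 5,000 and 10,000 questions land close to the figures quoted above.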

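Set an AWS Budgets alarm at $20/month so anything unusual (a runaway loop, a misconfigured retry) alerts you before the bill matters. At the volumes charted above, up to about 2,400 questions a month, the bill stays under that ceiling; if you expect 5,000 or more, raise the threshold to match.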
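A minimal sketch of that $20/month alarm using the AWS Budgets `CreateBudget` API through boto3. The budget name, account ID, and email address are placeholders, and the choice to notify by email when actual spend passes 100% of the limit is an assumption, not something the series specifies.

```python
def build_budget_request(account_id: str, email: str, limit_usd: int = 20) -> dict:
    """Keyword arguments for the AWS Budgets CreateBudget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "staff-policy-answerer-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    # Fire when actual (not forecast) spend exceeds the limit.
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": 100.0,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
        ],
    }


def create_budget(account_id: str, email: str) -> None:
    """Create the budget; needs boto3 and AWS credentials with Budgets access."""
    import boto3  # imported lazily so the request builder works without the SDK

    boto3.client("budgets").create_budget(**build_budget_request(account_id, email))
```

As written, the budget covers all spend in the account. If the account runs other workloads, a cost filter or a dedicated account would be needed to isolate this system, which is beyond this sketch.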

Last post in the series: the engineering reference. Same system, drawn for engineers — service names, Lambda inventory, IAM scopes, the S3 Vectors index config, Bedrock model IDs, the Slack app config, and the DynamoDB schemas.
