Part 6 of 7 · Transcription archive series · ~3 min read

What the transcription archive costs

The archive does its expensive work once per recording, not once per search. Each recording is transcribed and indexed when it arrives; after that, searching it is almost free. So the bill tracks how many recordings come in, not how often people ask questions. Transcription is the biggest line by a wide margin. Embeddings, the answer model, and the vector store are small slivers. At typical small-business (SMB) volume the whole thing is a few dollars a month, with essentially zero fixed cost.

Key takeaways

  • Around $4/month at typical SMB volume (about 60 recordings, ~40 hours of audio).
  • Fixed AWS cost is essentially zero. No always-on compute, no NAT Gateway, no API Gateway.
  • Transcription dominates the bill — it runs once per recording and scales with hours of audio.
  • Embeddings, the answer model, and S3 Vectors are small; searching is cheap because indexing already happened.
  • At 250 recordings a month the bill is around $14. At 350 it’s around $19.

Cost at three volumes

[Figure: vertical stacked-bar chart of monthly cost in US dollars at three recording volumes: about 60 recordings (~$4), about 250 recordings (~$14), and about 350 recordings (~$19). Each bar is split into four sections: Transcribe (speech to text, once per recording), Bedrock (Titan embeddings at index time plus Haiku 4.5 on searches), S3 Vectors (the searchable index), and everything else (Lambda runtime, DynamoDB on-demand, S3 storage, EventBridge, SES, and CloudWatch).]
Fig 6. Monthly cost at three recording volumes. Transcribe dominates because it runs once per recording and tracks hours of audio. Bedrock, S3 Vectors, and the everything-else bucket stay small — embeddings run once and the answer model fires only on real searches.

Where the dollars actually go

Amazon Transcribe (the bulk). This is the big line, and it’s priced per minute of audio. A typical SMB that records about 40 hours of calls and meetings a month pays a few dollars for it. Transcribe runs exactly once per recording, never again no matter how many times the recording is searched, so the cost tracks how much audio comes in, not how busy the search box is. The standard tier is fine for most calls; the cheaper batch tier suits the meeting-tool pull lane where there’s no rush.

Bedrock (embeddings + answers). Two call sites. Titan Text Embeddings V2 runs once per chunk at index time: a few dozen chunks per recording at a fraction of a cent each, so a cent or two per recording. Claude Haiku 4.5 fires only when somebody actually searches: a few thousand input tokens (the retrieved chunks) and a couple hundred output tokens (the short answer), so a fraction of a cent per search. Even a busy team searching dozens of times a day keeps the answer-model cost under a dollar a month.
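The per-search arithmetic above can be sketched roughly as follows. The function names and the 30-day month are illustrative assumptions, and the per-token rates are deliberately left as parameters: take them from the current Bedrock pricing page for your region rather than from this sketch.

```python
def answer_cost_per_search(input_tokens, output_tokens,
                           usd_per_million_input, usd_per_million_output):
    """Return the answer-model cost of a single search, in US dollars.

    input_tokens:  tokens sent to the model (mostly the retrieved chunks)
    output_tokens: tokens in the short answer that comes back
    The two rates are prices per one million tokens (assumed inputs).
    """
    input_cost = input_tokens * usd_per_million_input / 1_000_000
    output_cost = output_tokens * usd_per_million_output / 1_000_000
    return input_cost + output_cost


def monthly_answer_cost(searches_per_day, cost_per_search, days=30):
    """Scale a per-search cost up to one month of searches."""
    return searches_per_day * days * cost_per_search
```

Because the answer model is the only component that scales with search traffic, these two numbers (tokens per search and searches per day) are the ones worth re-checking if the team starts searching far more than expected.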

S3 Vectors. The searchable index. You pay for the vectors you store and the searches you run, with no always-on index server. A few thousand vectors at SMB scale is cents a month; the per-search cost is tiny. This is the part that would be expensive with an always-running search cluster — and isn’t, because it’s pay-per-use.

Lambda runtime. Every step is a short Lambda: the sync lanes, the filing step, the indexer, the search handler. None run long, none run often. Under a dollar a month at all three volumes.

DynamoDB on-demand. Three small tables: tx-catalogue, tx-searchlog, and the access records. A read or two per search, a write per recording and per query. Pennies a month at any of these volumes.

S3 storage + SES. The audio, the transcripts, and the raw forwarded emails. Audio is the heaviest, but a month of SMB recordings is a few gigabytes — cents — and old audio can move to a cheaper storage class. SES inbound for the forwarding lane is $0.10 per thousand messages: negligible.
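Moving old audio to a cheaper storage class is usually done with an S3 lifecycle rule. The sketch below builds one for boto3's `put_bucket_lifecycle_configuration`; the `audio/` prefix, the 90-day threshold, the `STANDARD_IA` target class, and the rule ID are all assumptions for illustration, not values taken from the archive's actual configuration.

```python
def audio_lifecycle_rules(prefix="audio/", days_before_transition=90,
                          storage_class="STANDARD_IA"):
    """Build a lifecycle configuration that moves old audio to a cheaper class.

    All defaults are hypothetical; adjust them to the real bucket layout.
    """
    return {
        "Rules": [
            {
                "ID": "archive-old-audio",       # hypothetical rule name
                "Filter": {"Prefix": prefix},    # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_before_transition,
                     "StorageClass": storage_class}
                ],
            }
        ]
    }


def apply_audio_lifecycle(bucket_name):
    """Apply the rule to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the rule builder has no AWS dependency

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=audio_lifecycle_rules(),
    )
```

Transcripts are small and are what re-indexing reads, so a rule like this would normally target only the audio objects.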

What doesn’t cost money

  • API Gateway. Replaced by a Lambda Function URL for the search box.
  • NAT Gateway. Nothing is in a VPC. No NAT, no $32/month minimum.
  • Always-on compute. No EC2, no Fargate, no running search cluster. Everything is per-use.
  • Re-transcription. Transcripts are kept, so re-indexing on a model upgrade never re-runs the expensive speech-to-text step.
  • Re-embedding on every search. The recording embeddings are computed at index time; a search is one cheap query embedding plus one short Haiku call.

How the cost scales

Transcribe and embeddings grow with how much audio you bring in, because each recording is processed once. The answer model grows with how often people search — a separate, smaller dial. Storage grows slowly with the back-catalogue. So the bill at 1,000 recordings a month is around $50, dominated by transcription; at 2,500 it’s around $120. Past those volumes you’d look at the batch transcription tier and lifecycle rules to move old audio to cold storage, but those are tunings, not redesigns.
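To see how close to linear the scaling is, the monthly totals quoted in this post can be turned into a per-recording cost and a rough interpolator. This is only a sketch over those five approximate figures; the function names are made up for illustration, and it is not a pricing model.

```python
# Monthly bills quoted in this post: (recordings per month, approximate USD).
QUOTED_BILLS = [(60, 4.0), (250, 14.0), (350, 19.0), (1000, 50.0), (2500, 120.0)]


def cost_per_recording():
    """Return the implied cost per recording at each quoted volume."""
    return {n: round(usd / n, 3) for n, usd in QUOTED_BILLS}


def estimate_bill(recordings):
    """Linearly interpolate between the quoted points (no extrapolation)."""
    lowest, highest = QUOTED_BILLS[0][0], QUOTED_BILLS[-1][0]
    if not lowest <= recordings <= highest:
        raise ValueError(f"only defined between {lowest} and {highest} recordings")
    for (n0, usd0), (n1, usd1) in zip(QUOTED_BILLS, QUOTED_BILLS[1:]):
        if n0 <= recordings <= n1:
            fraction = (recordings - n0) / (n1 - n0)
            return usd0 + fraction * (usd1 - usd0)


print(cost_per_recording())
print(estimate_bill(300))
```

On the quoted figures, the implied cost falls from roughly 6.7 cents per recording at 60 recordings to roughly 4.8 cents at 2,500, which is what you would expect from a mostly per-recording bill with a small amount of overhead spread over more volume.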

Set an AWS Budgets alarm at $25/month so anything unusual pages you before the bill matters. The archive’s normal-volume bill stays well under that ceiling.
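One way to set that alarm is through the AWS Budgets API. The sketch below builds the request for boto3's `budgets` client `create_budget` call; the budget name, the email subscriber, and the choice to alert at 100% of actual spend are assumptions for illustration. An email notification is not a page, so an SNS subscriber may fit better if you want it routed to an on-call tool.

```python
def budget_request(account_id, email, limit_usd=25, alert_percent=100.0):
    """Build keyword arguments for the AWS Budgets create_budget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "transcription-archive-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",          # alert on real spend
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": alert_percent,            # percent of the limit
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": email}
                ],
            }
        ],
    }


def create_archive_budget(account_id, email):
    """Create the budget (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the request builder has no AWS dependency

    boto3.client("budgets").create_budget(**budget_request(account_id, email))
```

A second notification at a lower threshold, or one of type `FORECASTED`, would warn earlier in the month at the cost of more noise.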

Last post in the series: the engineering reference. Same system, drawn for engineers — service names, the Transcribe job config, Lambda inventory, IAM scopes, the S3 Vectors index, Bedrock model IDs, and the DynamoDB schemas.
