Series · 7 parts · Published June 9, 2026

Transcription archive

A serverless archive that keeps every call and meeting recording your business makes, turns each one into text, files it by date, people, and topic, and lets you ask a plain-language question — “find where we discussed the Acme contract” — and get the exact moment back with a quote and a link to play it. Staff still own every decision; the archive just makes the record findable. Access is controlled and every search is logged. Seven posts on the same system — one diagram at a time — with an engineering reference at the end.

  1. A transcription archive on AWS for a few dollars a month

    The whole system on one page — an intake, a filing-and-indexing piece, and a search piece, plus the way every recording becomes a findable, quotable record.

  2. How a recording gets filed

    Three lanes feed the archive — a watched Drive folder, a forward-to-an-address lane, and a scheduled pull from your meeting tool — then Transcribe turns speech to text and a plain-Python step files it by date, people, and topic.

  3. How a transcript becomes searchable

    Each transcript is split into short timed chunks, every chunk is turned into a vector by Titan Text Embeddings V2, and the vectors land in S3 Vectors so search can match by meaning, not just exact words.

  4. How a question finds the moment

    A plain-language question becomes a vector, matches the closest chunks, filters them by who’s allowed to see them, and Haiku 4.5 writes a short answer with a direct quote linked to the exact second in the audio.

  5. How the archive stays private

    Access tags per recording, search results filtered before the answer is written, sensitive recordings kept out of open search, and a full search log of who asked what and when.

  6. What the transcription archive costs

    A few dollars a month at SMB volume. Transcription runs once per recording, embeddings and the answer model fire on a small fraction of the work, and there’s no always-on compute to pay for.

  7. Engineering reference: the transcription archive architecture

    Same system, drawn purely for engineers. Service names, resource identifiers, region, Bedrock model IDs, the Transcribe job config, Lambda inventory, IAM scopes, the S3 Vectors index, and the DynamoDB schemas.

What is a transcription archive?
A small serverless system that keeps every call and meeting recording your business makes as a searchable, organized record. Recordings come in; it transcribes each one, files it with the date, the people, and the topic, and makes the whole back-catalogue searchable in plain language. Ask “find where we discussed the Acme contract” and it returns the exact moment with a quote and a link to play it. Staff still own every decision; the archive just makes the record findable.
How much does it cost to run?
About $4/month at typical small-business volume (around 60 recordings a month, roughly 40 hours of audio). The fixed cost is essentially zero. The variable cost is dominated by transcription, which runs once per recording; embeddings and the answer model fire on a small fraction of the work. At 250 recordings a month the bill lands around $14.
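As a rough check on how the bill scales, the two figures quoted above can be divided out per recording. This is only arithmetic on the numbers in this answer, not a price list; the scenario names are labels chosen here for illustration.

```python
# Figures quoted in the answer above; the scenario names are illustrative labels.
SCENARIOS = {
    "typical": {"recordings": 60, "monthly_usd": 4.0},
    "busy": {"recordings": 250, "monthly_usd": 14.0},
}


def cost_per_recording(scenario: dict) -> float:
    """Average monthly bill divided by the recordings filed that month."""
    return scenario["monthly_usd"] / scenario["recordings"]


per_recording = {
    name: round(cost_per_recording(s), 3) for name, s in SCENARIOS.items()
}
print(per_recording)
```

Both volumes work out to roughly six or seven cents per recording, which is what you would expect when the bill is mostly per-recording work with almost no fixed cost.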
Which AWS services does it use?
Lambda (Python 3.14, arm64) with Function URLs for the search box, Amazon Transcribe for speech-to-text, S3 (with versioning) for audio and transcripts, S3 Vectors for the searchable index, DynamoDB on-demand for the catalogue and the search log, EventBridge for the pipeline steps, SES outbound, Secrets Manager, CloudWatch Logs (7-day retention), AWS Budgets, and Bedrock (Titan Text Embeddings V2 to index transcripts, Claude Haiku 4.5 to write the short answer). No API Gateway, no NAT Gateway, no always-on compute.
Where do recordings come from?
Three lanes. Drop an audio or video file into a watched Drive folder. Forward a meeting recording to a dedicated address. Or let a small connector pull finished recordings from your meeting tool’s cloud on a schedule. All three land the file in S3, which starts the same filing pipeline. The Drive folder stays the place a person browses; S3 is where the system works.
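Once a file lands in S3, the first pipeline step has to describe the job to Amazon Transcribe. The sketch below builds the keyword arguments for boto3's `start_transcription_job` without calling AWS, so it can run anywhere. The bucket names, the `transcripts/` prefix, the speaker limit, and the choice of automatic language identification are assumptions made for illustration; the series' engineering reference holds the actual job config.

```python
import re


def transcribe_request(bucket: str, key: str, out_bucket: str) -> dict:
    """Build kwargs for boto3's transcribe.start_transcription_job (assumed settings)."""
    # Transcribe job names allow only letters, digits, '.', '_' and '-' (max 200 chars).
    job_name = re.sub(r"[^0-9A-Za-z._-]", "-", key)[:200]
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "IdentifyLanguage": True,  # assumption: let Transcribe detect the language
        "OutputBucketName": out_bucket,
        "OutputKey": f"transcripts/{job_name}.json",  # hypothetical prefix
        # Speaker labels give the filing step a list of distinct voices.
        "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": 6},
    }


# In a deployed Lambda the request would be sent with:
#   boto3.client("transcribe").start_transcription_job(**request)
request = transcribe_request(
    "tx-audio", "inbox/2026-06-09 acme call.mp4", "tx-transcripts"
)
print(request["TranscriptionJobName"])
```

Because all three lanes end with the same kind of S3 object, one request builder like this can serve Drive drops, forwarded mail, and scheduled pulls alike.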
Does the archive use AI?
In three narrow spots. Amazon Transcribe turns speech into text. Titan Text Embeddings V2 turns each chunk of a transcript into a vector (a list of numbers that captures meaning) so search can match by meaning, not just exact words. Claude Haiku 4.5 writes the one-line answer and picks the quote when you search. The filing step (date, people, topic tags) is plain Python. Most of the system is deterministic by design.
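To make "plain Python" concrete, here is a minimal sketch of a deterministic filing step. It assumes the recording date appears in the filename as `YYYY-MM-DD` and that topics come from a small keyword map; the `TOPIC_KEYWORDS` table, the function name, and the output shape are all hypothetical, not the archive's actual schema.

```python
import re
from datetime import date

# Hypothetical keyword map: topic tag -> words that suggest it.
TOPIC_KEYWORDS = {
    "contracts": {"contract", "renewal", "terms"},
    "billing": {"invoice", "payment", "refund"},
}


def file_transcript(filename: str, speakers: list[str], text: str) -> dict:
    """Derive date, people, and topic tags with no model call."""
    # Assumption: the filename carries the recording date as YYYY-MM-DD.
    match = re.search(r"(\d{4})-(\d{2})-(\d{2})", filename)
    recorded = date(*map(int, match.groups())).isoformat() if match else None
    # A topic applies when any of its keywords appears in the transcript.
    words = set(re.findall(r"[a-z]+", text.lower()))
    topics = sorted(tag for tag, keys in TOPIC_KEYWORDS.items() if words & keys)
    return {"date": recorded, "people": sorted(set(speakers)), "topics": topics}


filed = file_transcript(
    "2026-06-09-acme-call.mp4",
    ["Priya", "Dan", "Priya"],
    "We reviewed the Acme contract and the renewal terms.",
)
print(filed)
```

The same input always produces the same tags, which is the point of keeping this step out of the model's hands.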
How does search return the exact moment?
Every transcript is split into short timed chunks, each carrying its start time in the recording. Your question is turned into the same kind of vector and matched against the index; the closest chunks come back with their timestamps. Haiku 4.5 reads the top few chunks and writes a short answer with a direct quote, and the result links straight to that second in the audio. You get the moment, not a pile of files to scrub through.
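The chunk-and-match idea can be sketched in a few lines. This version uses tiny hand-written vectors in place of real Titan embeddings and an in-memory list in place of the S3 Vectors index, so it shows the shape of the technique rather than the production code. The chunk size, the dictionary fields, and the example URL are assumptions; the `#t=` suffix is the standard media-fragment way to point a player at a second offset.

```python
import math


def chunk_words(words: list[tuple[float, str]], max_words: int = 40) -> list[dict]:
    """Group (start_seconds, word) pairs into chunks that keep their start time."""
    chunks = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        chunks.append({"start": group[0][0], "text": " ".join(w for _, w in group)})
    return chunks


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec: list[float], indexed: list[dict], k: int = 3) -> list[dict]:
    """Return the k chunks whose vectors sit closest to the question's vector."""
    ranked = sorted(indexed, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return ranked[:k]


def playback_link(base_url: str, start: float) -> str:
    """Link to the whole second at which the chunk begins."""
    return f"{base_url}#t={int(start)}"


words = [(0.0, "hello"), (0.5, "there"), (61.2, "acme"), (61.8, "contract")]
chunks = chunk_words(words, max_words=2)
# Toy 2-d vectors stand in for real embeddings.
chunks[0]["vector"] = [1.0, 0.0]
chunks[1]["vector"] = [0.0, 1.0]
best = top_matches([0.1, 0.9], chunks, k=1)[0]
print(best["text"], playback_link("https://archive.example/rec/42", best["start"]))
```

Carrying the start time on every chunk is what lets the final answer link to a second in the audio instead of to a whole file.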
Who can see what, and is it logged?
Each recording carries an access tag — a team, a person, or “everyone.” The search box only returns chunks the asker is allowed to see, filtered before the answer is ever written. Every search is recorded in the tx-searchlog DynamoDB table with who asked, what they asked, which recordings were returned, and when — so the trail is auditable for years. Sensitive recordings can be marked so they never appear in open search at all.
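A minimal sketch of the filter-then-log order described above, assuming access tags are written as `everyone`, `user:<name>`, or `team:<name>` and that sensitive recordings carry a boolean flag. The tag format, the field names, and the log item shape are hypothetical; only the table name tx-searchlog comes from this page.

```python
from datetime import datetime, timezone


def allowed(chunk: dict, user: str, teams: set[str]) -> bool:
    """Visible if tagged for everyone, for the asker, or for one of their teams."""
    if chunk.get("sensitive"):
        return False  # sensitive recordings never appear in open search
    permitted = {"everyone", f"user:{user}"} | {f"team:{t}" for t in teams}
    return chunk["access"] in permitted


def filter_chunks(chunks: list[dict], user: str, teams: set[str]) -> list[dict]:
    """Drop disallowed chunks before anything reaches the answer model."""
    return [c for c in chunks if allowed(c, user, teams)]


def search_log_item(user: str, question: str, visible: list[dict], now=None) -> dict:
    """Build one record for the search log (hypothetical attribute names)."""
    now = now or datetime.now(timezone.utc)
    return {
        "asked_by": user,
        "asked_at": now.isoformat(),
        "question": question,
        "recordings": sorted({c["recording_id"] for c in visible}),
    }


matches = [
    {"recording_id": "r1", "access": "everyone"},
    {"recording_id": "r2", "access": "team:finance"},
    {"recording_id": "r3", "access": "team:sales"},
    {"recording_id": "r4", "access": "everyone", "sensitive": True},
]
visible = filter_chunks(matches, user="dan", teams={"sales"})
item = search_log_item(
    "dan", "Acme contract?", visible,
    now=datetime(2026, 6, 9, 12, 0, tzinfo=timezone.utc),
)
print(item)
```

Filtering before the answer is written means the model never sees text the asker is not allowed to read, so it cannot quote it by accident.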