How the archive stays private

Key takeaways

Three controls per recording: tag access (who may see it), lock sensitive (keep it out of open search), log search (record every query).
The access tag is stored on every chunk’s vector, so the filter happens at the search itself.
Locked recordings never appear in open search; they need a named, logged, direct request.
Every search writes a row to the audit trail: who asked, what they asked, what came back, and when.
The default is private: a recording with no tag is visible only to its owner until someone sets one.

Three controls on every recording

Fig 5. Three controls per recording, all feeding one audit trail. Tag access decides who may see it. Lock sensitive keeps the most private recordings out of open search. Log search captures every query. Every decision is recorded.

Control 1: tag access (who may see it)

Every recording gets an access tag when it’s filed. The tag is one of three kinds: a team (“sales,” “support,” “leadership”), a named person, or everyone. The rules doc sets defaults — recordings from the sales meeting tool default to the sales team, recordings forwarded by a manager default to that manager — and a person can override the tag on any recording by editing the catalogue.

The important detail is where the tag lives. When the transcript is indexed (Part 3), the access tag is copied onto every chunk’s vector. So when search runs (Part 4), the filter can drop disallowed chunks at the match step itself, without a second lookup. The asker’s name and teams are checked against each chunk’s tag, and anything they can’t see is gone before the answer model is even called. A salesperson searching the whole archive simply never sees a sentence from an HR call. That is not because the answer hid it, but because those chunks were filtered out before any answer existed.

A recording with no tag at all defaults to its owner only. The system fails closed: when in doubt, fewer people can see it, not more.
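
To make the match-step filter concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Chunk and Asker shapes, the "team:" and "person:" tag prefixes, and the may_see and match helpers are hypothetical names, not the archive's actual schema or code. It shows only the two ideas above: the tag travels with each chunk, and a missing or unrecognised tag fails closed.

```python
from dataclasses import dataclass
from typing import Optional

EVERYONE = "everyone"  # hypothetical spelling of the "everyone" tag


@dataclass(frozen=True)
class Chunk:
    recording_id: str
    owner: str
    text: str
    score: float               # similarity to the query, higher is closer
    access_tag: Optional[str]  # "team:sales", "person:dana", "everyone", or None


@dataclass(frozen=True)
class Asker:
    name: str
    teams: frozenset


def may_see(asker: Asker, chunk: Chunk) -> bool:
    """Decide visibility from the tag stored on the chunk itself."""
    tag = chunk.access_tag
    if not tag:
        # Fail closed: an untagged recording is visible to its owner only.
        return asker.name == chunk.owner
    if tag == EVERYONE:
        return True
    kind, _, value = tag.partition(":")
    if kind == "team":
        return value in asker.teams
    if kind == "person":
        return value == asker.name
    return False  # unrecognised tag: deny rather than guess


def match(asker: Asker, candidates: list, top_k: int = 3) -> list:
    """Drop disallowed chunks first, then rank what is left."""
    visible = [c for c in candidates if may_see(asker, c)]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]


chunks = [
    Chunk("rec-1", "dana", "Q3 pipeline review", 0.91, "team:sales"),
    Chunk("rec-2", "lee", "HR call about leave", 0.95, "team:hr"),
    Chunk("rec-3", "dana", "untagged voice memo", 0.80, None),
    Chunk("rec-4", "lee", "all-hands recap", 0.60, "everyone"),
]
sam = Asker("sam", frozenset({"sales"}))
dana = Asker("dana", frozenset({"sales"}))

sam_hits = [c.recording_id for c in match(sam, chunks)]
dana_hits = [c.recording_id for c in match(dana, chunks)]
```

In this toy data the HR chunk has the highest score, yet it never reaches either salesperson's results, and the untagged memo shows up only for dana, who owns it. A real vector store would apply the same test as a metadata filter inside the query rather than in a Python loop.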

Control 2: lock sensitive (out of open search)

Some recordings shouldn’t turn up in a broad search even for people whose tag would technically allow it: a termination call, a legal discussion, a board session. For those, the tag isn’t enough; you don’t want them surfacing from a vague question that happened to match a phrase. So a recording can be marked sensitive. Sensitive recordings are still transcribed and indexed, but their chunks are flagged so the normal search box never returns them, no matter who’s asking or how close the match is.

Reaching a locked recording takes a deliberate, named request — an authorized person opening that specific recording by id, which is itself logged. The point is that the most private material can’t leak out of an innocent broad search. You have to mean to open it, and the system remembers that you did.
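
The two paths to a recording can be sketched side by side. This is a hedged illustration, not the archive's code: the sensitive flag on the chunk, the authorized mapping, and the open_search and open_locked helpers are all hypothetical names. Logging refused attempts as well as granted ones is also an assumption; the text above only says the direct request is logged.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Chunk:
    recording_id: str
    text: str
    score: float
    sensitive: bool = False  # copied from the recording when it is indexed


def open_search(candidates: list) -> list:
    """Normal search box: sensitive chunks are dropped for every asker."""
    return [c for c in candidates if not c.sensitive]


def open_locked(recording_id, requester, authorized, chunks, access_log):
    """Direct request for one recording by id; the attempt is always logged."""
    granted = requester in authorized.get(recording_id, set())
    access_log.append({
        "who": requester,
        "recording_id": recording_id,
        "granted": granted,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    if not granted:
        raise PermissionError(f"{requester} may not open {recording_id}")
    return [c for c in chunks if c.recording_id == recording_id]


chunks = [
    Chunk("rec-7", "weekly standup notes", 0.70),
    Chunk("rec-9", "termination call", 0.99, sensitive=True),
]
authorized = {"rec-9": {"pat"}}  # hypothetical: who may open which locked id
access_log = []

broad_hits = [c.recording_id for c in open_search(chunks)]
opened = open_locked("rec-9", "pat", authorized, chunks, access_log)
```

Note that the sensitive chunk has the best score in the sample data and still never appears in broad_hits; the only way to reach it is the named request, which leaves a row behind.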

Control 3: log search (who asked what)

The first two controls decide what comes back. The third records what was asked. Every search writes a row to the tx-searchlog table: the asker, the exact question, the recordings that were returned, and the timestamp. Logging is per query, not per recording: even a search that returned nothing is logged, because “who went looking for the layoff discussion” is itself worth knowing.

The log earns its place in three ways. It’s an audit trail — if a quote turns up somewhere it shouldn’t, you can see who searched for it and when. It’s an early-warning system — a spike in searches for a sensitive topic is a signal worth a human glance. And it’s a quality signal — if the same reasonable question keeps returning nothing, a recording probably never got filed or a topic tag is wrong. The log is written by a path with append-only permission, so a search can record itself but can’t rewrite history.
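
A small sketch of what one log row and the append-only rule could look like, using SQLite in memory so it runs anywhere. Only the table name tx-searchlog comes from the post; the column names and the log_search helper are assumptions. The post describes append-only as a permission on the writing path; SQLite has no per-role grants, so this sketch swaps in triggers that reject updates and deletes as a stand-in for that permission.

```python
import sqlite3
from datetime import datetime, timezone

# Column names are hypothetical; only the table name comes from the post.
SCHEMA = """
CREATE TABLE "tx-searchlog" (
    id        INTEGER PRIMARY KEY,
    asker     TEXT NOT NULL,
    question  TEXT NOT NULL,
    returned  TEXT NOT NULL,   -- comma-separated recording ids, empty if none
    asked_at  TEXT NOT NULL
);
-- Stand-in for append-only permission: refuse any rewrite of history.
CREATE TRIGGER searchlog_no_update BEFORE UPDATE ON "tx-searchlog"
BEGIN SELECT RAISE(ABORT, 'search log is append-only'); END;
CREATE TRIGGER searchlog_no_delete BEFORE DELETE ON "tx-searchlog"
BEGIN SELECT RAISE(ABORT, 'search log is append-only'); END;
"""


def log_search(conn, asker, question, returned_ids):
    """Write one row per query, including queries that returned nothing."""
    conn.execute(
        'INSERT INTO "tx-searchlog" (asker, question, returned, asked_at) '
        "VALUES (?, ?, ?, ?)",
        (asker, question, ",".join(returned_ids),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
log_search(conn, "sam", "what did we decide on pricing?", ["rec-1", "rec-4"])
log_search(conn, "sam", "layoff discussion", [])  # empty result, still logged

rows = conn.execute(
    'SELECT asker, question, returned FROM "tx-searchlog" ORDER BY id'
).fetchall()
```

With rows like these, the early-warning and quality checks described above become ordinary queries: count searches per topic per day, or list questions whose returned column is empty.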

How the three stack up

The controls are layers, not alternatives. Tag access decides the normal who-sees-what. Lock sensitive removes the riskiest recordings from open search no matter the tag. Log search records every query against both. A recording can be tagged to the sales team, and that’s the everyday control. A recording can additionally be locked, and now even sales can’t stumble on it. And every attempt to find either is written down. Each layer is simple on its own; together they let an SMB hold sensitive recordings in the same archive as everyday ones without turning the search box into a leak.

None of this replaces good judgment about what to record in the first place. But it means the archive earns trust the way a careful filing clerk would: it knows who’s allowed in which drawer, it keeps the locked drawer locked, and it writes down every time someone comes asking.

Next post: the cost breakdown. The whole archive runs in coffee-money territory at SMB volume; Part 6 explains exactly where the dollars go and why transcription dominates the bill.
