How an item gets tracked
The watcher only watches what’s in the registry. So the first job is making sure the registry actually reflects what your business has. There are three ways an item gets in: somebody types a row in the Drive sheet, somebody forwards a PDF contract to a dedicated address, or somebody puts an event on their Google Calendar with a small tag. The first one is obvious. The other two exist because in real life nobody types a row in a sheet for the contract they signed three minutes ago.
Key takeaways
- Three intake lanes feed one registry: the Drive sheet, an inbox-forwarding lane, and a calendar import.
- Inbound PDFs are parsed by Textract; Bedrock Haiku 4.5 reads the text and proposes a row.
- Every parsed row goes to a rep’s Slack for one-tap approval before it lands in the registry.
- Calendar events tagged #expires get pulled hourly via the Google Calendar API.
- The registry stays the canonical store. The other lanes are conveniences that write into it.
Three lanes into one registry
Lane 1: the Drive sheet itself
The simplest lane. Open the registry sheet in Drive, add a row, save. The columns are short: name, category, owner email, vendor, expiry date, renewal cost, contract value, and a link to the source document. A small Lambda, drive-sync, runs every fifteen minutes, exports the sheet as plain CSV via the Drive API, and writes it to s3://ew-registry-source/registry.csv if the sheet has changed since the last sync. The watcher reads from S3, not Drive directly. That keeps Drive API calls predictable, and with versioning enabled on the bucket, a bad bulk-edit can be rolled back by restoring the previous version of the object.
This lane covers the cases where you already have a contract, you know when it expires, and you can spend thirty seconds typing it in. Most existing items go in this way during the initial setup.
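The "only write if the sheet has changed" step in drive-sync can be sketched as a content-hash comparison. This is a minimal sketch under assumptions: the post does not say how the change is detected, so the SHA-256 comparison, the function name sync_if_changed, and the upload callback (a stand-in for whatever writes registry.csv to S3) are all hypothetical.

```python
import hashlib
from typing import Callable, Optional


def sync_if_changed(
    csv_bytes: bytes,
    last_digest: Optional[str],
    upload: Callable[[bytes], None],
) -> str:
    """Upload the exported CSV only when its content differs from the last sync.

    Returns the digest to store for the next run.
    """
    digest = hashlib.sha256(csv_bytes).hexdigest()
    if digest != last_digest:
        # `upload` is a placeholder for whatever writes registry.csv to S3.
        upload(csv_bytes)
    return digest


# Usage: two runs over an unchanged sheet produce a single upload.
# The column names in this sample CSV are illustrative only.
uploads = []
sample = b"name,expiry_date\nOffice lease,2026-03-01\n"
first = sync_if_changed(sample, None, uploads.append)
second = sync_if_changed(sample, first, uploads.append)
```

Hashing the export rather than trusting a modified-time field means a sheet that was opened and closed without edits does not create a new S3 version.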
Lane 2: inbox forwarding (the lane most teams actually use)
Set up a dedicated inbound address — something like expires@your-company.com — via Amazon SES. Anyone on the team forwards a contract PDF to that address and the watcher takes it from there. SES writes the raw MIME to s3://ew-raw-mime/. The S3 PUT triggers a parser Lambda. The Lambda walks the MIME tree to the PDF attachment, runs Amazon Textract on it (Textract reads PDF, PNG, JPEG, and TIFF natively; if somebody forwards a Word document, the parser falls back to python-docx), and gets back the extracted text plus any tables.
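Walking the MIME tree to the PDF attachment needs nothing beyond Python's standard email package. The sketch below is one way the parser Lambda might do it, not the actual implementation: the function name find_pdf_attachment is hypothetical, and the S3 read and the Textract call are left out.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import Optional, Tuple


def find_pdf_attachment(raw_mime: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (filename, pdf_bytes) for the first PDF part, or None if there is none."""
    msg = message_from_bytes(raw_mime, policy=policy.default)
    # walk() visits every part depth-first, including parts nested inside
    # a forwarded message/rfc822 attachment.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        filename = part.get_filename() or ""
        is_pdf = (
            part.get_content_type() == "application/pdf"
            or filename.lower().endswith(".pdf")
        )
        if is_pdf:
            # decode=True undoes the base64 transfer encoding.
            return filename, part.get_payload(decode=True)
    return None


# Usage: build a small message with a fake PDF attached and parse it back.
sample = EmailMessage()
sample["Subject"] = "Fwd: signed lease"
sample.set_content("Contract attached.")
sample.add_attachment(
    b"%PDF-1.4 fake", maintype="application", subtype="pdf", filename="lease.pdf"
)
found = find_pdf_attachment(sample.as_bytes())
```

Checking the filename extension as well as the content type covers mail clients that label PDFs as application/octet-stream.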
Then a Bedrock Haiku 4.5 call reads the text and emits a structured row: name, vendor, category, expiry date, renewal cost (if present in the document), contract value (if present), and an owner-suggestion based on the “To” line of the original forward. The model prompt is short: “Extract a row for the registry. Return JSON only. Mark each field with a confidence score. Do not invent a date that isn’t in the text.” The output goes to a small Slack interactive message that pings the rep who forwarded the email: the proposed row, the confidence per field, and three buttons — approve, edit, discard. On approve, a Lambda writes the row to the Drive sheet via the Sheets API. On edit, the rep gets a fillable modal pre-populated with the proposal. On discard, the message is logged and the PDF moved to a discarded prefix in S3 for audit.
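To make the proposal step concrete, here is a rough sketch of validating the model's JSON and turning it into a Slack message with the three buttons. Several things here are assumptions rather than details from the post: the JSON shape (each field as an object with "value" and "confidence"), the key names, the 0.8 warning threshold, and the action IDs. The block layout follows Slack's Block Kit section and actions blocks.

```python
import json
from typing import Any, Dict, List

# Hypothetical key names and threshold; the post does not specify them.
REQUIRED_FIELDS = ("name", "vendor", "category", "expiry_date")
LOW_CONFIDENCE = 0.8


def parse_proposal(model_output: str) -> Dict[str, Dict[str, Any]]:
    """Parse the model's JSON and reject proposals missing a required value."""
    data = json.loads(model_output)
    missing = [
        f for f in REQUIRED_FIELDS
        if f not in data or data[f].get("value") in (None, "")
    ]
    if missing:
        raise ValueError(f"proposal is missing fields: {missing}")
    return data


def build_slack_blocks(
    proposal: Dict[str, Dict[str, Any]], proposal_id: str
) -> List[Dict[str, Any]]:
    """Render the proposed row, per-field confidence, and three buttons."""
    lines = []
    for field, item in proposal.items():
        confidence = item.get("confidence", 0.0)
        flag = " :warning:" if confidence < LOW_CONFIDENCE else ""
        lines.append(f"*{field}*: {item.get('value')} ({confidence:.0%}){flag}")
    buttons = [
        {
            "type": "button",
            "text": {"type": "plain_text", "text": label},
            "action_id": action,
            "value": proposal_id,
        }
        for label, action in (
            ("Approve", "approve"), ("Edit", "edit"), ("Discard", "discard")
        )
    ]
    return [
        {"type": "section", "text": {"type": "mrkdwn", "text": "\n".join(lines)}},
        {"type": "actions", "elements": buttons},
    ]


# Usage with a made-up model response.
sample_output = json.dumps({
    "name": {"value": "Office lease", "confidence": 0.97},
    "vendor": {"value": "Acme Properties", "confidence": 0.91},
    "category": {"value": "lease", "confidence": 0.88},
    "expiry_date": {"value": "2026-03-01", "confidence": 0.55},
})
blocks = build_slack_blocks(parse_proposal(sample_output), "proposal-1")
```

Flagging low-confidence fields in the message itself puts the rep's attention on the value most likely to be wrong, which in this sample is the expiry date.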
The reason every parsed row goes to a human first is simple: a contract expiry the model misread is worse than a contract that never made it into the registry at all. The misread one will quietly tell you everything is fine until the morning the policy lapses.
Lane 3: calendar import
Some teams already track renewals on a calendar. The lease is on Maria’s personal calendar with a yellow flag. The SOC 2 audit window is on the engineering calendar. The cyber-insurance renewal is on the office manager’s calendar. Forcing those teams to also type rows in a sheet is a fight you don’t need to have on day one.
Lane 3 picks up calendar events tagged with #expires in the description. A small calendar-sync Lambda runs hourly, iterates through the configured Google Calendars (using a service-account credential stored in Secrets Manager), and pulls any events with the tag whose start time is in the future. Each pulled event becomes a proposal in the same Slack flow as Lane 2 — one-tap approve to add to the registry. Once approved, the calendar event itself can stay where it is or be deleted; the registry now owns the renewal.
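The filtering step in calendar-sync might look like the sketch below. It assumes events arrive as dictionaries in the shape the Google Calendar API returns (a "description" string and a "start" object holding either "dateTime" or, for all-day events, "date"). The function names are hypothetical, and treating all-day events as starting at midnight UTC is a simplification chosen for this example.

```python
import re
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List

# Match the tag as a whole word so "#expiresoon" does not count.
TAG_PATTERN = re.compile(r"#expires\b")


def event_start(event: Dict[str, Any]) -> datetime:
    """Return the event's start as a timezone-aware datetime."""
    start = event["start"]
    if "dateTime" in start:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(start["dateTime"].replace("Z", "+00:00"))
    else:
        # All-day event: only a date is present. Midnight UTC is an assumption.
        parsed = datetime.fromisoformat(start["date"])
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def tagged_future_events(
    events: Iterable[Dict[str, Any]], now: datetime
) -> List[Dict[str, Any]]:
    """Keep events tagged #expires whose start time is after `now`."""
    return [
        e for e in events
        if TAG_PATTERN.search(e.get("description") or "") and event_start(e) > now
    ]


# Usage with made-up events.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
events = [
    {"summary": "Lease renewal", "description": "Yellow flag #expires",
     "start": {"dateTime": "2025-06-01T09:00:00-07:00"}},
    {"summary": "Old policy", "description": "#expires",
     "start": {"dateTime": "2024-06-01T09:00:00Z"}},
    {"summary": "Team lunch", "description": "no tag",
     "start": {"dateTime": "2025-02-01T12:00:00Z"}},
    {"summary": "SOC 2 audit window", "description": "#expires",
     "start": {"date": "2025-09-15"}},
]
picked = [e["summary"] for e in tagged_future_events(events, now)]
```

Each event that survives the filter would then be turned into a proposal and sent through the same Slack approval flow as Lane 2.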
Calendar import is the most opt-in of the three lanes. A team that doesn’t use it loses nothing; a team that does avoids retyping things they already typed once.
Why the registry stays the source of truth
Three lanes in, but only one place where the watcher actually looks. That’s a deliberate constraint. If the inbox and calendar lanes also wrote directly to the watcher’s state, every “why did this ping go out?” question would mean checking three places. Funneling everything through the Drive sheet means there is exactly one row per item, and any rep can read or edit any of it without learning a new tool. The convenience lanes are first-class for getting items in, but they always pass through the sheet on the way.
Next post: how the watcher actually reads the registry, computes days-to-expiry, and picks one of four moves.