What the tax doc collector costs
The collector is a cheap system to run. The daily chase tick reads a CSV from S3, does some date arithmetic, writes a few rows to DynamoDB, and sends a handful of emails. It calls no models on the tick. The cost that does add up is the document-reading lane: every upload gets read by Textract and named by Bedrock. Even so, at typical small-practice volume the bill is a few dollars a month, fixed cost essentially zero.
Key takeaways
- Around $2.40/month at typical small-practice volume (around 200 active client files).
- Fixed AWS cost is essentially zero. No always-on compute, no NAT Gateway, no API Gateway.
- The daily chase tick costs pennies — no model calls.
- Textract and Bedrock fire only when a client uploads a document, plus the monthly summary.
- At 500 active files the bill is around $5. At 1,000 files it’s around $9.
Cost at three volumes
Where the dollars actually go
Lambda runtime (the bulk). The chase tick runs once a day. Each tick reads the checklist CSV from S3, iterates the rows, works out what’s missing for each, and decides on a move. At 200 files that’s a few hundred milliseconds; at 1,000 it’s a couple of seconds. Add the dispatch Lambda for each send, the upload-page and action Function URLs, and the drive-sync Lambda every fifteen minutes — the Lambda total still lands under a dollar at all three volumes.
DynamoDB on-demand. Small tables: td-sends, td-uploads, td-state, td-audit. Reads dominate during the daily tick (one read per file, plus state). Writes are sends, uploads, and audit rows. Pennies a month at any of these volumes.
S3 + storage. The mirrored checklist CSV plus every uploaded document. A typical client file is a handful of PDFs and photos — a few megabytes. Even at 1,000 files that’s low single-digit gigabytes. A dollar or two of storage, and that’s being generous.
EventBridge Scheduler. The daily tick rule plus deferred-send rules from the quiet-hours and holiday gates. A few invocations a day. Pennies.
SES. Inbound (if you let clients reply or forward): $0.10 per thousand received. Outbound for the requests and reminders: $0.10 per thousand sent. A 200-file practice sends maybe a few hundred emails across a season — cents.
Textract (one read per upload). Per-page pricing; a typical tax document is one to three pages. A few cents per document. This is the band that scales with volume: more files means more uploads to read. At 200 files with their documents, it’s under a dollar; at 1,000 files it lands around a couple of dollars across the busy months.
Bedrock (only when something fires it). The daily tick uses no Bedrock. The type-confirm fires Haiku 4.5 once per uploaded document: the Textract text in, a short JSON answer out — a small fraction of a cent per call. The monthly summary is one larger call that writes a practice-ready paragraph. Bedrock stays a modest slice even at 1,000 files.
What doesn’t cost money
- API Gateway. Replaced by Lambda Function URLs for the upload page, the intake form, and the action endpoints.
- NAT Gateway. Nothing is in a VPC. No NAT, no $32/month minimum.
- Always-on compute. No EC2, no Fargate. The collector sleeps until a tick fires or a client uploads.
- A Knowledge Base. The checklist is structured rows, not free text — deterministic lookup beats vector search here. No embeddings, no Knowledge Base, no S3 Vectors.
- Models on the tick. The daily decision is plain Python. Bedrock fires only on uploads and the monthly summary.
How the cost scales
Lambda runtime and DynamoDB grow roughly linearly with file count, because every file is evaluated on every tick. Textract and Bedrock grow with upload count, which roughly tracks file count too (each file is a few documents). So the bill at 2,500 active files is around $22; at 5,000 it’s around $42. Past those volumes you’d batch the type-confirm calls and read documents only on the first upload of a session, but those are optimizations for a large practice — not redesigns.
Set an AWS Budgets alarm at $15/month so anything unusual pages you before the bill matters. A normal-volume small practice stays well under that ceiling, even in the thick of February.
Last post in the series: the engineering reference. Same system, drawn for engineers — service names, Lambda inventory, IAM scopes, DynamoDB schemas, SES rule set, and EventBridge Scheduler config.
All posts