Part 6 of 7 · Document pipeline series

What the document pipeline costs

A pipeline that reads documents for you doesn’t have to cost real money. Most of the system runs on free tiers; the only meaningful spend is the AI extraction itself, and even that’s pennies per document.

[Figure: Where the dollars go in a typical month. Three columns. Always free ($0): Lambda runs, orchestration, webhook URLs (Lambda Function URLs), queues and alerts (SQS and SNS), small DynamoDB tables. Tiny fixed cost (~$0.50/mo): S3 storage at cents per month, password vault (Secrets Manager) at about $0.40 per secret. Grows with use (~$0.50–$4/mo): reading pages (Textract) at pennies per page, AI structuring at about a cent per document, notifications under 1¢ per month. Total: about $1–$5 per month at SMB volume of ~100 documents per month, with a budget alarm at $15 to catch anything weird.]
Fig 6. Three tiers of cost: free, tiny fixed, variable. The variable column is where most of the bill lives, and even there the unit cost is cents per document.

Free at this scale

  • The robots — the cloud gives you a million runs a month for free, more than the pipeline will ever need.
  • Pipeline orchestration — the workflow engine that strings the steps together charges per state transition. At SMB volume that’s fractions of a cent per month — below the noise floor of the bill.
  • Webhook URLs — the upload endpoint and the destination callbacks. Free at this volume.
  • Queues and alerts — the small bookkeeping pieces. All free.
  • Small tables — the audit log, the doc metadata. Comfortably under the always-free quota.

Costs pennies to a dollar each month

  • Storage — the original documents and the structured outputs. Pennies per month at SMB volume; cheaper if you move old files to cold storage automatically.
  • Password vault — about 40 cents a month for each secret you store (the access keys for your destinations).

Grows with how busy the inbox is

  • Reading pages. The specialist AI charges per page. The default cheap mode (text + layout) costs pennies per page, which suits receipts and most invoices. For forms-heavy documents that need precise table extraction, a richer mode is available at about a nickel per page; the rules file picks the mode for each document type.
  • AI structuring. The generalist AI works on the small clean output of the reader, not the original document — so even on long documents the cost stays around a cent or two per document.
  • Notifications. Slack messages are free. Emails are a fraction of a cent.
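
To make the arithmetic concrete, here is a rough estimator. It is a sketch, not the pipeline's own code: the function and field names are hypothetical, and every unit price is an illustrative assumption chosen to sit inside the ranges quoted in this post (a penny per page in cheap mode, a nickel in rich mode, a cent and a half per document for structuring, 40 cents per secret, a dime of storage). Check current pricing before relying on any of these numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitPrices:
    """Illustrative unit prices in US dollars. All values are assumptions."""
    cheap_page: float = 0.01            # assumed: "pennies per page" cheap mode
    rich_page: float = 0.05             # "about a nickel per page" rich mode
    structuring_per_doc: float = 0.015  # "a cent or two per document"
    secret_per_month: float = 0.40      # "about 40 cents" per stored secret
    storage_per_month: float = 0.10     # assumed: "pennies per month" of storage


def estimate_monthly_cost(docs, pages_per_doc, rich_share=0.0, secrets=1,
                          prices=UnitPrices()):
    """Return (fixed, variable, total) in dollars for one month.

    rich_share is the fraction of pages sent through the richer reading mode.
    """
    pages = docs * pages_per_doc
    rich_pages = pages * rich_share
    cheap_pages = pages - rich_pages

    reading = cheap_pages * prices.cheap_page + rich_pages * prices.rich_page
    structuring = docs * prices.structuring_per_doc

    fixed = secrets * prices.secret_per_month + prices.storage_per_month
    variable = reading + structuring
    return round(fixed, 2), round(variable, 2), round(fixed + variable, 2)


if __name__ == "__main__":
    fixed, variable, total = estimate_monthly_cost(docs=100, pages_per_doc=2)
    print(f"fixed ${fixed:.2f}, variable ${variable:.2f}, total ${total:.2f}")
```

Under these assumed prices, a hundred two-page documents with one stored secret come to about $4 for the month, nearly all of it in the variable column. Sending more pages through the rich mode is the fastest way to push that number up.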

Three traps you’re avoiding

  • No always-on server: one would cost $30 or more a month before it read a single document.
  • No paid document-reading service: those charge a dollar or more per document and usually require a monthly minimum. The pipeline has no minimum and costs less per document.
  • No infinite logs: retention is capped at 7 days, so logs can’t pile up into a slow-growing surprise bill.
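
The 7-day log cap can be sketched in a few lines, assuming the logs live in CloudWatch Logs (the post does not name the log service, so treat that as an assumption). The helper name and the log group name in the usage comment are hypothetical; `put_retention_policy` with `logGroupName` and `retentionInDays` is the boto3 CloudWatch Logs call for setting retention. The client is passed in so the helper can be exercised without AWS credentials.

```python
RETENTION_DAYS = 7  # the retention period named in this post


def apply_log_retention(logs_client, log_group_names, days=RETENTION_DAYS):
    """Set the same retention period on each log group and return the names.

    logs_client is expected to behave like boto3.client("logs").
    """
    applied = []
    for name in log_group_names:
        logs_client.put_retention_policy(logGroupName=name, retentionInDays=days)
        applied.append(name)
    return applied


# Usage sketch (requires boto3 and AWS credentials; the group name is made up):
#   import boto3
#   apply_log_retention(boto3.client("logs"), ["/aws/lambda/doc-pipeline-reader"])
```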

In plain words

For an SMB processing a hundred or so documents a month, the bill is a coffee. Push that to a thousand and the bill goes up — but per document it’s still cents, not dollars. Set a budget alarm that fits the volume you actually expect, and the bill can’t surprise you.
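
One way to set that alarm is with AWS Budgets. The sketch below only builds the request; the helper name, budget name, account ID, and email address are placeholders, and the choice to alert when actual spend passes 100% of the limit is an assumption rather than something this series specifies. The resulting dictionary matches the keyword arguments of boto3's `create_budget` call on the `budgets` client.

```python
def build_budget_request(account_id, limit_usd, alert_email,
                         name="doc-pipeline-monthly"):
    """Build keyword arguments for the AWS Budgets create_budget call.

    The alert fires by email once actual monthly cost exceeds the limit.
    """
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": f"{limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": 100.0,  # percent of the budget limit
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": alert_email}
                ],
            }
        ],
    }


# Usage sketch (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   request = build_budget_request("123456789012", 15, "ops@example.com")
#   boto3.client("budgets").create_budget(**request)
```

A limit of $15 leaves roughly three times headroom over the typical $1–$5 month, so normal busy weeks stay quiet and only a genuine anomaly triggers the email.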
