Part 6 of 7 · Quote drafter series ~3 min read

What the quote drafter costs

The whole pipeline runs in coffee-money territory at SMB volume. The fixed cost of having it sitting idle is essentially zero — quiet weeks bill nothing. The variable cost is pennies per RFQ, dominated by Bedrock tokens for the extractors and the cover-paragraph composer. PDF rendering only happens when the rep opens a draft. Three numbers below; the shape of the bill is the same at every volume.

Key takeaways

  • Around $4/month at typical SMB RFQ volume (around 200/month).
  • Fixed AWS cost is essentially zero. No always-on compute, no NAT Gateway, no API Gateway.
  • Variable cost is pennies per RFQ — mostly Bedrock tokens.
  • Textract is the second-biggest line item; it only runs when an RFQ arrives with a PDF or image attached.
  • At 1,000 RFQs/month the bill is around $15. The shape doesn’t break.

Cost at three volumes

[Figure: horizontal stacked-bar chart of monthly cost in US dollars at three RFQ volumes, broken out by component. 50 RFQs/month: about $1.50. 200 RFQs/month: about $4, with Bedrock at roughly 60 percent, Textract at roughly 20 percent, and the rest split among the remaining services. 1,000 RFQs/month: about $15, with the same proportional split. Legend: Bedrock (extractors + composer); Textract (PDF and image uploads only); Knowledge Base / S3 Vectors (storage plus query fees); everything else (Lambda, DynamoDB on-demand, S3, SQS, EventBridge, SNS, SES inbound, Secrets Manager, CloudWatch Logs at 7-day retention). Fixed cost is essentially zero; the chart shows the full bill, not just the variable part.]
Fig 6. Monthly cost at three RFQ volumes. Bedrock dominates at every volume; Textract is the second-largest line item, but only fires on PDF and image uploads. The shape is linear: doubling RFQs roughly doubles the bill.

Where the dollars actually go

Bedrock (the bulk). Each RFQ runs three small extractor calls (line items, constraints, context) and one composer call (the cover paragraph). On Claude Haiku 4.5 via Global cross-Region inference, each call takes a few thousand input tokens (the RFQ, a small system prompt, and, for the line-items extractor, retrieved catalog rows) and emits a few hundred output tokens. At Haiku’s pricing, that’s roughly a penny per RFQ on average. Cost scales linearly with volume: 50 RFQs ≈ $0.50 in Bedrock alone, 200 ≈ $2, 1,000 ≈ $10. The clarify and out-of-scope moves use fewer model calls and cost less.
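For a rough feel of where the penny comes from, here is a minimal sketch of the arithmetic. The token counts and the per-million-token prices are assumptions chosen for illustration, not figures taken from the drafter or from the current Bedrock price list, and the names are hypothetical; swap in your own numbers.

```python
from dataclasses import dataclass

# Assumed prices in USD per million tokens (placeholders, not quoted rates).
INPUT_PRICE_PER_MTOK = 1.00
OUTPUT_PRICE_PER_MTOK = 5.00


@dataclass(frozen=True)
class ModelCall:
    name: str
    input_tokens: int
    output_tokens: int


# Three extractors plus one composer, with assumed token counts.
CALLS = [
    ModelCall("line_items", 3000, 300),   # includes retrieved catalog rows
    ModelCall("constraints", 2000, 200),
    ModelCall("context", 2000, 200),
    ModelCall("composer", 2000, 300),
]


def call_cost(call: ModelCall) -> float:
    """Cost of one model call in USD at the assumed prices."""
    return (
        call.input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
        + call.output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    )


def rfq_cost(calls: list[ModelCall]) -> float:
    """Cost of one fully drafted RFQ: the sum over its model calls."""
    return sum(call_cost(c) for c in calls)


def monthly_bedrock_cost(rfqs_per_month: int, calls: list[ModelCall]) -> float:
    """Linear scaling: every RFQ is assumed to make the same calls."""
    return rfqs_per_month * rfq_cost(calls)
```

With these assumed numbers a full draft comes out a little over a penny. Clarify and out-of-scope moves make fewer calls, so the blended average sits lower.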

Textract (only when an upload lane fires). The web form and inbox lanes don’t pay Textract unless an inbound email has a PDF or image attached. Textract reads PDF, PNG, JPEG, and TIFF; DOCX and XLSX are read in-Lambda with python-docx and openpyxl, which don’t cost extra. Textract’s pricing is per-page; a typical RFQ attachment is one to three pages, so each upload-lane RFQ costs a few cents. At 200 RFQs/month with maybe a quarter coming through uploads, Textract is around $0.80 to $1. At 1,000 RFQs the same share lands around $4. If your customer base skews heavily toward big tenders with multi-page spec docs, Textract becomes a bigger slice.
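The routing rule in that paragraph can be sketched as a small lookup on the file extension. This is an illustration under assumptions, not the drafter’s actual code: the function name and return labels are hypothetical, and a real intake function would likely check the content type as well as the extension.

```python
from pathlib import PurePath

# Formats the paragraph above says go to Textract (billed per page).
TEXTRACT_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff"}

# Formats read inside the Lambda itself, with no extra service charge.
IN_LAMBDA_READERS = {".docx": "python-docx", ".xlsx": "openpyxl"}


def pick_reader(filename: str) -> str:
    """Return which reader handles an attachment, based on its extension."""
    suffix = PurePath(filename).suffix.lower()
    if suffix in TEXTRACT_EXTENSIONS:
        return "textract"
    if suffix in IN_LAMBDA_READERS:
        return IN_LAMBDA_READERS[suffix]
    return "unsupported"
```

Keeping DOCX and XLSX out of the Textract branch is what keeps the per-page charge limited to PDFs and images.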

Knowledge Base on S3 Vectors. Storage is cheap; query fees scale with retrieval count. The line-items extractor queries the catalog Knowledge Base once per RFQ (with a small batch of candidate line items in the same query). At 200 RFQs/month, query costs are well under $0.50. Embedding ingestion only happens when the catalog or rules doc changes; for a stable catalog, that’s near-free.

Everything else. Lambda runtime: a few hundred milliseconds per RFQ across the intake, drafter, pricer, composer, and dispatch functions. With Lambda Function URLs in front (no API Gateway), the webhook layer adds no per-request fee beyond Lambda’s own invocation charge. DynamoDB on-demand for the audit and dedupe tables: pennies a month at any of these volumes. SES inbound: $0.10 per thousand received emails. SQS, EventBridge, SNS, Secrets Manager: under a dollar a month in total. CloudWatch Logs at 7-day retention is the largest piece in this bucket, and even it lands well under a dollar at 200 RFQs/month.

What doesn’t cost money

The pieces a more conventional setup would charge for — that this design avoids:

  • API Gateway. Replaced by Lambda Function URLs. The webhook layer adds no per-request fee on top of Lambda’s own invocation charge.
  • NAT Gateway. Nothing in the system needs outbound internet from a VPC, so no NAT, no $32/month minimum.
  • Always-on compute. No EC2, no Fargate task running 24/7. Quiet weekends bill nothing.
  • Pre-rendered PDFs. Drafts that nobody opens never render. PDF generation is on-demand in Lambda when the rep clicks.
  • An always-on scheduler. The 24-hour reminder and 48-hour escalation run as one-time EventBridge Scheduler schedules, billed per invocation: cents at SMB volume.
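As a rough illustration of the last item, the 24-hour reminder and 48-hour escalation could each be created as a one-time schedule that deletes itself after firing. The sketch below only builds the request; the function name, schedule naming scheme, payload shape, and ARNs are all hypothetical, and it assumes the RFQ ID only contains characters that are valid in a schedule name.

```python
import json
from datetime import datetime, timedelta, timezone


def one_time_schedule(
    rfq_id: str,
    received_at: datetime,  # must be timezone-aware
    hours_after: int,
    kind: str,              # e.g. "reminder" or "escalation"
    target_arn: str,
    role_arn: str,
) -> dict:
    """Build keyword arguments for EventBridge Scheduler's create_schedule."""
    fire_at = (received_at + timedelta(hours=hours_after)).astimezone(timezone.utc)
    return {
        "Name": f"rfq-{rfq_id}-{kind}",
        # One-time schedules use the at(yyyy-mm-ddThh:mm:ss) expression form.
        "ScheduleExpression": f"at({fire_at.strftime('%Y-%m-%dT%H:%M:%S')})",
        "ScheduleExpressionTimezone": "UTC",
        "FlexibleTimeWindow": {"Mode": "OFF"},
        # Remove the schedule once it has fired so nothing lingers.
        "ActionAfterCompletion": "DELETE",
        "Target": {
            "Arn": target_arn,
            "RoleArn": role_arn,
            "Input": json.dumps({"rfq_id": rfq_id, "kind": kind}),
        },
    }
```

The resulting dictionary can be passed to `boto3.client("scheduler").create_schedule(**kwargs)`. Because each schedule exists only until it fires, nothing runs or bills between RFQs.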

How the cost scales

The variable line items (Bedrock, Textract) grow with RFQ count. The near-fixed line items (KB storage, CloudWatch) stay nearly flat and are small enough that they barely register. Doubling RFQ volume roughly doubles the bill; halving it nearly halves it. There is no “break” volume where costs jump: you don’t suddenly pay for a load balancer or a bigger instance class. The bill at 5,000 RFQs/month is around $70; at 10,000 it’s around $140. Past that volume, the conversation changes (you might fine-tune a smaller model to bring per-RFQ cost down), but that’s a tuning step, not a redesign.
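One way to check the linearity claim is to fit a straight line through the five volume and cost pairs quoted in this post. The sketch below uses ordinary least squares in plain Python. The function name is hypothetical and the inputs are the rounded figures from the text, so treat the result as a sanity check rather than a forecast.

```python
# (RFQs per month, approximate monthly bill in USD) as quoted in this post.
QUOTED_POINTS = [
    (50, 1.50),
    (200, 4.00),
    (1_000, 15.00),
    (5_000, 70.00),
    (10_000, 140.00),
]


def fit_line(points: list[tuple[float, float]]) -> tuple[float, float]:
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# slope ~ variable cost per RFQ; intercept ~ fixed monthly cost.
slope, intercept = fit_line(QUOTED_POINTS)
```

On these rounded figures the slope comes out a bit under a cent and a half per RFQ and the intercept at roughly a dollar, which fits the reading that almost all of the bill is variable.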

Set an AWS Budgets alarm at $25/month so anything unusual (a runaway loop, a stuck escalator, a flood of regenerated drafts) pages you before the bill matters. The drafter’s normal-volume bill stays well under that ceiling.
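A minimal sketch of that budget, assuming an email subscriber and a notification at 100 percent of actual spend. The budget name, function name, and addresses are hypothetical. Note that a budget defined this way covers the whole account unless you add cost filters.

```python
def budget_request(account_id: str, limit_usd: float, email: str) -> dict:
    """Build keyword arguments for the AWS Budgets create_budget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "quote-drafter-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": f"{limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    # Fire when actual spend exceeds 100% of the limit.
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": 100.0,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": email},
                ],
            }
        ],
    }
```

The dictionary can be passed to `boto3.client("budgets").create_budget(**kwargs)`. Budgets works from billing data that refreshes periodically rather than continuously, so treat it as a backstop, not a real-time alarm.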

Last post in the series: the engineering reference. Same system, drawn purely for engineers — service names, IAM roles, Bedrock model IDs, the SES receiving rule set, the presigned-upload flow, and where each Lambda lives.
