Part 2 of 7 · Quote drafter series · ~4 min read

How an RFQ reaches the drafter

RFQs don’t arrive at your business through one door. The website “Request a quote” form posts to an HTTPS endpoint. The shared sales inbox sees free-text emails with the request buried in a paragraph or pasted from a spreadsheet. The biggest customers attach a tender document or a spec PDF and expect you to read it. The intake’s job is to fold all three mechanisms into one queue, drop duplicates, and screen out junk before any AI sees a single field.

Key takeaways

  • Three intake lanes feed one queue: website RFQ form, sales inbox via SES, and direct file uploads through a presigned S3 portal.
  • Push when the customer can push; presign-and-park when they have a file. Each lane is built on a mechanism that is current in 2026: Lambda Function URLs, SES inbound rules, and presigned S3 URLs.
  • Dedupe drops the same RFQ if it arrives via two channels (email + form) within a 24-hour window.
  • Screen kills banned-domain spam, competitor emails, and missing-required-field junk.
  • Both filters run in plain Lambda code — no AI, no token cost.

Three lanes at the door

Diagram: three intake lanes at the top feed one shared SQS queue, with a single "Normalize, dedupe, screen" row beneath them.

  • Lane 1 · push · Web form: the buyer submits the form; it posts JSON over HTTPS to a Lambda Function URL; the Lambda verifies a per-form shared secret and an hCaptcha or Turnstile token, then emits a normalized RFQ to the queue. Latency: under a second.
  • Lane 2 · SES inbound · Sales inbox: an SES receive rule writes the raw MIME to S3; the S3 PUT triggers a parser Lambda that strips quoted threads, signatures, and trackers, reads any structured fields from the body, runs Textract on PDF and image attachments, and reads DOCX or XLSX in-Lambda with python-docx and openpyxl. Emits the same shape.
  • Lane 3 · presigned S3 · Direct uploads: the portal hands the buyer a presigned PUT URL; the buyer drops a spec or tender file; the S3 PUT event triggers a Lambda; Textract pulls text and line items. Emits the same shape.
  • Normalize, dedupe, screen: normalize folds source-specific shapes into one common RFQ object; dedupe drops messages already seen in the last 24 hours, by message-id for email or by an email-plus-subject hash for form and upload; screen filters banned-domain spam, competitor email domains, banned-phrase floods, and submissions missing required fields. All of it runs in plain Lambda code, with no model calls and no token cost.
  • Output: to the drafter, in one shape. Cheap gates first: nothing reaches the AI until the source-specific noise is stripped.
Fig 2. Three lanes funnel into one queue. The web form pushes JSON, the inbox parses MIME via SES, and direct uploads ride a presigned S3 URL. Normalize, dedupe, and screen run in plain Lambda before any model sees a single field.

Lane 1: the web form fast path

The fastest, cheapest, cleanest lane. Your “Request a quote” form posts JSON over HTTPS straight to a Lambda Function URL — no API Gateway in the path. The Lambda checks two things before doing any real work: a per-form shared secret in the body, and a captcha token (hCaptcha or Cloudflare Turnstile, whichever you already use). If either check fails, it returns a 401 and stops. If both pass, the Lambda turns the form payload into the common RFQ shape, writes a row to DynamoDB for the audit log, and pushes a message onto SQS for the drafter. Total time: under a second from the buyer’s click to the message hitting the queue.
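Here is a minimal sketch of that handler, not the production code. It assumes the Function URL delivers the JSON body as plain text (not base64-encoded), and the names FORM_SECRET, form_secret, captcha_token, and the three injected callables are hypothetical stand-ins for your secret store, captcha provider, DynamoDB write, and SQS send. The 202 on success is also an assumption; the post only fixes the 401 on failure.

```python
import hmac
import json
import time
import uuid
from typing import Callable

# Hypothetical placeholder; in practice this would come from a secrets store.
FORM_SECRET = "replace-me"


def handle_form_post(
    event: dict,
    verify_captcha: Callable[[str], bool],
    put_audit_row: Callable[[dict], None],
    enqueue: Callable[[str], None],
) -> dict:
    """Handle one Function URL request carrying the form payload as JSON."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": "invalid JSON"}
    if not isinstance(payload, dict):
        return {"statusCode": 400, "body": "invalid JSON"}

    # Check 1: per-form shared secret, compared in constant time.
    supplied = str(payload.get("form_secret", ""))
    if not hmac.compare_digest(supplied.encode(), FORM_SECRET.encode()):
        return {"statusCode": 401, "body": "unauthorized"}

    # Check 2: captcha token, verified by whichever provider is in use.
    if not verify_captcha(str(payload.get("captcha_token", ""))):
        return {"statusCode": 401, "body": "unauthorized"}

    # Both checks passed: fold the payload into the common RFQ shape.
    rfq = {
        "rfq_id": str(uuid.uuid4()),
        "source": "web_form",
        "received_at": int(time.time()),
        "name": payload.get("name", ""),
        "email": str(payload.get("email", "")).strip().lower(),
        "company": payload.get("company", ""),
        "line_items_text": payload.get("line_items", ""),
        "deadline": payload.get("deadline"),
    }
    put_audit_row(rfq)        # stand-in for the DynamoDB audit write
    enqueue(json.dumps(rfq))  # stand-in for the SQS send
    return {"statusCode": 202, "body": json.dumps({"rfq_id": rfq["rfq_id"]})}
```

Passing the three side effects in as callables keeps the checks testable without AWS; in a real Lambda they would wrap your captcha provider's verify call and the boto3 clients.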

The form itself stays simple. Required fields: name, business email, company, line items as free text, and a deadline if known. Optional: phone, delivery address, attachments. The cloud doesn’t need a structured line-item input — the drafter handles “240 of A-12 plus 60 of A-12L” in a sentence just fine.
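One way to pin that field list down is a small dataclass plus a required-field check. This is a sketch under assumptions: the key names (line_items, delivery_address, and so on) are illustrative rather than the series' actual schema, and the deadline is treated as optional because it is only supplied "if known".

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical key names for the four always-required form fields.
REQUIRED_FIELDS = ("name", "email", "company", "line_items")


@dataclass
class FormRFQ:
    name: str
    email: str
    company: str
    line_items: str                     # free text, left for the drafter
    deadline: Optional[str] = None      # only present if the buyer knows it
    phone: Optional[str] = None
    delivery_address: Optional[str] = None
    attachments: List[str] = field(default_factory=list)


def missing_required(payload: dict) -> List[str]:
    """Return the names of required fields that are absent or blank."""
    return [f for f in REQUIRED_FIELDS if not str(payload.get(f) or "").strip()]


def parse_form(payload: dict) -> FormRFQ:
    """Validate the payload and build a FormRFQ, or raise ValueError."""
    missing = missing_required(payload)
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    def clean(key: str) -> Optional[str]:
        value = str(payload.get(key) or "").strip()
        return value or None

    return FormRFQ(
        name=clean("name"),
        email=clean("email").lower(),
        company=clean("company"),
        line_items=clean("line_items"),  # kept verbatim apart from trimming
        deadline=clean("deadline"),
        phone=clean("phone"),
        delivery_address=clean("delivery_address"),
        attachments=list(payload.get("attachments") or []),
    )
```

Note that line_items stays a plain string: nothing here tries to split "240 of A-12 plus 60 of A-12L" into rows, because that is the drafter's job.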

Lane 2: the sales inbox via SES

The lane most real RFQs actually arrive through. A buyer’s purchasing manager forwards a request from a chain of three other people. The thread is six replies deep. There’s a signature with a logo, a quoted block from the original RFP, and a PDF attachment with the actual specs. Most tools choke on a message like this. SES plus a parser Lambda handles it directly.

Point the MX record for a dedicated subdomain (quotes.your-company.com) at the AWS SES inbound endpoint for your region. Configure a receiving rule set with one rule: write the raw MIME to an S3 bucket and stop. The S3 PUT triggers a parser Lambda. The Lambda walks the MIME tree, picks the latest reply (stripping quoted threads and signatures with a small library; mail-parser-reply works well), reads any structured fields the buyer pasted in (a small block of Field: Value lines, if your team has trained customers to use them), and pulls the body text out of attachments. Textract handles PDF and image attachments; for DOCX or XLSX, the parser reads them in-Lambda using python-docx and openpyxl, since Textract only accepts PDF and image formats. Tables matter, because SMB tender docs love them. The cleaned message and the extracted attachment text become one normalized RFQ in the same shape as the form lane.
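The parsing step can be sketched with nothing but the standard library. Treat this as an illustration under assumptions: latest_reply is a deliberately naive stand-in for a reply-parsing library, the Field: Value regex is a guess at what such a block looks like, and route_attachment only decides which extractor a file would go to without calling Textract, python-docx, or openpyxl.

```python
import re
from email import message_from_bytes, policy
from pathlib import PurePosixPath

FIELD_LINE = re.compile(r"^([A-Za-z][A-Za-z /-]{0,40}):\s*(.+)$")
TEXTRACT_EXTS = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff"}
IN_LAMBDA_EXTS = {".docx": "python-docx", ".xlsx": "openpyxl"}


def latest_reply(text: str) -> str:
    """Naive stand-in for a reply parser: keep lines until the first quoted
    line, 'On ... wrote:' header, or '-- ' signature delimiter."""
    kept = []
    for line in text.splitlines():
        if (line.startswith(">") or line.rstrip() == "--"
                or re.match(r"On .+ wrote:$", line.strip())):
            break
        kept.append(line)
    return "\n".join(kept).strip()


def route_attachment(filename: str) -> str:
    """Pick an extractor by file extension; 'skip' for anything unknown."""
    ext = PurePosixPath(filename.lower()).suffix
    if ext in TEXTRACT_EXTS:
        return "textract"
    return IN_LAMBDA_EXTS.get(ext, "skip")


def parse_inbound(raw_mime: bytes) -> dict:
    """Turn raw MIME (as SES wrote it to S3) into a pre-normalized record."""
    msg = message_from_bytes(raw_mime, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = latest_reply(body_part.get_content()) if body_part else ""

    # Collect any "Field: Value" lines the buyer pasted into the body.
    fields = {}
    for line in body.splitlines():
        match = FIELD_LINE.match(line.strip())
        if match:
            fields[match.group(1).strip().lower()] = match.group(2).strip()

    attachments = [
        {"filename": part.get_filename(),
         "route": route_attachment(part.get_filename() or "")}
        for part in msg.iter_attachments()
    ]
    return {
        "source": "email",
        "message_id": str(msg["Message-ID"] or ""),
        "from": str(msg["From"] or ""),
        "subject": str(msg["Subject"] or ""),
        "body": body,
        "fields": fields,
        "attachments": attachments,
    }
```

The routing table mirrors the split described above: PDF and image files go to Textract, DOCX and XLSX stay in-Lambda, and anything else is left for a human.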

Latency on the inbox lane is a few seconds end-to-end; SES inbound usually delivers within five seconds, plus a Textract round trip if there’s an attachment. Fast enough that a buyer who fires off a quick “Hey, can you quote 240 of A-12” gets a draft in front of a rep before they’ve closed their email.

Lane 3: direct uploads via a presigned S3 portal

The lane the biggest customers want. They have a tender document, a CAD-driven spec sheet, or a long Excel of part numbers. They don’t want to paste it into a form and they don’t want to email it because it’s 4 MB and includes embedded drawings. They want a link to drop the file at.

The portal is a one-page static site. The buyer types their name and email, accepts the terms, and clicks “Get upload link.” A small Lambda mints a presigned S3 PUT URL with a 30-minute TTL and a fixed key prefix per session. The buyer drops their file straight to S3 from the browser — no file passes through Lambda or your servers. The S3 PUT event fires a parser Lambda. Textract pulls text and any structured tables; if the file matches a known template (an industry-standard tender form, say) the parser maps fields directly. The output is the same normalized RFQ shape as the other two lanes.
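The link-minting Lambda could look roughly like this. It is a sketch, assuming a hypothetical bucket name and an uploads/<session>/ key layout; the S3 client is passed in so the function can be exercised without AWS credentials. In a real Lambda you would pass boto3.client("s3"), whose generate_presigned_url method takes the same arguments used here.

```python
import re
import uuid
from typing import Optional

UPLOAD_BUCKET = "rfq-uploads-example"  # hypothetical bucket name
URL_TTL_SECONDS = 30 * 60              # the 30-minute TTL


def safe_filename(name: str) -> str:
    """Keep only the last path segment and a conservative character set."""
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    return re.sub(r"[^A-Za-z0-9._-]", "_", base)[:100] or "upload"


def mint_upload_url(s3_client, filename: str,
                    session_id: Optional[str] = None) -> dict:
    """Return a presigned PUT URL under a fixed per-session key prefix."""
    session_id = session_id or uuid.uuid4().hex
    key = f"uploads/{session_id}/{safe_filename(filename)}"
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": UPLOAD_BUCKET, "Key": key},
        ExpiresIn=URL_TTL_SECONDS,
    )
    return {"url": url, "key": key, "expires_in": URL_TTL_SECONDS}
```

The browser then sends the file with an HTTP PUT to the returned URL. Sanitizing the filename before it goes into the key is a precaution added here, not something the design above spells out.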

This lane is also where the largest RFQs come in — multi-page bids worth real money. The drafter handles them the same way it handles a one-line email; the size of the source doesn’t change the shape of the queue.

Dedupe and screen, before any AI runs

The first thing the queue consumer does is run two free filters. Both are plain Python, and neither costs a Bedrock token.

Dedupe. The same buyer sometimes hits two lanes within a few minutes. They email the inbox, then fill out the web form because they got distracted halfway through and started over. The drafter shouldn’t process that as two RFQs and produce two drafts. The intake keeps a 24-hour rolling window in DynamoDB keyed by (email, hashed subject or line items) and drops the second arrival as a duplicate. If a buyer sends a real follow-up — new line items, larger quantities — the hash changes and the second one passes through. The audit log records both, so a rep can find either if needed.
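A rough sketch of the dedupe check follows. It uses an in-memory dict as a stand-in for the DynamoDB table, and the normalization rules (lowercasing, collapsing whitespace) are assumptions about what counts as "the same" request. On DynamoDB the equivalent would typically be a conditional write plus a TTL attribute, with the expiry still checked in code because TTL deletion is not immediate.

```python
import hashlib
import re
import time
from typing import Optional

WINDOW_SECONDS = 24 * 60 * 60  # the 24-hour window


def dedupe_key(email: str, content: str) -> str:
    """Hash the sender plus lowercased, whitespace-normalized content
    (the subject or the line items, whichever the lane provides)."""
    normalized = re.sub(r"\s+", " ", content).strip().lower()
    raw = f"{email.strip().lower()}|{normalized}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class DedupeWindow:
    """In-memory stand-in for the DynamoDB table: key -> expiry timestamp."""

    def __init__(self) -> None:
        self._expires = {}

    def is_duplicate(self, key: str, now: Optional[float] = None) -> bool:
        """Return True if key was seen inside the window; otherwise record it."""
        now = time.time() if now is None else now
        expiry = self._expires.get(key)
        if expiry is not None and expiry > now:
            return True
        self._expires[key] = now + WINDOW_SECONDS
        return False
```

Because the quantities are part of the hashed content, a follow-up asking for 500 instead of 240 produces a different key and passes through.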

Screen. After dedupe, the intake runs a quick free filter. It checks four things: a banned-domain list (throwaway addresses, vendor sales lists, your own staff testing the form), competitor email domains (you have a list, the drafter doesn’t need to read the message to know), banned-phrase floods (template spam from RFP-generator services that hit a thousand vendors at once), and submissions missing required fields (no email, no line items, no quantities). Anything caught here is archived with a reason and never reaches the drafter. The team only sees real RFQs.
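The four checks fit in one small function. This is a sketch with made-up data: every domain and phrase below is a placeholder, the record keys (email, line_items_text) are assumed, and "no quantities" is approximated by "no digit anywhere in the line items", which is a crude heuristic rather than the real rule.

```python
import re
from typing import Optional

# All lists are hypothetical examples; real ones would live in config.
BANNED_DOMAINS = {"throwaway.example", "vendor-lists.example"}
COMPETITOR_DOMAINS = {"rival-parts.example"}
BANNED_PHRASES = ("this request has been sent to all registered vendors",)
REQUIRED = ("email", "line_items_text")


def screen(rfq: dict) -> Optional[str]:
    """Return a rejection reason, or None if the RFQ may reach the drafter."""
    # Missing required fields.
    for name in REQUIRED:
        if not str(rfq.get(name) or "").strip():
            return f"missing_field:{name}"

    # Banned and competitor domains, matched on the part after the last "@".
    domain = str(rfq["email"]).rsplit("@", 1)[-1].strip().lower()
    if domain in BANNED_DOMAINS:
        return "banned_domain"
    if domain in COMPETITOR_DOMAINS:
        return "competitor_domain"

    # Banned-phrase floods.
    text = str(rfq["line_items_text"]).lower()
    if any(phrase in text for phrase in BANNED_PHRASES):
        return "banned_phrase"

    # Heuristic: a request with no digits probably has no quantities.
    if not re.search(r"\d", text):
        return "missing_quantity"
    return None
```

The returned string doubles as the archive reason, so a rep browsing rejected items can see why each one was dropped.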

Why this shape

The temptation is to build one “smart inbox” that uses an LLM to figure out, from any incoming message, whether it’s an RFQ, a follow-up, a complaint, or spam. That works on a demo and falls apart in week three. Each lane is shaped differently — a webhook is not an email is not an S3 upload — and merging them too early loses signal you can’t recover. By splitting at the door and merging at the queue, the source-specific work (signature checks, MIME parsing, attachment OCR) stays in the lane that knows how to do it. The drafter sees a clean, uniform RFQ shape and spends its model tokens on the part that’s actually hard: pulling out the line items.

Next post: how the drafter reads that uniform RFQ — the three extractors, the catalog lookup, the confidence scoring, and the move-picker that turns all of it into one of four moves.
