How a lead reaches the intake bot

Key takeaways

Three intake lanes feed one queue: website form, ad platforms (Meta + Google), and the shared sales inbox via SES.
Push when the platform offers it; poll when it doesn’t. Each lane reflects how its platform actually works as of 2026.
Dedupe drops emails or phones already seen in the last 24 hours.
Screen kills banned-domain spam, competitor emails, and missing-required-field junk.
Both filters run in plain Lambda code — no AI, no token cost.

Three lanes at the door

Fig 2. Three lanes, one queue. Push when the platform offers it, parse when it doesn’t. Cheap filters first — nothing reaches the AI until the source-specific noise is stripped.

Why three different mechanisms

The three lanes look different because the platforms behind them work differently. The intake mirrors those differences honestly instead of pretending they’re uniform.

Your own form is the easiest lane. The form posts JSON over HTTPS to a Lambda Function URL with a shared secret in the body and a captcha token in the headers. The Lambda verifies both before doing any other work. Captcha alone catches most automated form spam. The shared secret stops anyone from POSTing to the URL directly even if they find the URL in your page source. You own this lane end-to-end. No platform deprecation can break it.
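A minimal sketch of that check order, assuming a Python Lambda behind the Function URL. The body field `shared_secret`, the header `x-captcha-token`, the `FORM_SHARED_SECRET` environment variable, and the `verify_captcha` callable are all hypothetical names. The captcha call is injected so the sketch does not assume any particular captcha provider.

```python
import base64
import hmac
import json
import os


def check_form_post(event, verify_captcha, expected_secret=None):
    """Return (status_code, lead_fields or None) for a Function URL event.

    verify_captcha is a hypothetical callable: token -> bool.
    """
    expected = expected_secret or os.environ.get("FORM_SHARED_SECRET", "")
    # Header names are matched case-insensitively.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}

    raw = event.get("body") or ""
    if event.get("isBase64Encoded"):
        raw = base64.b64decode(raw).decode("utf-8")
    try:
        payload = json.loads(raw)
    except ValueError:
        return 400, None
    if not isinstance(payload, dict):
        return 400, None

    # Shared secret: constant-time comparison, removed from the lead fields.
    supplied = str(payload.pop("shared_secret", ""))
    if not expected or not hmac.compare_digest(supplied.encode(), expected.encode()):
        return 403, None

    # Captcha token travels in a header (assumed name).
    token = headers.get("x-captcha-token", "")
    if not token or not verify_captcha(token):
        return 403, None

    return 202, payload
```

Both checks run before the payload is handed to anything else, so a rejected request costs one JSON parse and nothing more.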

Meta Lead Ads is push, with a small twist. When a Facebook or Instagram user submits a lead form attached to an ad, Meta’s webhook fires within seconds. But the payload carries only the leadgen_id and the form ID, not the answers. The Lambda verifies the X-Hub-Signature-256 HMAC (App Secret as the key, computed over the raw body, constant-time comparison), then makes a separate call to the Graph API to fetch the field values. That second call is the only fragile bit. The page access token expires every 60 days, so a small refresh worker rotates it before it lapses. Pin to v24.0 or v25.0 on outgoing calls. v18.0 sunset on 2026-01-26, v19.0 sunsets on 2026-05-21, and v20.0 sunsets on 2026-09-24.
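The signature check described above can be sketched in a few lines of Python. The function names are hypothetical, and the payload shape (`entry[].changes[].value` carrying `leadgen_id` and `form_id`) is my reading of Meta’s webhook format, so check it against the current documentation before relying on it.

```python
import hashlib
import hmac
import json


def verify_meta_signature(raw_body: bytes, signature_header: str, app_secret: str) -> bool:
    """Check X-Hub-Signature-256 against an HMAC-SHA256 of the raw request body."""
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    supplied = signature_header[len(prefix):]
    # Constant-time comparison, on bytes so odd header contents cannot raise.
    return hmac.compare_digest(expected.encode(), supplied.encode())


def extract_leadgen_refs(raw_body: bytes):
    """Return (leadgen_id, form_id) pairs from a verified webhook body (assumed shape)."""
    payload = json.loads(raw_body)
    refs = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            if change.get("field") != "leadgen":
                continue
            value = change.get("value", {})
            refs.append((str(value.get("leadgen_id")), str(value.get("form_id"))))
    return refs
```

The HMAC has to be computed over the bytes exactly as received. Parsing the JSON and re-serializing it first will usually change whitespace or key order and make a valid signature fail.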

Google Ads has two patterns. If your campaigns use a lead form asset (Google’s current name for the surface formerly called the Lead Form Extension), you can configure a webhook URL right in Google Ads with a google_key. Google posts the form fields the moment a user submits, and you have the lead immediately. If your campaigns don’t use a lead form asset (search ads with a different lead surface, older campaigns), the alternative is a small hourly poll of the conversion-import API to pick up anything new. Both patterns end at the same normalized record.
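For the webhook pattern, the only authentication is the key itself. A hedged sketch, assuming `google_key` arrives as a top-level field of the JSON body (which is how I understand Google to deliver it); the function name is hypothetical.

```python
import hmac
import json


def accept_google_lead(raw_body: str, expected_key: str):
    """Return the lead fields if google_key matches, else None."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return None
    if not isinstance(payload, dict):
        return None
    supplied = str(payload.get("google_key", ""))
    # Constant-time comparison; an empty configured key never matches.
    if not expected_key or not hmac.compare_digest(supplied.encode(), expected_key.encode()):
        return None
    # Drop the key so it never travels further down the pipeline.
    return {k: v for k, v in payload.items() if k != "google_key"}
```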

The inbox lane catches everything else. Plenty of leads still come in as plain emails: a partner referral, a reply to a cold email, a sign-up from a directory site that doesn’t do webhooks, a forwarded RFP. SES inbound rules accept email at your domain, write the raw MIME to an S3 bucket, and trigger a parser Lambda. The parser strips signatures, quoted threads, and tracking pixels (the same tools the email-assistant series uses), then emits the same normalized record as the other lanes. The bot doesn’t care that the lead came through email; the downstream code reads name, email, and message just like a form submission.
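A rough sketch of the parser step, using only Python’s standard `email` package. The cleanup rules here (stop at the `-- ` signature delimiter or an "On ... wrote:" line, skip `>` quoted lines) are simplifying assumptions, not the series’ actual tooling. Reading only the plain-text part sidesteps tracking pixels, which live in the HTML part.

```python
import re
from email import message_from_bytes, policy
from email.utils import parseaddr

# Assumed marker for the start of a quoted reply thread.
QUOTE_HEADER = re.compile(r"^On .+ wrote:$")


def parse_inbound_email(raw_mime: bytes) -> dict:
    """Turn raw MIME bytes (as stored in S3) into lead-shaped fields."""
    msg = message_from_bytes(raw_mime, policy=policy.default)
    name, address = parseaddr(str(msg.get("From", "")))

    part = msg.get_body(preferencelist=("plain",))
    text = part.get_content() if part is not None else ""

    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "--" or QUOTE_HEADER.match(stripped):
            break  # signature delimiter or start of the quoted thread
        if stripped.startswith(">"):
            continue  # inline quoted line
        kept.append(line.rstrip())

    return {
        "source": "inbox",
        "name": name,
        "email": address.lower(),
        "subject": str(msg.get("Subject", "")),
        "message": "\n".join(kept).strip(),
    }
```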

Mixing all three in the same intake means the qualifier doesn’t care which source a lead came from once it’s in the queue. The downstream code never branches on source, only on content. Source is a tag on the record, not a control-flow split.

What “normalize” actually means

Each source hands the cloud a different shape. A web form gives you the field names you defined. Meta Lead Ads returns an array of question-answer pairs keyed by your form’s field names (which aren’t always what you’d call them). Google Ads gives a flat key-value object with mostly stable names. SES gives raw MIME you have to parse. Normalization folds those into one common lead object: a source name, a stable lead ID, the contact (name, email, phone), the company (domain, name if provided), the free-text message, the timestamp, the campaign or page identifier, and a small bag of source-specific extras for the engineering reference.

Normalization comes first, before anything else, so the rest of the system reads exactly one kind of message. The qualifier doesn’t have to know what Meta calls the email field. It just reads contact.email.
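To make the common shape concrete, here is one possible layout of the lead object with a normalizer for the Meta lane. The class names, the alias table, and the `meta:` ID prefix are illustrative assumptions. The `field_data` list of name/values pairs is the shape the Graph API returns for a lead, as far as I know.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contact:
    name: Optional[str] = None
    email: Optional[str] = None
    phone: Optional[str] = None


@dataclass
class Company:
    domain: Optional[str] = None
    name: Optional[str] = None


@dataclass
class Lead:
    source: str
    lead_id: str
    received_at: str
    contact: Contact
    company: Company
    message: str = ""
    campaign_id: Optional[str] = None
    extras: dict = field(default_factory=dict)


# Hypothetical mapping from form field names to normalized keys.
META_ALIASES = {
    "full_name": "name",
    "email": "email",
    "work_email": "email",
    "phone_number": "phone",
    "company_name": "company_name",
    "message": "message",
}


def normalize_meta(raw: dict, form_id: str) -> Lead:
    answers, extras = {}, {}
    for item in raw.get("field_data", []):
        values = item.get("values") or [""]
        key = META_ALIASES.get(item.get("name", ""))
        if key:
            answers.setdefault(key, values[0].strip())
        else:
            extras[item.get("name", "")] = values  # unmapped answers are kept, not lost
    email = answers.get("email", "").lower() or None
    # Assumption: fall back to the email's domain as the company domain.
    domain = email.split("@", 1)[1] if email and "@" in email else None
    return Lead(
        source="meta",
        lead_id="meta:" + str(raw["id"]),
        received_at=raw.get("created_time", ""),
        contact=Contact(answers.get("name"), email, answers.get("phone")),
        company=Company(domain, answers.get("company_name")),
        message=answers.get("message", ""),
        campaign_id=form_id,
        extras=extras,
    )
```

Each other lane would get its own small normalizer returning the same `Lead`, which is what lets downstream code read `lead.contact.email` without knowing the source.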

Dedupe and screen, before any AI runs

Two free filters sit between the lanes and the qualifier.

Dedupe drops a lead whose email or phone has been seen in the last 24 hours. Real buyers occasionally submit your form twice in a minute (different tabs, slow connection). Webhooks retry on transient failures. Meta sometimes double-fires a lead webhook if its backend gets unsure. Without dedupe the team gets pinged twice on the same hot lead and one of them goes stale. With it, the second arrival never reaches the queue as a new lead; the existing row is quietly updated with whatever the second submission added.
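The 24-hour window can be sketched with an in-memory map. This is only an illustration of the logic: a real deployment would need a shared store with expiry (a DynamoDB table with TTL would be one option, though that is my assumption), and the class and function names are hypothetical.

```python
import re
import time

WINDOW_SECONDS = 24 * 60 * 60


def dedupe_keys(email, phone):
    """Normalize email and phone into comparable keys."""
    keys = []
    if email:
        keys.append("email:" + email.strip().lower())
    if phone:
        digits = re.sub(r"\D", "", phone)  # keep digits only
        if digits:
            keys.append("phone:" + digits)
    return keys


class DedupeWindow:
    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window = window_seconds
        self._seen = {}  # key -> (first_seen_timestamp, lead_id)

    def check(self, lead_id, email, phone, now=None):
        """Return the earlier lead's ID if this is a duplicate, else None."""
        now = time.time() if now is None else now
        keys = dedupe_keys(email, phone)
        for key in keys:
            hit = self._seen.get(key)
            if hit and now - hit[0] < self.window:
                return hit[1]  # caller merges new fields into this existing row
        for key in keys:
            self._seen[key] = (now, lead_id)
        return None
```

Returning the earlier lead’s ID, rather than a bare boolean, gives the caller what it needs to update the existing row with anything the second submission added.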

Screen handles the obvious junk: banned-domain spam (the same crypto-pump message posted to a thousand contact forms), competitor email domains (a list the sales team maintains), banned-phrase floods (“buy our SEO service”), and submissions missing required fields (an email with no message). Vendor pitches with “I work for X and would love to discuss” phrasing get their own bucket. They’re archived, not deleted, so a real lead with that phrasing can be retrieved on appeal. The screen runs in plain Lambda code with a small banned-list and a regex. No AI involved.
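A screen of this kind fits in a page of code. The sketch below is an assumption about how the rules might be ordered; the domain lists, the phrases, and the verdict names are placeholders, not the team’s real lists.

```python
import re

# Placeholder lists; the real ones would be maintained by the sales team.
BANNED_DOMAINS = {"pump-coin.example"}
COMPETITOR_DOMAINS = {"rival.example"}
BANNED_PHRASES = re.compile(r"buy our seo service", re.IGNORECASE)
VENDOR_PITCH = re.compile(r"\bI work for\b.+\bwould love to discuss\b",
                          re.IGNORECASE | re.DOTALL)


def screen(email, message):
    """Return (verdict, reason); verdict is 'pass', 'drop', or 'archive'."""
    email = (email or "").strip().lower()
    message = (message or "").strip()

    if not email or "@" not in email or not message:
        return ("drop", "missing_required_field")

    domain = email.rsplit("@", 1)[1]
    if domain in BANNED_DOMAINS:
        return ("drop", "banned_domain")
    if domain in COMPETITOR_DOMAINS:
        return ("drop", "competitor_domain")
    if BANNED_PHRASES.search(message):
        return ("drop", "banned_phrase")
    if VENDOR_PITCH.search(message):
        # Archived rather than deleted, so a real lead can be recovered on appeal.
        return ("archive", "vendor_pitch")
    return ("pass", "")
```

Returning a reason alongside the verdict makes it cheap to count how much each rule catches, and to spot a rule that has started rejecting real leads.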

The point of doing both before the AI runs is simple. Token spend is the only line on the bill that grows fast, and most junk is identifiable without it. Free gates first, paid gates only when the message has already proved it’s a real lead.

What this hands to the next post

By the time a lead leaves the intake, it’s in one shape, with a stable ID, no duplicates, no obvious spam, and a source tag the downstream code uses for analytics but not for branching. The next post covers what the qualifier does with that: how it reads the lead, extracts intent, urgency, and fit signals, and starts to decide which of the four moves applies.
