An applicant screener on AWS for a few dollars a month
You post one job and 300 resumes show up in a week. Most are a poor fit; a handful are exactly who you need; and the good ones are buried somewhere in the pile. A small team can’t read all of them carefully, so the careful reading gets skipped, and good people get missed. This post walks through the design of a small screener that reads every resume against the must-haves you wrote down, gives a clear yes/maybe/no with the reasons, and hands the strong matches to a human with a short summary. It never rejects anyone on its own. A person makes every call.
Key takeaways
- Three sources for applications: an apply inbox, a careers-form upload, and a job-board export.
- Every applicant ends in one of three piles: yes, maybe, or no — each with the reasons from the resume.
- It only judges against the job-related must-haves you write down. Name, age, gender, photo, home address, and school name are stripped first.
- A human makes every decision. The screener sorts and explains; it never rejects anyone on its own.
- Designed on AWS for about $2/month at typical small-business volume.
The whole system on one page
Before any code, here’s the shape of what we’re designing.
What you set up once (the outside)
- Applications. Three ways a resume gets in, all covered in Part 2. The first is an apply inbox — candidates email a resume to a dedicated address. The second is the upload box on your careers page. The third is an export from a job board you already post to. All three land as a resume file (PDF, Word, or plain text) plus a few facts like the role applied for and when it arrived.
- A role rubric. One short Google Doc per role in a Drive folder. It lists the must-haves — the things a person genuinely needs to do the job, each written as a plain, job-related requirement (“has run payroll for a team of 10+”, “holds a valid forklift license”, “can work Saturdays”). It lists the nice-to-haves. And it sets the pass marks: how many must-haves earn a yes, and how many earn a maybe. The doc also names the personal fields to strip before scoring — name, age, gender, photo, home address, and school name — so the screener can’t judge on them even by accident.
- Hiring manager. The person who reviews strong matches and makes every actual decision. They get a short summary per candidate, the resume, and the reasons each must-have was met or missed, with three buttons: advance, hold, and pass. The screener routes; the manager decides.
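To make the rubric concrete, here is a minimal sketch of how one role's doc might look once it has been parsed into Python. The `Rubric` class, its field names (`yes_at`, `maybe_at`, `strip_fields`), and the validation rule are assumptions for illustration, not the series' actual code; the must-haves and pass marks are the bookkeeper example used later in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rubric:
    """One role's screening rules, as they might look after parsing the doc."""
    role: str
    must_haves: tuple[str, ...]
    yes_at: int      # met must-haves needed for a "yes"
    maybe_at: int    # met must-haves needed for a "maybe"
    nice_to_haves: tuple[str, ...] = ()
    # Personal fields removed before scoring (hypothetical key names).
    strip_fields: tuple[str, ...] = (
        "name", "age", "gender", "photo", "home_address", "school_name",
    )

    def __post_init__(self):
        # Catch a mistyped rubric early instead of mislabeling applicants.
        if not self.must_haves:
            raise ValueError("a rubric needs at least one must-have")
        if not 1 <= self.maybe_at <= self.yes_at <= len(self.must_haves):
            raise ValueError(
                "pass marks must satisfy 1 <= maybe_at <= yes_at <= number of must-haves"
            )


bookkeeper = Rubric(
    role="Bookkeeper",
    must_haves=(
        "two years of bookkeeping",
        "has used Xero or QuickBooks",
        "can work in-office two days a week",
    ),
    yes_at=3,
    maybe_at=2,
)
```

Checking the pass marks when the rubric is loaded means a typo in the doc fails loudly once, rather than quietly shifting the cut-off for a whole hiring round.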
What runs on every application (the inside)
- The intake. Three sources feed one queue. Each resume is turned into plain text — a Word or text file reads straight through; a scanned PDF goes through Amazon Textract first to pull the words off the page. Then the intake strips the personal fields named in the rubric, so what reaches the reader is the work history and skills, not the name or the photo. A copy of the original is kept for the human; only the stripped text is scored.
- The reader. Runs once per application. It reads the stripped text and, for each must-have in the rubric, decides whether the resume meets it — quoting the exact line that does, or noting that it’s missing. This is the one place a model earns its keep: reading messy resume prose and matching it to a plain requirement. The reader does not pick the label. Plain Python counts the met must-haves against your pass marks and lands on yes, maybe, or no. The cut-off is yours, and it’s the same for every applicant.
- The router. Reads the label and decides where the candidate goes. A yes becomes a strong match: it gets a short plain summary and lands in the hiring manager’s review queue at the top. A maybe lands in the same queue, lower down. A no is parked in a reviewable list — not deleted, not auto-rejected — so a human can still look. Every routing decision, and the criteria behind it, is written to DynamoDB so the whole round can be audited later.
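The three steps above can be sketched in a few lines of plain Python. This is a sketch under assumptions: it supposes the intake has already parsed each application into a dict of named fields (stripping names out of free resume text is harder than dropping keys), and the function names, field names, and queue names are all hypothetical.

```python
STRIP_FIELDS = {"name", "age", "gender", "photo", "home_address", "school_name"}


def strip_personal_fields(application: dict, strip_fields: set) -> dict:
    """Return a copy for scoring; the original stays intact for the human."""
    return {k: v for k, v in application.items() if k not in strip_fields}


def label_for(met_count: int, yes_at: int, maybe_at: int) -> str:
    """Turn the count of met must-haves into a label using the rubric's pass marks."""
    if met_count >= yes_at:
        return "yes"
    if met_count >= maybe_at:
        return "maybe"
    return "no"


# Hypothetical queue names. Note there is no "rejected" destination:
# a "no" is parked where a human can still open it.
ROUTES = {
    "yes": {"queue": "manager_review", "position": "top"},
    "maybe": {"queue": "manager_review", "position": "lower"},
    "no": {"queue": "parked_reviewable", "position": "lower"},
}


def route(label: str) -> dict:
    """Look up where a labeled application goes next."""
    return ROUTES[label]


application = {
    "name": "(withheld)",
    "school_name": "(withheld)",
    "work_history": "Bookkeeper at Acme, 2021-2024",
    "skills": "QuickBooks Online",
}
for_scoring = strip_personal_fields(application, STRIP_FIELDS)
destination = route(label_for(met_count=2, yes_at=3, maybe_at=2))
```

Keeping the label and the route as plain lookups, with no model involved, is what makes the cut-off identical for every applicant.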
In plain words
You’re hiring a bookkeeper. The must-haves are: two years of bookkeeping, has used Xero or QuickBooks, and can work in-office two days a week. A resume comes in by email. The intake turns the PDF into text and strips the name, age, and the candidate’s photo. The reader checks each must-have: “Bookkeeper at Acme, 2021–2024” meets the first, “Reconciled accounts in QuickBooks Online” meets the second, and there’s no mention of in-office days — missing. Two of three met. Your rubric says two must-haves earns a maybe, three earns a yes. So this one is a maybe, with the two quotes and the one gap shown plainly. The hiring manager opens the queue, sees the maybe, reads the gap, and decides it’s worth a quick call to ask about in-office days. The screener didn’t reject anyone. It saved the manager from reading 300 resumes to find this one.
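The bookkeeper example can be written out as data to show how the label and its reasons come from the same findings. The `Finding` class and the output format are assumptions for illustration; the must-haves, quotes, and pass marks are the ones from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    must_have: str
    quote: Optional[str]  # exact resume line that meets it, or None if missing

    @property
    def met(self) -> bool:
        return self.quote is not None


findings = [
    Finding("two years of bookkeeping", "Bookkeeper at Acme, 2021–2024"),
    Finding("has used Xero or QuickBooks", "Reconciled accounts in QuickBooks Online"),
    Finding("can work in-office two days a week", None),
]

YES_AT, MAYBE_AT = 3, 2  # pass marks from the example rubric

met_count = sum(f.met for f in findings)
if met_count >= YES_AT:
    label = "yes"
elif met_count >= MAYBE_AT:
    label = "maybe"
else:
    label = "no"

# One reason per must-have: the quote that met it, or a note that it is missing.
reasons = [
    f'met: {f.must_have} ("{f.quote}")' if f.met else f"missing: {f.must_have}"
    for f in findings
]

print(f"{label} ({met_count} of {len(findings)} must-haves met)")
for line in reasons:
    print(" -", line)
```

Two quotes and one gap give a maybe, and the gap is exactly what the hiring manager asks about on the call.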
The cost of running this is about $2 a month at small-business volume. The cost of not running it is the strong candidate buried on page nine that nobody reached, or the slow, uneven reading that lets bias creep in when a tired human skims 300 resumes at 6pm.
Design rules that shaped every decision
- A human makes every decision. The screener sorts into yes/maybe/no and explains; it never rejects anyone on its own.
- It judges against the job-related must-haves only. Name, age, gender, photo, address, and school are stripped before scoring.
- Every label ships with reasons — the resume line that met each must-have, or a note that it’s missing. No black box.
- The pass marks are yours. The model reads; your rubric sets the cut-off, the same for every applicant.
- The rubric lives in Drive. Changing a must-have or a pass mark doesn’t need a deploy.
- Every decision is logged with the criteria used, so any hiring round can be audited later.
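The last two rules pull against each other a little: if the rubric can change in Drive at any time, the log has to record the criteria as they stood when the applicant was scored. One way to sketch that is a record that carries a snapshot of the must-haves and pass marks alongside the label. The `audit_record` function and every field name here are hypothetical, not the series' actual schema.

```python
import json
from datetime import datetime, timezone


def audit_record(application_id, role, label, findings, must_haves, yes_at, maybe_at):
    """Build one log entry that can be read later without consulting the live rubric."""
    return {
        "application_id": application_id,
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "role": role,
        "label": label,
        # Snapshot of the criteria in force at scoring time.
        "criteria": {
            "must_haves": list(must_haves),
            "yes_at": yes_at,
            "maybe_at": maybe_at,
        },
        # Per must-have: the quote that met it, or None if it was missing.
        "findings": findings,
    }


record = audit_record(
    application_id="app-0001",
    role="Bookkeeper",
    label="maybe",
    findings=[
        {"must_have": "two years of bookkeeping", "quote": "Bookkeeper at Acme, 2021-2024"},
        {"must_have": "can work in-office two days a week", "quote": None},
    ],
    must_haves=["two years of bookkeeping", "can work in-office two days a week"],
    yes_at=2,
    maybe_at=1,
)

# The record is plain JSON-compatible data, so it can be stored as a DynamoDB
# item (for example with boto3's Table.put_item(Item=record)); the table name
# and key design are left open here.
print(json.dumps(record, indent=2))
```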
Why this shape
Most small teams screen resumes one of two ways: somebody reads every one (which doesn’t scale past a few dozen and gets less careful as the pile grows), or somebody skims for a keyword and tosses the rest (which is fast, unfair, and misses good people who phrased things differently). Both get worse under pressure. The keyword skim is the more dangerous one — it feels rigorous and isn’t. It rewards resume buzzwords over real fit, and it quietly filters on whatever the skimmer’s eye lands on first, which is often a name or a school.
The setup above keeps the must-haves where the team already writes them — a doc — but adds a small system that reads every resume the same careful way, against the same job-related list, with the personal fields removed. It explains every call so a human can check it. It sends the strong matches up with a summary so the manager spends their time on the candidates worth time. And it never, ever closes the door on anyone by itself. The screener does the reading; the human does the deciding.
The next four posts walk through each piece in turn: how an application gets read, how an applicant gets scored, how a strong match reaches the hiring manager, and how a hiring decision actually gets made. Each post has one diagram. A cost breakdown and a final engineering reference come at the end.