Part 2 of 7 · Photo tagger series · ~4 min read

How a product photo gets read

The tagger only works on photos it can see, so the first job is getting each new photo in and getting it ready. A photo gets in one of two ways: somebody drops it in a Drive folder, or it’s uploaded straight to an S3 bucket. Either way, before any model is called, a small step shrinks the photo and runs plain quality checks. That step matters more than it sounds: it keeps you from paying a model to read a blurry mess, and it’s the cheapest place to catch a photo that should never have been sent.

Key takeaways

  • Two intake lanes feed one queue: a Drive folder and a direct S3 drop.
  • A drive-sync Lambda mirrors new Drive files to S3 every few minutes.
  • Every photo is resized to a small copy before anything else — the model doesn’t need the full file.
  • Plain quality checks (too dark, too blurry, too small) reject bad photos with no model call.
  • Only a photo that passes the checks moves on to the reader in Part 3.

Two lanes, then one resize-and-check

[Figure: two intake lanes and a resize-and-check step, converging on one ready-photo record]

  • Lane 1 · Drive folder: drop a photo in the Drive folder; drive-sync mirrors new files to S3 every few minutes; the S3 PUT event starts the work. Easiest for most small teams.
  • Lane 2 · S3 drop: upload straight to the S3 drop bucket; the same S3 PUT event starts the work, with no Drive in the path at all. For AWS-native shops.
  • Lane 3 · Resize and check (before any model): make a small copy of the photo; check that it is bright enough, sharp enough, large enough, and a sensible shape, using the thresholds from the rules doc. A fail is rejected here with a reason; a pass goes on to the reader.
  • Ready-photo record (passed the checks): photo id, source, small copy in S3, width, height, and quality result, stored as a DynamoDB row. The reader picks it up from here, one per photo.

The resize-and-check runs before any model: the cheapest place to drop a bad photo is before you pay to read it.
Fig 2. Two lanes converge on one resize-and-check step. The Drive lane and the S3-drop lane both end in the same S3 PUT event; the intake Lambda shrinks the photo and runs plain quality checks. Only a photo that passes becomes a ready-photo record for the reader.

Lane 1: the Drive folder

The simplest lane for most teams. Drop a photo into the shared Drive folder and walk away. A small Lambda — drive-sync — runs every few minutes, looks for files it hasn’t seen before, and copies each one to s3://pt-photo-drop/ using the Google Drive API (its credentials live in Secrets Manager). The copy into S3 fires an S3 PUT event, and that event is what actually starts the tagger. The Drive folder stays the place humans interact with; S3 is where the work happens.
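The "files it hasn’t seen before" step can be sketched roughly as below. This is a minimal illustration, not the actual drive-sync code: the function name `sync_new_files`, the `copy_to_s3` callback, and the listing shape (dicts with `id` and `name`) are all assumptions. A real Lambda would get the listing from the Google Drive API and keep the seen set somewhere durable between runs.

```python
from typing import Callable, Iterable


def sync_new_files(
    listing: Iterable[dict],
    seen_ids: set[str],
    copy_to_s3: Callable[[dict], None],
) -> list[str]:
    """Copy every file whose id is not in seen_ids; return the ids copied."""
    copied = []
    for f in listing:
        if f["id"] in seen_ids:
            continue  # already mirrored on an earlier run
        copy_to_s3(f)          # hypothetical: download from Drive, put to S3
        seen_ids.add(f["id"])  # mark as seen only after the copy succeeded
        copied.append(f["id"])
    return copied
```

Marking a file as seen only after the copy returns means a failed copy is retried on the next run instead of being silently skipped.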

This lane covers the common case: a team that photographs products on a phone or camera, dumps the shots into a shared Drive folder, and doesn’t want to learn anything new. They already use Drive; nothing changes for them.

Lane 2: the direct S3 drop

Some shops already upload images to AWS — their store pulls product images from an S3 bucket, or they have a build step that puts photos there. For them, the Drive hop is pointless. Lane 2 lets a photo be uploaded straight to the same s3://pt-photo-drop/ bucket (or a prefix in it), and the exact same S3 PUT event starts the work. There’s no second code path to maintain — both lanes end at the same event, so everything downstream is identical.

A shop can use one lane, the other, or both. A team might drop phone photos in Drive and have their web build push studio shots straight to S3; both kinds of photo flow through the same checks and the same reader.
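Because both lanes end at the same S3 PUT event, one handler can serve both. Here is a small sketch of pulling the bucket and key out of that event, using the standard S3 notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`). The function name `photos_from_event` is made up for illustration.

```python
from urllib.parse import unquote_plus


def photos_from_event(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) for each object-created record in an S3 event."""
    photos = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # ignore deletes and other event types
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes keys in notifications (a space arrives as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        photos.append((bucket, key))
    return photos
```

Decoding the key matters in practice: a phone photo named with spaces would otherwise fail to load when the intake step asks S3 for it.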

Lane 3: resize, then check — before any model

This is the step that earns its keep. When the S3 PUT event fires, a small intake Lambda loads the photo and does two things. First, it makes a smaller copy. A product photo straight off a phone can be several megabytes and many millions of pixels, and the model reads it just as well at a fraction of that size. The smaller copy is cheaper to send and faster to process, and it is what every later step uses; the original is kept untouched.
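The arithmetic behind "a smaller copy" is just a proportional scale. The sketch below computes the target size while keeping the aspect ratio; the `max_side` of 1024 pixels is an assumed value, not one from the rules doc, and the function name `fit_within` is hypothetical. An image library would then do the actual pixel resampling.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    # Round to whole pixels, and never let a side collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For a typical 4032 × 3024 phone photo this gives 1024 × 768, roughly one fifteenth of the original pixel count.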

Second, it runs plain quality checks against the thresholds in the rules doc. Is the photo bright enough, or is it nearly black? Is it sharp, or is it a blur? Is it large enough to be a real product shot, or a tiny thumbnail? Is the shape sensible, or is it a long thin banner that’s clearly not a product? These are simple measurements — no model, no AI, just arithmetic on the pixels. A photo that fails is rejected right here, with a reason written to its record (“too dark”), and it never reaches a model. Part 4 goes deeper on flagging; the point for now is that the obvious rejects are caught before anyone pays to read them.
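To show how little is involved, here is one way the four checks could look in plain Python. Everything specific here is an assumption: the threshold values are placeholders for whatever the rules doc says, the reason strings other than "too dark" are invented, and sharpness is measured with the variance of a Laplacian filter, a common blur heuristic that the series does not confirm it uses. `gray` is the photo's grayscale pixels as rows of 0-255 values.

```python
from statistics import mean, pvariance
from typing import Optional

# Hypothetical thresholds; the real ones live in the rules doc.
MIN_SIDE = 400          # pixels, shorter side
MAX_ASPECT = 3.0        # longer side divided by shorter side
MIN_BRIGHTNESS = 40.0   # mean gray level, 0-255
MIN_SHARPNESS = 50.0    # variance of the Laplacian


def laplacian_variance(gray: list[list[int]]) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels."""
    h, w = len(gray), len(gray[0])
    values = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(values) if values else 0.0


def check_photo(width: int, height: int, gray: list[list[int]]) -> Optional[str]:
    """Return a rejection reason, or None if the photo passes every check."""
    if min(width, height) < MIN_SIDE:
        return "too small"
    if max(width, height) / min(width, height) > MAX_ASPECT:
        return "odd shape"
    if mean(p for row in gray for p in row) < MIN_BRIGHTNESS:
        return "too dark"
    if laplacian_variance(gray) < MIN_SHARPNESS:
        return "too blurry"
    return None
```

The checks run cheapest first, so a thumbnail or a banner is rejected from its dimensions alone without looking at a single pixel.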

A photo that passes becomes a ready-photo record: the small copy in S3, plus a DynamoDB row with the photo id, where it came from, its size, and its quality result. That record is what the reader picks up in the next post.
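As a rough picture of that record, here is a sketch using a dataclass. The field names are guesses based on the fields listed above (photo id, source, small copy, size, quality result), not the actual table schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReadyPhoto:
    photo_id: str
    source: str          # hypothetical values: "drive" or "s3-drop"
    small_copy_key: str  # S3 key of the resized copy
    width: int
    height: int
    quality_result: str  # e.g. "pass"


def to_item(photo: ReadyPhoto) -> dict:
    """Plain dict of the record's fields, ready to be written as one row."""
    return asdict(photo)
```

With boto3, a dict like this could be passed as the `Item` argument of a table's `put_item` call.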

Why a record, not an immediate model call

The intake step doesn’t call the model itself — it writes a record and lets the next step pick it up. That’s deliberate. Photos arrive in bursts: a team uploads forty shots at once after a photoshoot. If each upload tried to call the model the instant it landed, a big batch could hit rate limits or pile up. Instead, ready-photo records go onto a queue, and the reader pulls them at a steady pace. A burst of forty photos becomes forty calm records that get read one after another, and a failure on one photo never blocks the rest.
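The "one after another, and one failure never blocks the rest" behaviour can be sketched with an in-memory queue. This is only an illustration of the pattern: the post doesn’t say which queue service is used, and `drain`, `read_photo`, and `pause_seconds` are hypothetical names.

```python
import time
from collections import deque
from typing import Callable


def drain(
    records: deque,
    read_photo: Callable[[dict], None],
    pause_seconds: float = 0.0,
) -> list[tuple[dict, str]]:
    """Read queued records one at a time; collect failures instead of stopping."""
    failures = []
    while records:
        record = records.popleft()
        try:
            read_photo(record)  # hypothetical: the reader from Part 3
        except Exception as exc:
            # Note the failure and keep going, so one bad photo
            # doesn't block the rest of the batch.
            failures.append((record, str(exc)))
        if pause_seconds:
            time.sleep(pause_seconds)  # crude pacing to stay under rate limits
    return failures
```

A managed queue would normally add retries and a dead-letter destination on top of this, but the shape is the same: the burst lands in the queue, and the reader sets the pace.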

Next post: how the reader takes a ready photo, calls Bedrock vision once, and drafts the five listing fields — title, alt text, tags, category, and description — each with a confidence score.
