Part 7 of 7 · Translation relay series ~8 min read

Engineering reference: the translation relay architecture

Same system, drawn for engineers. Region, service names, resource identifiers, Bedrock model IDs, Lambda inventory, IAM scopes, the SES inbound rule set, the SQS lanes, the DynamoDB schemas, and the masking pipeline. Read alongside the previous six posts; this one’s the build sheet.

Region and account shape

Default region: ap-southeast-1 (Singapore). SES inbound, Bedrock cross-Region inference, and Lambda Function URLs are all in good shape there, and it keeps round-trip latency low for an Asia-Pacific customer base; swap to whichever region is closest to your buyers. Bedrock calls go through the Global cross-Region inference profile, so capacity isn’t pinned to one region. One AWS account dedicated to the relay keeps the IAM blast radius small and lets a single AWS Budgets alarm cover the whole system. No VPC; every Lambda runs with default networking and reaches AWS service endpoints and the Google APIs over the public internet.

Topology

AWS topology of the translation relay A topology diagram with three regions stacked vertically inside one AWS account boundary. Top region: ingress. Three boxes show the intake paths — an SES inbound rule set tr-inbound-rules with action S3 PUT to s3://tr-raw-mime/ that triggers the intake-ses Lambda, a web widget that posts to the intake-web Lambda Function URL, and a drive-sync Lambda triggered every 15 minutes by EventBridge Scheduler that mirrors the glossary sheet and voice note to s3://tr-glossary-source/. All three feed the conversation thread store. Middle region: detect and translate-in. The detect Lambda reads the cleaned text, runs a cheap language check and a Haiku 4.5 fallback, writes the language to the thread, and enqueues to the tr-translate SQS queue with a dead-letter queue; the translate-in Lambda consumes the queue, masks protected terms, calls Bedrock Haiku 4.5, re-runs weak spans on Bedrock Sonnet 4.6, restores terms, and writes the staff-facing translation to DynamoDB. Bottom region: translate-back and send. The translate-back Lambda is triggered by the staff prepare action via a Function URL; it masks terms, translates the reply with Haiku and Sonnet, runs the round-trip check, restores terms, and stores the prepared reply. The send Lambda is triggered when a human approves; it delivers via SES outbound for email threads or returns the reply to the web widget, and writes to tr-threads and tr-audit. CloudWatch Logs collects from every Lambda at 7-day retention. Across the right edge: a small box labelled AWS Budgets alarm at $25 monthly threshold, posting to SNS topic tr-cost-alarm. A note at the bottom: a person always approves the send — and every translation and edit is logged to tr-audit. Ingress SES inbound rule set tr-inbound-rules action: S3 PUT s3://tr-raw-mime/ trigger: intake-ses Web widget posts to Function URL intake-web Lambda carries thread id CORS to your site Lambda · drive-sync every 15 min Sheets API → s3://tr-glossary-source/ glossary + voice Conversation thread store tr-threads · both languages kept Detect & translate-in Lambda · detect clean + read language cheap pass + Haiku writes lang to thread enqueue tr-translate SQS · tr-translate decouples bursts from model calls dead-letter queue tr-translate-dlq Lambda · translate-in mask terms Haiku 4.5 translate Sonnet 4.6 weak spans restore → thread Translate-back & send Lambda · translate-back staff prepare action, mask + Haiku/Sonnet, round-trip check, restore terms Staff review UI reply + read-back + restored figures approve click → Function URL Lambda · send SES outbound for email, widget for chat; writes tr-threads, tr-audit A person always approves the send — every translation and edit is logged to tr-audit.
Fig 7. AWS topology, in three regions of the diagram: ingress (two channels in plus the glossary sync), detect and translate-in (the queue decouples bursts from model calls), translate-back and send (the human approves and the reply ships). Every Lambda is event-, queue-, or request-driven; nothing is synchronous-chained across model calls.

Lambda functions

All Lambdas use the arm64 architecture, the smallest memory size that meets latency targets (typically 256 MB, 512 MB for the model-calling functions), Python 3.14 runtime, and CloudWatch Logs at 7-day retention. Each function has its own least-privilege IAM role. None run inside a VPC.

  • intake-ses — S3 PUT trigger on s3://tr-raw-mime/. Parses the MIME, extracts the plain-text body (falls back to HTML stripped to text), strips the signature, quoted history, and footers, resolves the thread by sender + normalized subject, and writes the cleaned message to tr-threads. Then invokes detect asynchronously. Memory: 256 MB. Timeout: 30 s.
  • intake-web — Lambda Function URL, AuthType: NONE with a per-site signed widget token verified in-handler and CORS locked to your domains. Accepts {thread_id, text} from the chat widget, applies the same cleanup, writes to tr-threads, and invokes detect. Memory: 256 MB. Timeout: 15 s.
  • drive-sync — EventBridge Scheduler target, every 15 minutes. Uses the Google Drive + Sheets API (service-account credentials in Secrets Manager under tr/drive/sa) to export the glossary sheet as CSV and the voice note as text, writing to s3://tr-glossary-source/ only when changed. Memory: 256 MB. Timeout: 30 s.
  • detect — invoked by the intake Lambdas. Runs a cheap script-and-statistics language check; for short, mixed, or code-heavy text, calls Bedrock Haiku 4.5 as a fallback. Writes language and language_confidence to the thread. If confidence stays below threshold, marks the turn language_unclear for human review instead of enqueuing. Otherwise sends a message to the tr-translate SQS queue. Memory: 512 MB. Timeout: 30 s.
  • translate-in — SQS event source on tr-translate (batch size 1, partial-batch responses on). Loads the glossary from s3://tr-glossary-source/, masks protected terms, numbers, prices, and IDs (Part 5), calls Bedrock Haiku 4.5 for the per-sentence translation + confidence, re-runs sub-threshold passages on Bedrock Sonnet 4.6, restores the masked spans, and writes the staff-facing translation and per-passage confidence to tr-threads. Logs each mask/restore swap to tr-audit. Memory: 512 MB. Timeout: 60 s.
  • translate-back — Lambda Function URL, invoked by the staff prepare action (Slack-style signed request or the internal review UI session). Masks the reply, translates into the thread’s customer language with Haiku 4.5 + Sonnet 4.6, runs the round-trip check (translate the result back to the staff language), restores terms, and stores the prepared reply + read-back on the thread. Does not send. Memory: 512 MB. Timeout: 60 s.
  • send — Lambda Function URL, invoked only by the human approve action. Verifies the prepared-reply hash matches what the agent approved (so an edit-after-prepare can’t slip through unreviewed), then delivers: SES SendRawEmail for an email thread, or returns the reply to the widget poll for a chat thread. Writes the final reply (both languages) to tr-threads and an action: sent row to tr-audit. Memory: 256 MB. Timeout: 15 s.
  • summary — EventBridge Scheduler target, weekly. Reads the week’s tr-threads and tr-audit; calls Bedrock Haiku 4.5 to write a short report (volume by language, share of passages that needed Sonnet, count of human-flagged messages); emails it via SES to the configured stakeholder list. Memory: 512 MB.

Storage

  • DynamoDB · tr-threads — one item per conversation turn. PK thread_id; sort key turn_ts; attributes: channel (email/web), direction (in/out), customer_lang, team_lang, original_text, translated_text, passage_confidence (list), status. On-demand.
  • DynamoDB · tr-swaps — one row per masked span. PK (thread_id, turn_ts); sort key placeholder; attributes: kind (glossary/number/id), original, restored. On-demand. Proves no figure changed in translation.
  • DynamoDB · tr-audit — one row per action of any kind (translate, prepare, edit, approve, send, flag). PK (thread_id, ts); attributes: action, by_user, model, before, after. On-demand. No TTL — long-term audit trail.
  • S3 · tr-glossary-source — mirrored glossary CSV and voice note as plain text. Versioning enabled, so a bad glossary edit rolls back in one click.
  • S3 · tr-raw-mime — raw inbound email MIME. Lifecycle to Glacier at 30 days; expiry at 2 years.

Bedrock

  • Cheap path. anthropic.claude-haiku-4-5-20251001-v1:0 via the Global cross-Region inference profile global.anthropic.claude-haiku-4-5-20251001-v1:0. Callsites: detect fallback, translate-in, translate-back, the round-trip check, and summary.
  • Heavy path. anthropic.claude-sonnet-4-6-20250115-v1:0 via global.anthropic.claude-sonnet-4-6-20250115-v1:0, called only on sub-threshold passages from translate-in and translate-back — never on a whole message.
  • Embeddings. Not used. The relay translates; it doesn’t retrieve. No Knowledge Base, no S3 Vectors. (Titan Text Embeddings V2 would be the choice if a future phrase-memory cache needed semantic lookup, but exact-match caching covers the common case at lower cost.)
  • Prompts. Strict and short: preserve placeholders verbatim, translate faithfully and plainly, return per-sentence confidence as JSON. Temperature near zero for determinism.

SQS lanes

  • tr-translate — standard queue between detect and translate-in. Decouples a burst of inbound messages from the rate of model calls, so a spike never throttles Bedrock or drops a message. Visibility timeout 90 s (six times the consumer’s typical runtime). Max receive count 3.
  • tr-translate-dlq — dead-letter queue for tr-translate. Anything that fails three times lands here with the full context for inspection; a CloudWatch alarm on queue depth > 0 pages the on-call admin. Most DLQ entries are a malformed message or a transient Bedrock throttle, both safe to replay.

SES inbound and outbound

  • Set the MX record on a dedicated subdomain (e.g. support.your-company.com) to inbound-smtp.ap-southeast-1.amazonaws.com.
  • SES inbound rule set tr-inbound-rules: one rule with recipient support@your-company.com → spam scan → S3 PUT to s3://tr-raw-mime/<message-id> → stop. The S3 PUT triggers intake-ses.
  • SES outbound for replies: verify a sender identity at support@your-company.com with DKIM and SPF on the parent domain, and set a custom Reply-To so the customer’s next message threads back in. Out of sandbox by request.

IAM (least privilege per Lambda)

Each Lambda has its own role with policies scoped to exact ARNs. Sketch:

  • detect role: dynamodb:UpdateItem on tr-threads; sqs:SendMessage on tr-translate; bedrock:InvokeModel on the Haiku ARN only. No Sonnet, no SES.
  • translate-in role: sqs:ReceiveMessage + DeleteMessage on tr-translate; s3:GetObject on the glossary bucket; bedrock:InvokeModel on the Haiku and Sonnet ARNs; dynamodb:PutItem on tr-threads, tr-swaps, tr-audit.
  • translate-back role: same Bedrock + glossary access as translate-in; dynamodb:PutItem on tr-threads, tr-swaps, tr-audit. No SES — it cannot send, only prepare.
  • send role: ses:SendRawEmail from the verified sender identity; dynamodb:PutItem on tr-threads and tr-audit; dynamodb:GetItem to verify the approved-reply hash. No bedrock:* — the send path never calls a model.
  • drive-sync role: secretsmanager:GetSecretValue on the Google service-account secret; s3:PutObject on the glossary bucket; outbound network to www.googleapis.com.

Masking pipeline

The mask/restore code is a shared library imported by translate-in and translate-back, so both directions behave identically. Order matters: IDs and codes are matched first (most specific), then currency and number patterns, then glossary terms (longest match first to avoid partial hits). Each match is replaced by a typed placeholder — [[ID_1]], [[NUM_2]], [[TERM_3]] — and recorded in an in-memory map. After the model returns, the restorer validates that every placeholder it emitted still exists exactly once (a dropped or duplicated placeholder fails the turn and routes to human review), then swaps each back to the original value and writes the swap to tr-swaps. Number and ID patterns are unit-tested against a fixture of locale edge cases (comma/period decimals, hash-prefixed orders, alphanumeric SKUs) so a regex change can’t silently weaken the guardrail.

Observability and cost gates

  • CloudWatch Logs: all Lambdas, 7-day retention, structured JSON. Subscription filter on "error" + "throttle" + "placeholder_mismatch" to a CloudWatch metric for alerting.
  • Alarms: tr-translate-dlq depth > 0; translate-in error rate > 1% in 24h; placeholder_mismatch > 0 (a masking bug is a money bug); Bedrock throttle count rising.
  • X-Ray: off by default. Not worth the cost at SMB volume.
  • AWS Budgets: $25/month threshold, alarm at 80% and 100%, posts to SNS topic tr-cost-alarm subscribed to the on-call admin’s email.

Config and secrets

Google service-account credentials for Drive and Sheets live in Secrets Manager under tr/drive/sa. The widget signing key is under tr/widget/key; the SES sender identity lives in IAM and the verified-domain config. The team’s working language, the confidence thresholds for Sonnet escalation and for human-flagging, the list of glossary categories, and the customer-facing “from” name all live in Parameter Store under /tr/config/. Lambdas fetch config on cold start and cache it for the lifetime of the execution environment.

Deploy

GitHub Actions with OIDC into a deploy role (no long-lived keys) and AWS SAM for the stack. The opinionated bits: deploy the SES rule set as a separate stack (rule-set changes affect mail flow), turn on S3 versioning for tr-glossary-source so a bad glossary edit can be rolled back in one click, and gate deploys on the masking library’s unit tests passing — that test suite is the thing standing between a regex tweak and a changed price in a customer’s inbox. Total deployable surface: around eight Lambdas, three DynamoDB tables, two S3 buckets, one SQS queue plus its DLQ, one SES rule set, a few Function URLs, and one Budgets alarm.

That’s the full system. Six narrative posts and this engineering reference. If you want to talk about adapting it for your business, see Work with me.

All posts