How risky clauses get flagged

Key takeaways

A clause search finds the paragraphs that match the always-flag list, so the model reads only a few.
Sonnet 4.6 — the stronger model — explains each flagged clause: what it says and why it matters.
Every flag quotes the exact clause. A flag with no quote is dropped, same rule as the term pull.
The model says what to look at, never what to do. It is not advice and the wording stays that way.
A high-stakes clause trips the last gate: the flag says “run this past a lawyer before signing.”

Four gates on every flag

Fig 4. Four gates between a clause and a flag. Search the clauses for the topics that matter. Read each with the stronger model. Check every note quotes real text. Score the stakes and, where they’re high, say to call a lawyer. Only flags that pass all four reach the summary.

Gate 1: clause search

The system doesn’t hand the whole contract to the stronger model — that would be slow and costly, and most clauses are boilerplate. Instead it searches. Each topic on the always-flag list (auto-renewal, penalties, liability, personal guarantees, exclusivity, termination) is turned into an embedding — a list of numbers that captures its meaning — using Amazon Titan Text Embeddings V2. Every clause from the intake is embedded the same way. A search in Amazon S3 Vectors, AWS’s built-in vector store, matches each topic to the clauses that mean the same thing, even when the contract uses different words (“evergreen term” instead of “auto-renewal,” “limitation of liability” instead of “liability cap”).

The result is a short list: the handful of clauses that actually concern the topics you care about. Those are the only clauses the stronger model reads. A forty-page contract might surface six or eight clauses worth a look; the rest never reach Gate 2.

Gate 2: the Sonnet read

Now the stronger model goes to work, but only on those few clauses. Bedrock Sonnet 4.6 reads each matched clause and writes a short plain-English note: what the clause says, and why it matters to you, in two or three sentences. For an auto-renewal clause: “This contract renews itself for another 24 months unless you cancel between 90 and 60 days before it ends. Miss that window and you’re locked in for two more years.” For a liability cap: “If something goes wrong, the most you could claim back from the vendor is one month’s fees — about $1,450 — no matter how big the actual loss.”

Sonnet is used here, and only here, because this is the reading that needs judgment. Spotting that a notice window is unusually short, or that a liability cap is low relative to what’s at stake, is exactly the kind of between-the-lines reading the cheap model can’t be trusted with. There are only a few clauses per contract, so the stronger model’s cost stays small — the math is in Part 6.

Gate 3: the quote check

Same rule as the term pull, applied to the flags: every note has to quote the exact text of the clause it’s about. After the Sonnet read, a plain Python check confirms the quoted text actually appears in that clause. If a note quotes nothing, or quotes words that aren’t in the clause, the whole flag is dropped — not shown with a warning, dropped. A flag you can’t trace back to the contract is worse than no flag, because it teaches the reader to doubt the ones that are real. The quote is what makes a flag checkable: the owner can read the clause for themselves, right there in the summary, and see the system isn’t making it up.

Gate 4: the escalate check

The last gate decides when a flag should send you to a lawyer. The rules doc holds the thresholds in plain prose: any personal guarantee at all, any liability cap above the contract’s value, any auto-renewal with a notice window shorter than 30 days, any clause that hands away exclusivity or your data. Each kept flag is scored against those thresholds. If it crosses one, the flag gets an extra line: “This is high-stakes — run it past a lawyer before you sign.”

This is the system knowing its own limits. It will happily tell you what an indemnity clause says and why it matters. It will not tell you whether to accept it, negotiate it, or walk away — that is legal advice, and the system doesn’t give legal advice. When the stakes are high enough that the answer really matters, it does the one responsible thing it can: it points you at a human who is allowed to answer. Part 5 makes that boundary — grounded, never advice, always a human in the loop — into the rule the whole summary lives by.

Next post: how the whole summary stays grounded in the actual text, the not-legal-advice banner, and the approval step before anything is filed or sent.

All posts