The legal-document engine
LegalWork's actual legal work is defined as firm-owned agent prompts, not app code. This path walks the two flagship workflows — Word redlining and tabular review — from the user command down to the read-only subagents that do the work.
Extraction rules (these are the quality bar)
- Ground every value in the text. For each column, find the passage that answers it
and quote it in
quote. If you cannot point to a quote, the value must be"Not found". - Never guess or infer beyond the document. Better to return
"Not found"than a plausible hallucination. A wrong value in a review grid is worse than a blank one. valueis SHORT — it goes in a table cell. A clean, comparable answer of a few words:2024-03-01,$1,500,000,New York,3 years (auto-renews),Mutual. No sentences, no explanation — those go inreason.reasonis the LONGER explanation, shown only in the detail sidebar. 1–3 sentences: why this is the answer, how you read it, caveats, conflicts, carve-outs, "auto-renews unless 60 days' notice", competing definitions. This is where a reviewing lawyer looks when the short value isn't enough. Keep the cell terse and put the nuance here.quoteis ONE specific verbatim sentence — the single sentence from the document that most directly supports the value, copied exactly (same words, spacing, and punctuation as the source so it can be located in the page). Not a paragraph, not a paraphrase. One sentence. If the support is a short clause, quote that clause exactly.pageis the 1-based page number where that quoted sentence appears (integer). This drives the PDF page preview. If you cannot determine the page, usenull.locationis a human-checkable pointer:§7.2,Recitals,Signature page,Schedule A. Approximate is fine; empty only when the value is"Not found".- Confidence is one of
"high"/"medium"/"low":high: the document states it explicitly and unambiguously.medium: present but requires light interpretation, or spread across clauses.low: ambiguous, conflicting, or barely supported.
- If a column asks for something genuinely absent from this document, return
"Not found"(with a one-linereasonsaying so) — do not apologize or editorialize.
To find the page number reliably, prefer extracting text with page markers, e.g.
pdftotext -layout "<file>" - | ... keeps layout; or run pdftotext -f N -l N to confirm
a sentence is on page N. Count pages from 1.