The LegalWork Atlas: LegalWork's documentation, bound to the code it describes

The legal-document engine

LegalWork's actual legal work is defined as firm-owned agent prompts, not app code. This path walks the two flagship workflows — Word redlining and tabular review — from the user command down to the read-only subagents that do the work.

Extraction rules (these are the quality bar)

  • Ground every value in the text. For each column, find the passage that answers it and quote it in the quote field. If you cannot point to a quote, the value must be "Not found".
  • Never guess or infer beyond the document. Better to return "Not found" than a plausible hallucination. A wrong value in a review grid is worse than a blank one.
  • value is SHORT — it goes in a table cell. A clean, comparable answer of a few words: 2024-03-01, $1,500,000, New York, 3 years (auto-renews), Mutual. No sentences, no explanation — those go in reason.
  • reason is the LONGER explanation, shown only in the detail sidebar. 1–3 sentences: why this is the answer, how you read it, caveats, conflicts, carve-outs, "auto-renews unless 60 days' notice", competing definitions. This is where a reviewing lawyer looks when the short value isn't enough. Keep the cell terse and put the nuance here.
  • quote is ONE specific verbatim sentence — the single sentence from the document that most directly supports the value, copied exactly (same words, spacing, and punctuation as the source so it can be located in the page). Not a paragraph, not a paraphrase. One sentence. If the support is a short clause, quote that clause exactly.
  • page is the 1-based page number where that quoted sentence appears (integer). This drives the PDF page preview. If you cannot determine the page, use null.
  • location is a human-checkable pointer: §7.2, Recitals, Signature page, Schedule A. Approximate is fine; empty only when the value is "Not found".
  • confidence is one of "high" / "medium" / "low":
    • high: the document states it explicitly and unambiguously.
    • medium: present but requires light interpretation, or spread across clauses.
    • low: ambiguous, conflicting, or barely supported.
  • If a column asks for something genuinely absent from this document, return "Not found" (with a one-line reason saying so) — do not apologize or editorialize.
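
The rules above can be read as a small schema plus a few mechanical checks. The following is a minimal sketch, not LegalWork's actual code: the Cell class, the check_cell function, and the MAX_VALUE_WORDS threshold are hypothetical names and values chosen for illustration, and the sample contract sentence in the usage note is made up. It only checks the rules that can be verified without reading the document (field presence, page numbering, the confidence vocabulary); grounding itself still has to be checked against the source text.

```python
from dataclasses import dataclass
from typing import Optional

NOT_FOUND = "Not found"
CONFIDENCE_LEVELS = {"high", "medium", "low"}
MAX_VALUE_WORDS = 8  # assumption: the rules only say "a few words"


@dataclass
class Cell:
    """One review-grid cell, using the field names from the rules (hypothetical class)."""
    value: str            # short answer shown in the table cell
    reason: str           # 1-3 sentence explanation shown in the detail sidebar
    quote: str            # one verbatim supporting sentence, or "" when not found
    page: Optional[int]   # 1-based page of the quote, or None (null) if unknown
    location: str         # human-checkable pointer such as "§7.2"
    confidence: str       # "high", "medium", or "low"


def check_cell(cell: Cell) -> list[str]:
    """Return a list of rule violations; an empty list means the cell passes."""
    problems = []
    found = cell.value != NOT_FOUND
    if found and not cell.quote.strip():
        problems.append('no quote: the value must be "Not found"')
    if found and not cell.location.strip():
        problems.append('location may be empty only when the value is "Not found"')
    if not cell.reason.strip():
        problems.append("reason is missing")
    if cell.page is not None and cell.page < 1:
        problems.append("page must be a 1-based integer or null")
    if cell.confidence not in CONFIDENCE_LEVELS:
        problems.append("confidence must be high, medium, or low")
    if "\n" in cell.value or len(cell.value.split()) > MAX_VALUE_WORDS:
        problems.append("value is too long for a table cell")
    return problems
```

For instance, Cell(value="New York", reason="Section 12.1 names New York law.", quote="This Agreement is governed by the laws of the State of New York.", page=14, location="§12.1", confidence="high") passes with no violations, while the same value with an empty quote and empty location is flagged twice.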

To find the page number reliably, prefer extracting text with page markers. For example, pdftotext -layout "<file>" - | ... preserves the layout; alternatively, run pdftotext -f N -l N to confirm that a sentence is on page N. Count pages from 1.
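
One way to turn the pdftotext output into a 1-based page number is sketched below. It relies on pdftotext's default behavior of ending each page with a form-feed character, which serves as the page marker. The function names (split_pages, find_page, extract_text) are hypothetical, and collapsing whitespace before matching is an assumption made here because -layout pads and wraps lines; it is not something the rules prescribe. When the quote cannot be located on a single page, the sketch returns None, matching the rule that an undetermined page is null.

```python
import subprocess
from typing import Optional


def split_pages(text: str) -> list[str]:
    """Split pdftotext output into pages on the form-feed page break."""
    pages = text.split("\f")
    # The last page also ends with a form feed, which leaves an empty tail.
    if pages and not pages[-1].strip():
        pages.pop()
    return pages


def normalize(s: str) -> str:
    """Collapse runs of whitespace so layout padding and line wraps do not block a match."""
    return " ".join(s.split())


def find_page(text: str, quote: str) -> Optional[int]:
    """Return the 1-based page containing the quote, or None if it is not found."""
    needle = normalize(quote)
    if not needle:
        return None
    for number, page in enumerate(split_pages(text), start=1):
        if needle in normalize(page):
            return number
    return None


def extract_text(pdf_path: str) -> str:
    """Run pdftotext with -layout and return its stdout (requires pdftotext on PATH)."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, text=True, encoding="utf-8", check=True,
    )
    return result.stdout
```

A caller would run find_page(extract_text(path), cell_quote) and store the result in page. A quote that straddles a page break is not matched by this sketch and comes back as None.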