Research
Document Understanding
Making long, visual, and semi-structured documents usable.
Important business context often lives in PDFs, statements, filings, contracts, scans, and long notes. Document understanding studies how to parse that material while keeping structure, layout, and provenance intact.
Core Questions
How should a system represent pages, sections, tables, images, spans, and extracted facts together?
What should be preserved from layout and visual structure before a model summarizes or reasons?
How do we make document-derived outputs reviewable by operators and advisors?
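One way to make these questions concrete is a small data model in which every extracted fact carries the page spans that support it. The sketch below is illustrative only: the names `Span`, `Fact`, `Document`, `locate`, `quote`, and `unsupported` are hypothetical, the sample page text is made up, and this is not a description of how DocUnderstand is implemented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """A character range on one page; the unit of provenance (hypothetical)."""
    page: int   # 1-based page number
    start: int  # inclusive character offset into the page text
    end: int    # exclusive character offset


@dataclass(frozen=True)
class Fact:
    """An extracted value plus the spans offered as its evidence."""
    name: str
    value: str
    evidence: tuple[Span, ...]


@dataclass
class Document:
    pages: dict[int, str] = field(default_factory=dict)  # page number -> text
    facts: list[Fact] = field(default_factory=list)

    def locate(self, page: int, needle: str) -> Span:
        """Return the span of the first occurrence of needle on a page."""
        start = self.pages[page].index(needle)  # raises ValueError if absent
        return Span(page, start, start + len(needle))

    def quote(self, span: Span) -> str:
        """Recover the exact source text a span points at."""
        return self.pages[span.page][span.start:span.end]

    def unsupported(self) -> list[Fact]:
        """Facts whose value appears in none of their evidence spans."""
        return [f for f in self.facts
                if not any(f.value in self.quote(s) for s in f.evidence)]


# Made-up sample pages, for illustration only.
doc = Document(pages={
    1: "Statement date: 2024-03-31",
    2: "Closing balance: 1,250.00 USD",
})
doc.facts.append(Fact("closing_balance", "1,250.00",
                      (doc.locate(2, "Closing balance: 1,250.00 USD"),)))
# This value deliberately disagrees with its cited span.
doc.facts.append(Fact("statement_date", "2024-04-30",
                      (doc.locate(1, "Statement date: 2024-03-31"),)))

flagged = [f.name for f in doc.unsupported()]
```

Because each fact resolves back to an exact quote, a reviewer can check a value against its source, and a simple consistency pass like `unsupported` can surface facts whose cited text does not contain the stated value.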
Artifacts
DocUnderstand experiments.
Document parsing and provenance notes.
Research reads on graph-aware document extraction.
What It Means
Where It Shows Up
Evidence-Backed Records
Parse long-form and semi-structured documents into evidence-backed records.
Source-Linked Review
Support source-linked review for compliance, finance, and operating decisions.
Document-To-Graph Inputs
Feed extraction and graph workflows without losing page-level context.
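As a rough sketch of what keeping page-level context can look like when records become graph inputs, the example below merges repeated relations into a single edge while retaining every page that mentions them. The record shape, the `to_edges` helper, and the sample rows are all assumptions made for illustration, not an existing pipeline.

```python
from collections import defaultdict

# Hypothetical extracted records: one row per mention, each tagged with its page.
records = [
    {"subject": "Acme Ltd", "relation": "party_to", "object": "Supply Agreement", "page": 1},
    {"subject": "Acme Ltd", "relation": "party_to", "object": "Supply Agreement", "page": 4},
    {"subject": "Supply Agreement", "relation": "governed_by", "object": "English law", "page": 7},
]


def to_edges(rows):
    """Collapse duplicate (subject, relation, object) triples into one edge,
    keeping the sorted list of pages on which each triple was found."""
    pages_by_edge = defaultdict(set)
    for row in rows:
        key = (row["subject"], row["relation"], row["object"])
        pages_by_edge[key].add(row["page"])
    return {key: sorted(pages) for key, pages in pages_by_edge.items()}


edges = to_edges(records)
```

Storing the page list on the edge means a downstream graph query can still point a reviewer to the pages that justify a relationship.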
Why It Matters
How This Research Gets Used
Applied
Product direction
Research themes shape product workflows, internal evaluation, and open-source implementation choices.
Evidence
Reviewable decisions
The work emphasizes assumptions, provenance, and feedback loops that humans can inspect.