Curate Labs

February 6, 2026 | External research | Document relation extraction | Long tail

External research: Curate Labs did not author this paper.

The paper "DOREMI: Optimizing Long Tail Predictions in Document-Level Relation Extraction" targets a practical failure mode: relation extractors often perform acceptably on frequent relations and poorly on rare ones.

Instead of proposing a new base relation model, DOREMI iteratively selects informative distant-supervision examples for targeted manual annotation, then retrains models with better-tailored data.
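To make the shape of such a loop concrete, here is a minimal sketch of one select, annotate, retrain cycle. It is not the paper's algorithm: the priority score (rarity times uncertainty), the `Candidate` fields, and the `annotate` and `retrain` callables are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    """A distantly supervised example (all fields are illustrative assumptions)."""
    doc_id: str
    head: str
    tail: str
    distant_label: str   # noisy relation label from distant supervision
    confidence: float    # current model's confidence in that label, in [0, 1]


def select_for_annotation(pool, labeled_counts, budget):
    """Pick up to `budget` candidates, favouring rare relations and uncertain labels.

    Assumed heuristic, not taken from the paper:
    priority = rarity * uncertainty.
    """
    def priority(c):
        rarity = 1.0 / (1 + labeled_counts.get(c.distant_label, 0))
        uncertainty = 1.0 - abs(2 * c.confidence - 1)  # highest at confidence 0.5
        return rarity * uncertainty

    return sorted(pool, key=priority, reverse=True)[:budget]


def annotation_loop(pool, labeled_counts, annotate, retrain, rounds, budget):
    """Run several rounds of targeted annotation.

    `annotate(candidate)` returns the human-verified relation label.
    `retrain(labeled, pool)` returns the remaining pool re-scored by the
    retrained model. Both are hypothetical hooks supplied by the caller.
    """
    pool = list(pool)
    counts = Counter(labeled_counts)
    labeled = []
    for _ in range(rounds):
        batch = select_for_annotation(pool, counts, budget)
        if not batch:
            break
        for c in batch:
            gold = annotate(c)          # human decision; may differ from distant_label
            labeled.append((c, gold))
            counts[gold] += 1
        picked = {id(c) for c in batch}
        pool = [c for c in pool if id(c) not in picked]
        pool = retrain(labeled, pool)   # refresh confidences for the next round
    return labeled, counts
```

In this sketch, a relation with only a handful of verified examples outranks a frequent one even when the model is fairly confident, which is one simple way to steer a fixed annotation budget toward the tail.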

Why we're excited

Long-tail relation extraction is where many systems fail in production. The rare relations may be exactly the ones that matter in compliance, intelligence, science, law, or audit workflows.
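One reason these failures go unnoticed is that aggregate scores are dominated by frequent relations. The sketch below, which assumes facts are represented as `(doc_id, head, tail, relation)` tuples and uses made-up relation names, compares a micro-averaged F1 with a per-relation breakdown.

```python
from collections import Counter


def per_relation_f1(gold, pred):
    """Return {relation: F1} for sets of (doc_id, head, tail, relation) facts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for fact in pred:
        if fact in gold:
            tp[fact[3]] += 1
        else:
            fp[fact[3]] += 1
    for fact in gold - pred:
        fn[fact[3]] += 1

    scores = {}
    for rel in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[rel] + fp[rel] + fn[rel]
        scores[rel] = 2 * tp[rel] / denom if denom else 0.0
    return scores


def micro_f1(gold, pred):
    """F1 over all facts pooled together, regardless of relation."""
    denom = len(gold) + len(pred)
    return 2 * len(gold & pred) / denom if denom else 0.0


# Toy data: 8 facts of a frequent relation, 2 of a rare one.
gold = {(f"d{i}", "h", "t", "born_in") for i in range(8)}
gold |= {(f"d{i}", "h", "t", "founded_by") for i in range(8, 10)}
# The extractor recovers every frequent fact and none of the rare ones.
pred = {fact for fact in gold if fact[3] == "born_in"}

scores = per_relation_f1(gold, pred)
macro = sum(scores.values()) / len(scores)
print(f"micro F1: {micro_f1(gold, pred):.2f}")
print(f"macro F1: {macro:.2f}")
print(scores)
```

In this toy setup the pooled score looks healthy while the rare relation is missed entirely, which is why a per-relation view is worth reporting alongside any aggregate.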

DOREMI is useful because it treats annotation as a scarce resource to allocate deliberately, rather than a uniform labeling burden.

Our community read

The paper is less flashy than a new architecture and more operationally relevant. It says: if rare relations matter, build an annotation loop aimed at rare-relation recovery.

The limitation is that the approach still requires human annotation. That is less a weakness than an honest accounting of what reliability costs.

Source

arXiv: 2601.11190

Community Reading: DOREMI for Long-Tail DocRE
