Curate Labs

January 23, 2024 | External research | Information extraction | Graph data

External research: Curate Labs did not author this paper.

"An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction" (ATG) recasts joint entity and relation extraction as a graph generation problem. Instead of extracting entities first and classifying relations afterwards, ATG generates a linearized graph: text spans become nodes, relations become edges, and an encoder-decoder model emits the structured output directly.
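To make the linearization idea concrete, here is a minimal sketch of flattening a small entity-relation graph into a single target sequence. The layout (typed nodes first, a separator, then head/tail/label triples), the `<SEP>` token, and the names `Span`, `Relation`, and `linearize` are illustrative assumptions, not the paper's exact format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive token index
    end: int    # inclusive token index
    label: str  # entity type


@dataclass(frozen=True)
class Relation:
    head: Span
    tail: Span
    label: str


SEP = "<SEP>"  # hypothetical separator between the node and edge sections


def linearize(spans, relations):
    """Flatten a graph into one sequence: nodes first, then edge triples."""
    # Sort nodes by position so the target order is deterministic.
    nodes = sorted(spans, key=lambda s: (s.start, s.end))
    seq = [(s.start, s.end, s.label) for s in nodes]
    seq.append(SEP)
    for r in relations:
        # Each edge points at spans by position rather than copying their text.
        seq.append((r.head.start, r.head.end))
        seq.append((r.tail.start, r.tail.end))
        seq.append(r.label)
    return seq


tokens = ["Alain", "Farley", "works", "at", "McGill", "University"]
per = Span(0, 1, "PER")
org = Span(4, 5, "ORG")
graph = linearize([org, per], [Relation(per, org, "Work_For")])
print(graph)
```

Every element of the sequence is either a span position, a label, or a special token, so a decoder trained on such targets never has to spell out entity text token by token.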

The design is interesting because it gives the model a constrained output space. The dynamic vocabulary contains source spans, relation labels, and special graph tokens, while a pointing mechanism grounds predictions back in the text. That matters for extraction systems because hallucinated or ungrounded nodes are often more damaging than ordinary classification errors.
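The constrained output space can be sketched as a per-sentence vocabulary built from the input itself. This is a simplified illustration under our own assumptions: the special token names, the `max_width` cutoff, and the helpers `build_dynamic_vocab` and `ground` are hypothetical, entity typing is left out for brevity, and the actual model scores learned span representations rather than enumerating tuples.

```python
def build_dynamic_vocab(tokens, relation_labels, max_width=3):
    """List every output symbol a decoder could emit for this input."""
    specials = ["<START>", "<SEP>", "<END>"]  # assumed special graph tokens
    # Enumerate all contiguous spans of up to max_width tokens (inclusive ends).
    spans = [
        (start, end)
        for start in range(len(tokens))
        for end in range(start, min(start + max_width, len(tokens)))
    ]
    return specials + spans + list(relation_labels)


def ground(entry, tokens):
    """Map a span entry back to its surface text; other entries pass through."""
    if isinstance(entry, tuple):
        start, end = entry
        return " ".join(tokens[start:end + 1])
    return entry


tokens = ["Alain", "Farley", "works", "at", "McGill", "University"]
vocab = build_dynamic_vocab(tokens, ["Work_For"])

# A decoder restricted to `vocab` can only name spans that exist in the input.
print(len(vocab))
print(ground((4, 5), tokens))
```

Because each span entry is an index pair into the source tokens, any node the decoder selects can be traced back to the text, which is the grounding property described above.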

Why we're excited

ATG is strongest as an argument for generation with structure. It does not simply ask a model to produce text that happens to contain triples; it changes the decoding problem so the output is graph-shaped from the beginning.

The paper evaluates ATG on SciERC, ACE05, and CoNLL04 and reports competitive or strong relation extraction results, especially in the relation-plus settings, where both the relation type and the entity boundaries must be correct.

Our community read

The practical takeaway is that text-to-graph generation is attractive when the schema is known and the extraction target is bounded. It is less obviously the right answer for long documents, open schemas, or highly fluid ontologies. For those settings, this kind of graph-shaped decoder may need retrieval, chunking, or post-generation canonicalization around it.

Still, the paper is a useful anchor: if the goal is reliable structured extraction, generation should not mean free-form prose.

Source

arXiv: 2401.01326

Community Reading: ATG for Joint Entity and Relation Extraction