Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.10613 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866908319996706816 |
|---|---|
| author | Wang, Xing David Leser, Ulf |
| author_facet | Wang, Xing David Leser, Ulf |
| contents | Curation of biomedical knowledge bases (KBs) relies on extracting accurate multi-entity relational facts from the literature - a process that remains largely manual and expert-driven. An essential step in this workflow is retrieving documents that can support or complete partially observed n-ary relations. We present a neural retrieval model designed to assist KB curation by identifying documents that help fill in missing relation arguments and provide relevant contextual evidence.
To reduce dependence on scarce gold-standard training data, we exploit existing KB records to construct weakly supervised training sets. Our approach introduces two key technical contributions: (i) a layered contrastive loss that enables learning from noisy and incomplete relational structures, and (ii) a balanced sampling strategy that generates high-quality negatives from diverse KB records. On two biomedical retrieval benchmarks, our approach achieves state-of-the-art performance, outperforming strong baselines in NDCG@10 by 5.7 and 3.7 percentage points, respectively. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2504_10613 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | Enhancing Document Retrieval for Curating N-ary Relations in Knowledge Bases Wang, Xing David Leser, Ulf Information Retrieval Curation of biomedical knowledge bases (KBs) relies on extracting accurate multi-entity relational facts from the literature - a process that remains largely manual and expert-driven. An essential step in this workflow is retrieving documents that can support or complete partially observed n-ary relations. We present a neural retrieval model designed to assist KB curation by identifying documents that help fill in missing relation arguments and provide relevant contextual evidence. To reduce dependence on scarce gold-standard training data, we exploit existing KB records to construct weakly supervised training sets. Our approach introduces two key technical contributions: (i) a layered contrastive loss that enables learning from noisy and incomplete relational structures, and (ii) a balanced sampling strategy that generates high-quality negatives from diverse KB records. On two biomedical retrieval benchmarks, our approach achieves state-of-the-art performance, outperforming strong baselines in NDCG@10 by 5.7 and 3.7 percentage points, respectively. |
| title | Enhancing Document Retrieval for Curating N-ary Relations in Knowledge Bases |
| topic | Information Retrieval |
| url | https://arxiv.org/abs/2504.10613 |