:: Library Catalog

Bibliographic Details
Main Authors:	Ascione, Grazia Sveva, Sterzi, Valerio
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Information Retrieval; Machine Learning
Online Access:	https://arxiv.org/abs/2403.16630

Similar Items

Presenting Terrorizer: an algorithm for consolidating company names in patent assignees
by: Ascione, Grazia Sveva, et al.
Published: (2024)

From scratch to silver: Creating trustworthy training data for patent-SDG classification using Large Language Models
by: Ascione, Grazia Sveva, et al.
Published: (2025)

Benchmarking pre-trained text embedding models in aligning built asset information
by: Shahinmoghadam, Mehrzad, et al.
Published: (2024)

A comparison of latent semantic analysis and correspondence analysis of document-term matrices
by: Qi, Qianqian, et al.
Published: (2021)

KGLink: A column type annotation method that combines knowledge graph and pre-trained language model
by: Wang, Yubo, et al.
Published: (2024)

ELIXIR: Efficient and LIghtweight model for eXplaIning Recommendations
by: Kabongo, Ben, et al.
Published: (2025)

Large language model as user daily behavior data generator: balancing population diversity and individual personality
by: Li, Haoxin, et al.
Published: (2025)

Challenges and Considerations in Annotating Legal Data: A Comprehensive Overview
by: Darji, Harshil, et al.
Published: (2024)

A Trio Neural Model for Dynamic Entity Relatedness Ranking
by: Nguyen, Tu, et al.
Published: (2018)

Revisit and Outstrip Entity Alignment: A Perspective of Generative Models
by: Guo, Lingbing, et al.
Published: (2023)

A Benchmark Suite of Reddit-Derived Datasets for Mental Health Detection
by: Hasan, Khalid, et al.
Published: (2026)

mmBERT: A Modern Multilingual Encoder with Annealed Language Learning
by: Marone, Marc, et al.
Published: (2025)

LTR-ICD: A Learning-to-Rank Approach for Automatic ICD Coding
by: Mansoori, Mohammad, et al.
Published: (2025)

A Platform for Investigating Public Health Content with Efficient Concern Classification
by: Li, Christopher, et al.
Published: (2025)

GraphEx: A Graph-based Extraction Method for Advertiser Keyphrase Recommendation
by: Mishra, Ashirbad, et al.
Published: (2024)

Concept Drift Adaptation in Text Stream Mining Settings: A Systematic Review
by: Garcia, Cristiano Mesquita, et al.
Published: (2023)

Lightweight Query Routing for Adaptive RAG: A Baseline Study on RAGRouter-Bench
by: Bansal, Prakhar, et al.
Published: (2026)

LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval
by: Kabir, Muhammad Rafsan, et al.
Published: (2025)

Complete Evidence Extraction with Model Ensembles: A Case Study on Medical Coding
by: Beckh, Katharina, et al.
Published: (2025)

LaMP-QA: A Benchmark for Personalized Long-form Question Answering
by: Salemi, Alireza, et al.
Published: (2025)

Dynamic-KGQA: A Scalable Framework for Generating Adaptive Question Answering Datasets
by: Dammu, Preetam Prabhu Srikar, et al.
Published: (2025)

SQaLe: A Large Text-to-SQL Corpus Grounded in Real Schemas
by: Wolff, Cornelius, et al.
Published: (2025)

Atomic Information Flow: A Network Flow Model for Tool Attributions in RAG Systems
by: Gao, James, et al.
Published: (2026)

What's happening in your neighborhood? A Weakly Supervised Approach to Detect Local News
by: Shah, Deven Santosh, et al.
Published: (2023)

How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective
by: Liu, Qi, et al.
Published: (2025)

A Language-Driven Framework for Improving Personalized Recommendations: Merging LLMs with Traditional Algorithms
by: Goldstein, Aaron, et al.
Published: (2025)

Combining topic modelling and citation network analysis to study case law from the European Court on Human Rights on the right to respect for private and family life
by: Mohammadi, M., et al.
Published: (2024)

Towards Realistic Synthetic User-Generated Content: A Scaffolding Approach to Generating Online Discussions
by: Balog, Krisztian, et al.
Published: (2024)

S2 Chunking: A Hybrid Framework for Document Segmentation Through Integrated Spatial and Semantic Analysis
by: Verma, Prashant
Published: (2025)

A Dynamic Framework for Semantic Grouping of Common Data Elements (CDE) Using Embeddings and Clustering
by: Krishnamurthy, Madan, et al.
Published: (2025)

A Breadth-First Catalog of Text Processing, Speech Processing and Multimodal Research in South Asian Languages
by: Gupta, Pranav
Published: (2024)

Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG
by: Moreira, Gabriel de Souza P., et al.
Published: (2024)

NANOGPT: A Query-Driven Large Language Model Retrieval-Augmented Generation System for Nanotechnology Research
by: Chandrasekhar, Achuth, et al.
Published: (2025)

Do Recommender Systems Really Leverage Multimodal Content? A Comprehensive Analysis on Multimodal Representations for Recommendation
by: Pomo, Claudio, et al.
Published: (2025)

NLP-Powered Repository and Search Engine for Academic Papers: A Case Study on Cyber Risk Literature with CyLit
by: Zhang, Linfeng, et al.
Published: (2024)

M3: A Multi-Task Mixed-Objective Learning Framework for Open-Domain Multi-Hop Dense Sentence Retrieval
by: Bai, Yang, et al.
Published: (2024)

PashtoCorp: A 1.25-Billion-Word Corpus, Evaluation Suite, and Reproducible Pipeline for Low-Resource Language Development
by: Rahman, Hanif
Published: (2026)

Investigating disaster response through social media data and the Susceptible-Infected-Recovered (SIR) model: A case study of 2020 Western U.S. wildfire season
by: Ma, Zihui, et al.
Published: (2023)

SLMRec: Distilling Large Language Models into Small for Sequential Recommendation
by: Xu, Wujiang, et al.
Published: (2024)

SetCSE: Set Operations using Contrastive Learning of Sentence Embeddings
by: Liu, Kang
Published: (2024)