Library Catalog

Bibliographic Details
Main Authors:	Lee, Juhwan; Kim, Jisu
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2405.13008

Similar Items

Strategic Data Ordering: Enhancing Large Language Model Performance through Curriculum Learning
by: Kim, Jisu, et al.
Published: (2024)

Spectral Tempering for Embedding Compression in Dense Passage Retrieval
by: Li, Yongkang, et al.
Published: (2026)

Adverb Is the Key: Simple Text Data Augmentation with Adverb Deletion
by: Choi, Juhwan, et al.
Published: (2024)

GPTs Are Multilingual Annotators for Sequence Generation Tasks
by: Choi, Juhwan, et al.
Published: (2024)

SoftEDA: Rethinking Rule-Based Data Augmentation with Soft Labels
by: Choi, Juhwan, et al.
Published: (2024)

AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes
by: Choi, Juhwan, et al.
Published: (2024)

Beyond Independent Passages: Adaptive Passage Combination Retrieval for Retrieval Augmented Open-Domain Question Answering
by: Ko, Ting-Wen, et al.
Published: (2025)

Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models
by: Kim, Kyeonghyun, et al.
Published: (2025)

CoBA: Counterbias Text Augmentation for Mitigating Various Spurious Correlations via Semantic Triples
by: Jin, Kyohoon, et al.
Published: (2025)

Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation
by: Choi, Juhwan, et al.
Published: (2024)

Enhancing Effectiveness and Robustness in a Low-Resource Regime via Decision-Boundary-aware Data Augmentation
by: Jin, Kyohoon, et al.
Published: (2024)

Medal Matters: Probing LLMs' Failure Cases Through Olympic Rankings
by: Choi, Juhwan, et al.
Published: (2024)

Beyond Single-User Dialogue: Assessing Multi-User Dialogue State Tracking Capabilities of Large Language Models
by: Song, Sangmin, et al.
Published: (2025)

Towards LLM-Centric Multimodal Fusion: A Survey on Integration Strategies and Techniques
by: An, Jisu, et al.
Published: (2025)

UniGen: Universal Domain Generalization for Sentiment Classification via Zero-shot Dataset Generation
by: Choi, Juhwan, et al.
Published: (2024)

Passage Retrieval of Polish Texts Using OKAPI BM25 and an Ensemble of Cross Encoders
by: Pokrywka, Jakub
Published: (2024)

From Ranking to Selection: A Simple but Efficient Dynamic Passage Selector for Retrieval Augmented Generation
by: Meng, Siyuan, et al.
Published: (2025)

QPaug: Question and Passage Augmentation for Open-Domain Question Answering of LLMs
by: Kim, Minsang, et al.
Published: (2024)

Semantic Tokens in Retrieval Augmented Generation
by: Suro, Joel
Published: (2024)

SLM-Based Agentic AI with P-C-G: Optimized for Korean Tool Use
by: Jeon, Changhyun, et al.
Published: (2025)

Incorporating Domain Knowledge into Materials Tokenization
by: Oh, Yerim, et al.
Published: (2025)

Is It Novel and Why? Fine-Grained Patent Novelty Prediction Based on Passage Retrieval
by: Knappich, Valentin, et al.
Published: (2026)

DERA: Dense Entity Retrieval for Entity Alignment in Knowledge Graphs
by: Wang, Zhichun, et al.
Published: (2024)

Deep Learning Based Dense Retrieval: A Comparative Study
by: Zhong, Ming, et al.
Published: (2024)

Beyond Static Benchmarks: Synthesizing Harmful Content via Persona-based Simulation for Robust Evaluation
by: Lee, Huije, et al.
Published: (2026)

One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations
by: Lee, Yoonjoo, et al.
Published: (2024)

Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection
by: Kim, Yumin, et al.
Published: (2025)

Dense Passage Retrieval: Is it Retrieving?
by: Reichman, Benjamin, et al.
Published: (2024)

Dense X Retrieval: What Retrieval Granularity Should We Use?
by: Chen, Tong, et al.
Published: (2023)

MultiContrievers: Analysis of Dense Retrieval Representations
by: Goldfarb-Tarrant, Seraphina, et al.
Published: (2024)

Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models
by: Shin, Jisu, et al.
Published: (2024)

Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation
by: Shin, Jisu, et al.
Published: (2025)

Semiparametric Token-Sequence Co-Supervision
by: Lee, Hyunji, et al.
Published: (2024)

Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies
by: Clavié, Benjamin, et al.
Published: (2026)

Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection
by: Kim, San, et al.
Published: (2025)

Training for Compositional Sensitivity Reduces Dense Retrieval Generalization
by: Ralev, Radoslav, et al.
Published: (2026)

MoR: Better Handling Diverse Queries with a Mixture of Sparse, Dense, and Human Retrievers
by: Kalra, Jushaan Singh, et al.
Published: (2025)

Memory Retrieval and Consolidation in Large Language Models through Function Tokens
by: Zhang, Shaohua, et al.
Published: (2025)

BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models
by: Cao, Qingqing, et al.
Published: (2023)

SERC: LDPC-Inspired Semantic Error Correction for Retrieval-Augmented Generation
by: Kim, Gyumin, et al.
Published: (2026)