:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Zi; Hua, Nan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2401.04881

Similar Items

Equipping Transformer with Random-Access Reading for Long-Context Understanding
by: Yang, Chenghao, et al.
Published: (2024)

Query-focused and Memory-aware Reranker for Long Context Processing
by: Li, Yuqing, et al.
Published: (2026)

Learning When to Attend: Conditional Memory Access for Long-Context LLMs
by: Choudhary, Sakshi, et al.
Published: (2026)

Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks
by: Yang, Zi
Published: (2024)

OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory
by: Li, Jinze, et al.
Published: (2026)

Learning to Evict from Key-Value Cache
by: Moschella, Luca, et al.
Published: (2026)

HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing
by: He, Zifan, et al.
Published: (2024)

MemoRAG: Boosting Long Context Processing with Global Memory-Enhanced Retrieval Augmentation
by: Qian, Hongjin, et al.
Published: (2024)

LongEmbed: Extending Embedding Models for Long Context Retrieval
by: Zhu, Dawei, et al.
Published: (2024)

Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
by: Zhang, Wuwei, et al.
Published: (2025)

EMS: Adaptive Evict-then-Merge Strategy for Head-wise KV Cache Compression Based on Global-Local Importance
by: Li, Yingxin, et al.
Published: (2024)

OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Query Processing
by: Hui, Yulong, et al.
Published: (2025)

Infinite Retrieval: Attention Enhanced LLMs in Long-Context Processing
by: Ye, Xiaoju, et al.
Published: (2025)

WhenLoss: Diagnosing Write and Retrieval Bottlenecks in Long-Context Memory Systems
by: Yu, Jiangnan, et al.
Published: (2026)

HTAM: Hierarchical Transition-Attended Memory for Operator Optimization
by: Zhang, Yining, et al.
Published: (2026)

HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues
by: Zhong, Yijie, et al.
Published: (2026)

Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference
by: Xiao, Qingfa, et al.
Published: (2025)

Learning to Retrieve In-Context Examples for Large Language Models
by: Wang, Liang, et al.
Published: (2023)

Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts
by: Sivtsov, Danil, et al.
Published: (2025)

Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers
by: Song, Woomin, et al.
Published: (2025)

S$^3$-Attention: Attention-Aligned Endogenous Retrieval for Memory-Bounded Long-Context Inference
by: Ma, Qingsen, et al.
Published: (2026)

Query Suggestion for Retrieval-Augmented Generation via Dynamic In-Context Learning
by: Spaeh, Fabian, et al.
Published: (2026)

Improving Retrieval in Sponsored Search by Leveraging Query Context Signals
by: Mohankumar, Akash Kumar, et al.
Published: (2024)

Evaluating Long-Term Memory for Long-Context Question Answering
by: Terranova, Alessandra, et al.
Published: (2025)

QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression
by: Wang, Wenshan, et al.
Published: (2024)

Decomposing Queries into Tool Calls for Long-Video Keyframe Retrieval
by: Shlapentokh-Rothman, Michal, et al.
Published: (2026)

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
by: Liu, Di, et al.
Published: (2024)

Unstructured Evidence Attribution for Long Context Query Focused Summarization
by: Wright, Dustin, et al.
Published: (2025)

XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
by: Monteiro, João, et al.
Published: (2024)

Retrieval Head Mechanistically Explains Long-Context Factuality
by: Wu, Wenhao, et al.
Published: (2024)

Inference Scaling for Long-Context Retrieval Augmented Generation
by: Yue, Zhenrui, et al.
Published: (2024)

Evaluating Multilingual Long-Context Models for Retrieval and Reasoning
by: Agrawal, Ameeta, et al.
Published: (2024)

MemLong: Memory-Augmented Retrieval for Long Text Modeling
by: Liu, Weijie, et al.
Published: (2024)

Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
by: Xie, Jiawen, et al.
Published: (2023)

Enhancing Retrieval Processes for Language Generation with Augmented Queries
by: Ghali, Julien Pierre Edmond, et al.
Published: (2024)

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
by: Xiao, Guangxuan, et al.
Published: (2024)

Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
by: Tang, Jiaming, et al.
Published: (2024)

Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings
by: Jiang, Yichen, et al.
Published: (2024)

Gated Differentiable Working Memory for Long-Context Language Modeling
by: Mei, Lingrui, et al.
Published: (2026)

Literary Evidence Retrieval via Long-Context Language Models
by: Thai, Katherine, et al.
Published: (2025)