| Main Authors: | Yang, Zi; Hua, Nan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.04881 |
Similar Items
Equipping Transformer with Random-Access Reading for Long-Context Understanding
by: Yang, Chenghao, et al.
Published: (2024)
Query-focused and Memory-aware Reranker for Long Context Processing
by: Li, Yuqing, et al.
Published: (2026)
Learning When to Attend: Conditional Memory Access for Long-Context LLMs
by: Choudhary, Sakshi, et al.
Published: (2026)
Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks
by: Yang, Zi
Published: (2024)
OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory
by: Li, Jinze, et al.
Published: (2026)
Learning to Evict from Key-Value Cache
by: Moschella, Luca, et al.
Published: (2026)
HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing
by: He, Zifan, et al.
Published: (2024)
MemoRAG: Boosting Long Context Processing with Global Memory-Enhanced Retrieval Augmentation
by: Qian, Hongjin, et al.
Published: (2024)
LongEmbed: Extending Embedding Models for Long Context Retrieval
by: Zhu, Dawei, et al.
Published: (2024)
Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
by: Zhang, Wuwei, et al.
Published: (2025)
EMS: Adaptive Evict-then-Merge Strategy for Head-wise KV Cache Compression Based on Global-Local Importance
by: Li, Yingxin, et al.
Published: (2024)
OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Query Processing
by: Hui, Yulong, et al.
Published: (2025)
Infinite Retrieval: Attention Enhanced LLMs in Long-Context Processing
by: Ye, Xiaoju, et al.
Published: (2025)
WhenLoss: Diagnosing Write and Retrieval Bottlenecks in Long-Context Memory Systems
by: Yu, Jiangnan, et al.
Published: (2026)
HTAM: Hierarchical Transition-Attended Memory for Operator Optimization
by: Zhang, Yining, et al.
Published: (2026)
HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues
by: Zhong, Yijie, et al.
Published: (2026)
Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference
by: Xiao, Qingfa, et al.
Published: (2025)
Learning to Retrieve In-Context Examples for Large Language Models
by: Wang, Liang, et al.
Published: (2023)
Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts
by: Sivtsov, Danil, et al.
Published: (2025)
Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers
by: Song, Woomin, et al.
Published: (2025)
S$^3$-Attention: Attention-Aligned Endogenous Retrieval for Memory-Bounded Long-Context Inference
by: Ma, Qingsen, et al.
Published: (2026)
Query Suggestion for Retrieval-Augmented Generation via Dynamic In-Context Learning
by: Spaeh, Fabian, et al.
Published: (2026)
Improving Retrieval in Sponsored Search by Leveraging Query Context Signals
by: Mohankumar, Akash Kumar, et al.
Published: (2024)
Evaluating Long-Term Memory for Long-Context Question Answering
by: Terranova, Alessandra, et al.
Published: (2025)
QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression
by: Wang, Wenshan, et al.
Published: (2024)
Decomposing Queries into Tool Calls for Long-Video Keyframe Retrieval
by: Shlapentokh-Rothman, Michal, et al.
Published: (2026)
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
by: Liu, Di, et al.
Published: (2024)
Unstructured Evidence Attribution for Long Context Query Focused Summarization
by: Wright, Dustin, et al.
Published: (2025)
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
by: Monteiro, João, et al.
Published: (2024)
Retrieval Head Mechanistically Explains Long-Context Factuality
by: Wu, Wenhao, et al.
Published: (2024)
Inference Scaling for Long-Context Retrieval Augmented Generation
by: Yue, Zhenrui, et al.
Published: (2024)
Evaluating Multilingual Long-Context Models for Retrieval and Reasoning
by: Agrawal, Ameeta, et al.
Published: (2024)
MemLong: Memory-Augmented Retrieval for Long Text Modeling
by: Liu, Weijie, et al.
Published: (2024)
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
by: Xie, Jiawen, et al.
Published: (2023)
Enhancing Retrieval Processes for Language Generation with Augmented Queries
by: Ghali, Julien Pierre Edmond, et al.
Published: (2024)
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
by: Xiao, Guangxuan, et al.
Published: (2024)
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
by: Tang, Jiaming, et al.
Published: (2024)
Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings
by: Jiang, Yichen, et al.
Published: (2024)
Gated Differentiable Working Memory for Long-Context Language Modeling
by: Mei, Lingrui, et al.
Published: (2026)
Literary Evidence Retrieval via Long-Context Language Models
by: Thai, Katherine, et al.
Published: (2025)