| Main Authors: | Lin, Xi Victoria, Chen, Xilun, Chen, Mingda, Shi, Weijia, Lomeli, Maria, James, Rich, Rodriguez, Pedro, Kahn, Jacob, Szilvasy, Gergely, Lewis, Mike, Zettlemoyer, Luke, Yih, Scott |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.01352 |
Similar Items
In-context Pretraining: Language Modeling Beyond Document Boundaries
by: Shi, Weijia, et al.
Published: (2023)
Few-Shot Data Synthesis for Open Domain Multi-Hop Question Answering
by: Chen, Mingda, et al.
Published: (2023)
ImpRAG: Retrieval-Augmented Generation with Implicit Queries
by: Zhang, Wenzheng, et al.
Published: (2025)
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers
by: Ma, Xueguang, et al.
Published: (2025)
FACTORY: A Challenging Human-Verified Prompt Set for Long-Form Factuality
by: Chen, Mingda, et al.
Published: (2025)
Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility
by: Szilvasy, Gergely, et al.
Published: (2026)
Improving Factuality with Explicit Working Memory
by: Chen, Mingda, et al.
Published: (2024)
Inference-time sparse attention with asymmetric indexing
by: Mazaré, Pierre-Emmanuel, et al.
Published: (2025)
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
by: Li, Minghan, et al.
Published: (2024)
Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations
by: Chen, Tong, et al.
Published: (2025)
Self-Alignment with Instruction Backtranslation
by: Li, Xian, et al.
Published: (2023)
Reliable, Adaptable, and Attributable Language Models with Retrieval
by: Asai, Akari, et al.
Published: (2024)
Vector search with small radiuses
by: Szilvasy, Gergely, et al.
Published: (2024)
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
by: Liang, Weixin, et al.
Published: (2024)
Evaluation data contamination in LLMs: how do we measure it and (when) does it matter?
by: Singh, Aaditya K., et al.
Published: (2024)
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
by: Lin, Xi Victoria, et al.
Published: (2024)
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
by: Shi, Weijia, et al.
Published: (2024)
ReasonIR: Training Retrievers for Reasoning Tasks
by: Shao, Rulin, et al.
Published: (2025)
Livcornelissen/DIT_mooringdata: Processing and Quality Control Code for DIT Mooring Data
by: Livcornelissen
Published: (2026)
The Faiss library
by: Douze, Matthijs, et al.
Published: (2024)
Short window attention enables long-term memorization
by: Cabannes, Loïc, et al.
Published: (2025)
Instruction-tuned Language Models are Better Knowledge Learners
by: Jiang, Zhengbao, et al.
Published: (2024)
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
by: Li, Mingda, et al.
Published: (2024)
Stochastic activations
by: Lomeli, Maria, et al.
Published: (2025)
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
by: Shao, Rulin, et al.
Published: (2024)
Procedural Knowledge at Scale Improves Reasoning
by: Wu, Di, et al.
Published: (2026)
CID-GraphRAG: Enhancing Multi-Turn Dialogue Systems through Dual-Pathway Retrieval of Conversation Flow and Context Semantics
by: Zhu, Ziqi, et al.
Published: (2025)
Anchored Decoding: Provably Reducing Copyright Risk for Any Language Model
by: He, Jacqueline, et al.
Published: (2026)
Memory Layers at Scale
by: Berges, Vincent-Pierre, et al.
Published: (2024)
Learning Facts at Scale with Active Reading
by: Lin, Jessy, et al.
Published: (2025)
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
by: Wang, Boxin, et al.
Published: (2023)
HIRAG: Hierarchical-Thought Instruction-Tuning Retrieval-Augmented Generation
by: Jiao, YiHan, et al.
Published: (2025)
Detecting Pretraining Data from Large Language Models
by: Shi, Weijia, et al.
Published: (2023)
JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning
by: Tahir, Anique, et al.
Published: (2024)
SlangDIT: Benchmarking LLMs in Interpretative Slang Translation
by: Liang, Yunlong, et al.
Published: (2025)
MENTAL HEALTH IN MEDICAL STUDENTS AND DOCTORS‐IN‐TRAINING (DIT)
Published: (2024)
SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning
by: Wang, Ziqi, et al.
Published: (2024)
RA-LWLM: Retrieval-Augmented In-Context Localization with Wireless Foundation Models
by: Pan, Guangjin, et al.
Published: (2026)
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
by: Chen, Tong, et al.
Published: (2025)
Retrieval Augmented Instruction Tuning for Open NER with Large Language Models
by: Xie, Tingyu, et al.
Published: (2024)