| Main Authors: | Mazaré, Pierre-Emmanuel, Szilvasy, Gergely, Lomeli, Maria, Massa, Francisco, Murray, Naila, Jégou, Hervé, Douze, Matthijs |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.08246 |
Similar Items
Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility
by: Szilvasy, Gergely, et al.
Published: (2026)
Vector search with small radiuses
by: Szilvasy, Gergely, et al.
Published: (2024)
Short window attention enables long-term memorization
by: Cabannes, Loïc, et al.
Published: (2025)
The Faiss library
by: Douze, Matthijs, et al.
Published: (2024)
Stochastic activations
by: Lomeli, Maria, et al.
Published: (2025)
Functional Invariants to Watermark Large Transformers
by: Fernandez, Pierre, et al.
Published: (2023)
Evaluation data contamination in LLMs: how do we measure it and (when) does it matter?
by: Singh, Aaditya K., et al.
Published: (2024)
Moshi: a speech-text foundation model for real-time dialogue
by: Défossez, Alexandre, et al.
Published: (2024)
Watermarking Makes Language Models Radioactive
by: Sander, Tom, et al.
Published: (2024)
RA-DIT: Retrieval-Augmented Dual Instruction Tuning
by: Lin, Xi Victoria, et al.
Published: (2023)
Machine learning and high dimensional vector search
by: Douze, Matthijs
Published: (2025)
Neutral Residues: Revisiting Adapters for Model Extension
by: Talla, Franck Signe, et al.
Published: (2024)
In-context Pretraining: Language Modeling Beyond Document Boundaries
by: Shi, Weijia, et al.
Published: (2023)
Watermark Anything with Localized Messages
by: Sander, Tom, et al.
Published: (2024)
KVzap: Fast, Adaptive, and Faithful KV Cache Pruning
by: Jegou, Simon, et al.
Published: (2026)
MagicPIG: LSH Sampling for Efficient LLM Generation
by: Chen, Zhuoming, et al.
Published: (2024)
Expected Attention: KV Cache Compression by Estimating Attention from Future Queries Distribution
by: Devoto, Alessio, et al.
Published: (2025)
Verifying Chain-of-Thought Reasoning via Its Computational Graph
by: Zhao, Zheng, et al.
Published: (2025)
Aligning Spoken Dialogue Models from User Interactions
by: Wu, Anne, et al.
Published: (2025)
High-Fidelity Simultaneous Speech-To-Speech Translation
by: Labiausse, Tom, et al.
Published: (2025)
Syntax and Semantics of Linear Dependent Types
by: Vákár, Matthijs
Published: (2014)
Multi-head attention debiasing and contrastive learning for mitigating Dataset Artifacts in Natural Language Inference
by: Sivakoti, Karthik
Published: (2024)
DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
by: Yao, Jinwei, et al.
Published: (2024)
CHAD: Combinatory Homomorphic Automatic Differentiation
by: Vákár, Matthijs, et al.
Published: (2021)
DELULU: Discriminative Embedding Learning Using Latent Units for Speaker-Aware Self-Trained Speech Foundational Model
by: Baali, Massa, et al.
Published: (2025)
SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions
by: Baali, Massa, et al.
Published: (2025)
Self-attention vector output similarities reveal how machines pay attention
by: Halevi, Tal, et al.
Published: (2025)
Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling
by: Zeghidour, Neil, et al.
Published: (2025)
Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation
by: Hidayat, Naila Shafirni, et al.
Published: (2025)
Illuminating Blind Spots of Language Models with Targeted Agent-in-the-Loop Synthetic Data
by: Lippmann, Philip, et al.
Published: (2024)
TOOLVERIFIER: Generalization to New Tools via Self-Verification
by: Mekala, Dheeraj, et al.
Published: (2024)
Let your LLM generate a few tokens and you will reduce the need for retrieval
by: Déjean, Hervé
Published: (2024)
What and When to Learn: CURriculum Ranking Loss for Large-Scale Speaker Verification
by: Baali, Massa, et al.
Published: (2026)
Higher Order Automatic Differentiation of Higher Order Functions
by: Huot, Mathieu, et al.
Published: (2021)
Winning Amazon KDD Cup'24
by: Deotte, Chris, et al.
Published: (2024)
Truthful Text Sanitization Guided by Inference Attacks
by: Pilán, Ildikó, et al.
Published: (2024)
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks
by: Vallaeys, Théophane, et al.
Published: (2025)
Visualizing attention zones in machine reading comprehension models
by: Cui, Yiming, et al.
Published: (2024)
Sentiment analysis with adaptive multi-head attention in Transformer
by: Meng, Fanfei, et al.
Published: (2023)
Relational inductive biases on attention mechanisms
by: Mijangos, Víctor, et al.
Published: (2025)