:: Library Catalog

Bibliographic Details
Main Authors:	LCM team; Barrault, Loïc; Duquenne, Paul-Ambroise; Elbayad, Maha; Kozhevnikov, Artyom; Alastruey, Belen; Andrews, Pierre; Coria, Mariano; Couairon, Guillaume; Costa-jussà, Marta R.; Dale, David; Elsahar, Hady; Heffernan, Kevin; Janeiro, João Maria; Tran, Tuan; Ropers, Christophe; Sánchez, Eduardo; Roman, Robin San; Mourachko, Alexandre; Saleem, Safiyyah; Schwenk, Holger
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2412.08821

Similar Items

Interference Matrix: Quantifying Cross-Lingual Interference in Transformer Encoders
by: Alastruey, Belen, et al.
Published: (2025)

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech
by: Omnilingual SONAR Team, et al.
Published: (2026)

Unified Vision-Language Modeling via Concept Space Alignment
by: Qiu, Yifu, et al.
Published: (2026)

Video Seal: Open and Efficient Video Watermarking
by: Fernandez, Pierre, et al.
Published: (2024)

Spirit LM: Interleaved Spoken and Written Language Model
by: Nguyen, Tu Anh, et al.
Published: (2024)

Merging Text Transformer Models from Different Initializations
by: Verma, Neha, et al.
Published: (2024)

MEXMA: Token-level objectives improve sentence representations
by: Janeiro, João Maria, et al.
Published: (2024)

Transferable Black-Box One-Shot Forging of Watermarks via Image Preference Models
by: Souček, Tomáš, et al.
Published: (2025)

Omnilingual MT: Machine Translation for 1,600 Languages
by: Omnilingual MT Team, et al.
Published: (2026)

Linguini: A benchmark for language-agnostic linguistic reasoning
by: Sánchez, Eduardo, et al.
Published: (2024)

MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector
by: Costa-jussà, Marta R., et al.
Published: (2024)

Learning to Watermark in the Latent Space of Generative Models
by: Rebuffi, Sylvestre-Alvise, et al.
Published: (2026)

How Good is Post-Hoc Watermarking With Language Model Rephrasing?
by: Fernandez, Pierre, et al.
Published: (2025)

Pixel Seal: Adversarial-only training for invisible image and video watermarking
by: Souček, Tomáš, et al.
Published: (2025)

Geometric Image Synchronization with Deep Watermarking
by: Fernandez, Pierre, et al.
Published: (2025)

LCFO: Long Context and Long Form Output Dataset and Benchmarking
by: Costa-jussà, Marta R., et al.
Published: (2024)

BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation
by: The Omnilingual MT Team, et al.
Published: (2025)

TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection
by: Sander, Tom, et al.
Published: (2026)

We Can Hide More Bits: The Unused Watermarking Capacity in Theory and in Practice
by: Petrov, Aleksandar, et al.
Published: (2025)

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
by: Omnilingual ASR team, et al.
Published: (2025)

Unveiling the Role of Pretraining in Direct Speech Translation
by: Alastruey, Belen, et al.
Published: (2024)

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset
by: Costa-jussà, Marta R., et al.
Published: (2024)

VUGEN: Visual Understanding priors for GENeration
by: Chen, Xiangyi, et al.
Published: (2025)

Text-Guided Semantic Image Encoder
by: Thirukovalluru, Raghuveer, et al.
Published: (2025)

SpeechAlign: a Framework for Speech Translation Alignment Evaluation
by: Alastruey, Belen, et al.
Published: (2023)

We Need to Talk About Classification Evaluation Metrics in NLP
by: Vickers, Peter, et al.
Published: (2024)

Proactive Detection of Voice Cloning with Localized Watermarking
by: Roman, Robin San, et al.
Published: (2024)

On the Role of Speech Data in Reducing Toxicity Detection Bias
by: Bell, Samuel J., et al.
Published: (2024)

Consideraciones sobre el Trabajo Comunitario desde la perspectiva de equipos estatales y ONG
by: Omar Barrault
Published: (2015)

FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models
by: Corradini, Barbara Toniella, et al.
Published: (2024)

Neural Wikipedian: Generating Textual Summaries from Knowledge Base Triples
by: Vougiouklis, Pavlos, et al.
Published: (2017)

Y-NQ: English-Yorùbá Evaluation dataset for Open-Book Reading Comprehension and Text Generation
by: Costa-jussà, Marta R., et al.
Published: (2024)

Towards Massive Multilingual Holistic Bias
by: Tan, Xiaoqing Ellen, et al.
Published: (2024)

Submicrosecond Atmospheric Electric Discharge from the Non-Uniform Electrode (Tip) Towards the Plane Electrode
by: Vasily Y. Kozhevnikov
Published: (2019)

Physical nature of 'anomalous' electrons in high-current vacuum diodes
by: Vasily Y. Kozhevnikov
Published: (2021)

Measuring the precise photometric period of the probable intermediate polar 1RXS J014549.6+514314 based on extensive photometry
by: Kozhevnikov, V. P.
Published: (2025)

Kinetic simulation of vacuum plasma expansion beyond the "plasma approximation"
by: Vasily Y. Kozhevnikov
Published: (2022)

Discovery of eclipses in the cataclysmic variable LAMOST J035913.61+405035.0
by: Kozhevnikov, V. P.
Published: (2024)

Detection of Eclipses in the Cataclysmic Variable LAMOST J035913.61 + 405035.0
by: V. P. Kozhevnikov
Published: (2025)

Measuring the Precise Photometric Period of the Probable Intermediate Polar 1RXS J014549.6+514314 Based on Extensive Photometry
by: V. P. Kozhevnikov
Published: (2025)