:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Weixuan; Wu, Minghao; Haddow, Barry; Birch, Alexandra
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2505.12313

Similar Items

Demystifying Multilingual Chain-of-Thought in Process Reward Modeling
by: Wang, Weixuan, et al.
Published: (2025)

HBO: Hierarchical Balancing Optimization for Fine-Tuning Large Language Models
by: Wang, Weixuan, et al.
Published: (2025)

Learning to Summarize by Learning to Quiz: Adversarial Agentic Collaboration for Long Document Summarization
by: Wang, Weixuan, et al.
Published: (2025)

Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention
by: Wang, Weixuan, et al.
Published: (2024)

Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs
by: Wang, Weixuan, et al.
Published: (2024)

Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task
by: Ranaldi, Leonardo, et al.
Published: (2025)

MGen: Millions of Naturally Occurring Generics in Context
by: Cilleruelo, Gustavo, et al.
Published: (2025)

When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale
by: Baziotis, Christos, et al.
Published: (2023)

The Ups and Downs of Large Language Model Inference with Vocabulary Trimming by Language Heuristics
by: Bogoychev, Nikolay, et al.
Published: (2023)

Compact Speech Translation Models via Discrete Speech Units Pretraining
by: Lam, Tsz Kin, et al.
Published: (2024)

Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations
by: Ranaldi, Leonardo, et al.
Published: (2025)

The Prosody of Emojis
by: Zhou, Giulio, et al.
Published: (2025)

Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases
by: Zhou, Giulio, et al.
Published: (2024)

Liaozhai through the Looking-Glass: On Paratextual Explicitation of Culture-Bound Terms in Machine Translation
by: Shen, Sherrie, et al.
Published: (2025)

Generics are puzzling. Can language models find the missing piece?
by: Calderón, Gustavo Cilleruelo, et al.
Published: (2024)

Quality or Quantity? On Data Scale and Diversity in Adapting Large Language Models for Low-Resource Translation
by: Iyer, Vivek, et al.
Published: (2024)

Understanding Multilingualism in Mixture-of-Experts LLMs: Routing Mechanism, Expert Specialization, and Layerwise Steering
by: Chen, Yuxin, et al.
Published: (2026)

Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
by: Bai, Jun, et al.
Published: (2025)

Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
by: Wang, Weixuan, et al.
Published: (2024)

Steering MoE LLMs via Expert (De)Activation
by: Fayyaz, Mohsen, et al.
Published: (2025)

Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation
by: Klimaszewski, Mateusz, et al.
Published: (2024)

From Beginner to Expert: Modeling Medical Knowledge into General LLMs
by: Li, Qiang, et al.
Published: (2023)

An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing
by: Chai, Ziwei, et al.
Published: (2024)

Context and System Fusion in Post-ASR Emotion Recognition with Large Language Models
by: Stepachev, Pavel, et al.
Published: (2024)

Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
by: Zhou, Yixiao, et al.
Published: (2025)

Integrating Expert Knowledge into Logical Programs via LLMs
by: Górski, Franciszek, et al.
Published: (2025)

LF-Steering: Latent Feature Activation Steering for Enhancing Semantic Consistency in Large Language Models
by: Yang, Jingyuan, et al.
Published: (2025)

Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models?
by: Chen, Pinzhen, et al.
Published: (2024)

Iterative Translation Refinement with Large Language Models
by: Chen, Pinzhen, et al.
Published: (2023)

Mixture of Experts for Low-Resource LLMs
by: Joseph, Ori Bar, et al.
Published: (2026)

Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts
by: Zhang, Zeliang, et al.
Published: (2024)

SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
by: Zhuang, Haomin, et al.
Published: (2024)

DiEP: Adaptive Mixture-of-Experts Compression through Differentiable Expert Pruning
by: Bai, Sikai, et al.
Published: (2025)

Continuously Steering LLMs Sensitivity to Contextual Knowledge with Proxy Models
by: Wang, Yilin, et al.
Published: (2025)

Test-Time Steering for Lossless Text Compression via Weighted Product of Experts
by: Zhang, Qihang, et al.
Published: (2025)

Knowledge Localization in Mixture-of-Experts LLMs Using Cross-Lingual Inconsistency
by: Bandarkar, Lucas, et al.
Published: (2026)

MatheMagic: Generating Dynamic Mathematics Benchmarks Robust to Memorization
by: O'Brien, Dayyán, et al.
Published: (2025)

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
by: Sukhbaatar, Sainbayar, et al.
Published: (2024)

dMoE: dLLMs with Learnable Block Experts
by: Feng, Sicheng, et al.
Published: (2026)

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts
by: Su, Zhenpeng, et al.
Published: (2024)