:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Zeming; Romanou, Angelika; Weiss, Gail; Bosselut, Antoine
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2507.06415

Similar Items

Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
by: Bayazit, Deniz, et al.
Published: (2023)

Reliable Evaluation and Benchmarks for Statement Autoformalization
by: Poiroux, Auguste, et al.
Published: (2024)

LLMs Are In-Context Bandit Reinforcement Learners
by: Monea, Giovanni, et al.
Published: (2024)

GaLLoP: Gradient-based Sparse Learning on Low-Magnitude Parameters
by: Choudhary, Anand, et al.
Published: (2025)

Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining
by: Bayazit, Deniz, et al.
Published: (2025)

CAVE: Detecting and Explaining Commonsense Anomalies in Visual Environments
by: Bhagwatkar, Rishika, et al.
Published: (2025)

Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs
by: Fang, Tianqing, et al.
Published: (2024)

WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts
by: Foroutan, Negar, et al.
Published: (2025)

MILE: A Mutation Testing Framework of In-Context Learning Systems
by: Wei, Zeming, et al.
Published: (2024)

Multipole Attention for Efficient Long Context Reasoning
by: Hooper, Coleman, et al.
Published: (2025)

ConLID: Supervised Contrastive Learning for Low-Resource Language Identification
by: Foroutan, Negar, et al.
Published: (2025)

The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units
by: AlKhamissi, Badr, et al.
Published: (2024)

Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network
by: AlKhamissi, Badr, et al.
Published: (2024)

Scaling Context, Not Parameters: Training a Compact 7B Language Model for Efficient Long-Context Processing
by: Wu, Chen, et al.
Published: (2025)

Let's (not) just put things in Context: Test-Time Training for Long-Context LLMs
by: Bansal, Rachit, et al.
Published: (2025)

Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents
by: Talokar, Nivya, et al.
Published: (2026)

Revisiting Multilingual Data Mixtures in Language Model Pretraining
by: Foroutan, Negar, et al.
Published: (2025)

Tracking the Limits of Knowledge Propagation: How LLMs Fail at Multi-Step Reasoning with Conflicting Knowledge
by: Feng, Yiyang, et al.
Published: (2026)

DELTA: Dynamic Layer-Aware Token Attention for Efficient Long-Context Reasoning
by: Zarch, Hossein Entezari, et al.
Published: (2025)

Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
by: Ling Team, et al.
Published: (2025)

CART: Context-Anchored Recurrent Transformer -- A Parameter-Efficient Architecture with Learned Stability
by: Capps, Chad A.
Published: (2026)

Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
by: Fan, Dongyang, et al.
Published: (2025)

A Logical Fallacy-Informed Framework for Argument Generation
by: Mouchel, Luca, et al.
Published: (2024)

MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models
by: Ai, Mengting, et al.
Published: (2023)

Learning to Reason from Feedback at Test-Time
by: Li, Yanyang, et al.
Published: (2025)

Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval
by: Chen, Taiye, et al.
Published: (2025)

NeuroLoRA: Context-Aware Neuromodulation for Parameter-Efficient Multi-Task Adaptation
by: Yang, Yuxin, et al.
Published: (2026)

LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning
by: Ping, Bowen, et al.
Published: (2026)

Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
by: Ren, Liliang, et al.
Published: (2025)

Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization
by: AlKhamissi, Badr, et al.
Published: (2025)

Parity-Aware Byte-Pair Encoding: Improving Cross-lingual Fairness in Tokenization
by: Foroutan, Negar, et al.
Published: (2025)

Probe and Skip: Self-Predictive Token Skipping for Efficient Long-Context LLM Inference
by: Wu, Zimeng, et al.
Published: (2026)

Absorber LLM: Harnessing Causal Synchronization for Test-Time Training
by: Zhang, Zhixin, et al.
Published: (2026)

FedMCP: Parameter-Efficient Federated Learning with Model-Contrastive Personalization
by: Zhao, Qianyi, et al.
Published: (2024)

Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
by: Yang, Wang, et al.
Published: (2025)

Train Long, Think Short: Curriculum Learning for Efficient Reasoning
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2025)

Evaluating Language Model Agency through Negotiations
by: Davidson, Tim R., et al.
Published: (2024)

Rational Metareasoning for Large Language Models
by: De Sabbata, C. Nicolò, et al.
Published: (2024)

Exploring the Robustness of In-Context Learning with Noisy Labels
by: Cheng, Chen, et al.
Published: (2024)

What Formal Languages Can Transformers Express? A Survey
by: Strobl, Lena, et al.
Published: (2023)