| Main Authors: | Chen, Zeming, Romanou, Angelika, Weiss, Gail, Bosselut, Antoine |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.06415 |
Similar Items
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
by: Bayazit, Deniz, et al.
Published: (2023)
Reliable Evaluation and Benchmarks for Statement Autoformalization
by: Poiroux, Auguste, et al.
Published: (2024)
LLMs Are In-Context Bandit Reinforcement Learners
by: Monea, Giovanni, et al.
Published: (2024)
GaLLoP: Gradient-based Sparse Learning on Low-Magnitude Parameters
by: Choudhary, Anand, et al.
Published: (2025)
Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining
by: Bayazit, Deniz, et al.
Published: (2025)
CAVE: Detecting and Explaining Commonsense Anomalies in Visual Environments
by: Bhagwatkar, Rishika, et al.
Published: (2025)
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs
by: Fang, Tianqing, et al.
Published: (2024)
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts
by: Foroutan, Negar, et al.
Published: (2025)
MILE: A Mutation Testing Framework of In-Context Learning Systems
by: Wei, Zeming, et al.
Published: (2024)
Multipole Attention for Efficient Long Context Reasoning
by: Hooper, Coleman, et al.
Published: (2025)
ConLID: Supervised Contrastive Learning for Low-Resource Language Identification
by: Foroutan, Negar, et al.
Published: (2025)
The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units
by: AlKhamissi, Badr, et al.
Published: (2024)
Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network
by: AlKhamissi, Badr, et al.
Published: (2024)
Scaling Context, Not Parameters: Training a Compact 7B Language Model for Efficient Long-Context Processing
by: Wu, Chen, et al.
Published: (2025)
Let's (not) just put things in Context: Test-Time Training for Long-Context LLMs
by: Bansal, Rachit, et al.
Published: (2025)
Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents
by: Talokar, Nivya, et al.
Published: (2026)
Revisiting Multilingual Data Mixtures in Language Model Pretraining
by: Foroutan, Negar, et al.
Published: (2025)
Tracking the Limits of Knowledge Propagation: How LLMs Fail at Multi-Step Reasoning with Conflicting Knowledge
by: Feng, Yiyang, et al.
Published: (2026)
DELTA: Dynamic Layer-Aware Token Attention for Efficient Long-Context Reasoning
by: Zarch, Hossein Entezari, et al.
Published: (2025)
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
by: Ling Team, et al.
Published: (2025)
CART: Context-Anchored Recurrent Transformer -- A Parameter-Efficient Architecture with Learned Stability
by: Capps, Chad A.
Published: (2026)
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
by: Fan, Dongyang, et al.
Published: (2025)
A Logical Fallacy-Informed Framework for Argument Generation
by: Mouchel, Luca, et al.
Published: (2024)
MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models
by: Ai, Mengting, et al.
Published: (2023)
Learning to Reason from Feedback at Test-Time
by: Li, Yanyang, et al.
Published: (2025)
Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval
by: Chen, Taiye, et al.
Published: (2025)
NeuroLoRA: Context-Aware Neuromodulation for Parameter-Efficient Multi-Task Adaptation
by: Yang, Yuxin, et al.
Published: (2026)
LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning
by: Ping, Bowen, et al.
Published: (2026)
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
by: Ren, Liliang, et al.
Published: (2025)
Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization
by: AlKhamissi, Badr, et al.
Published: (2025)
Parity-Aware Byte-Pair Encoding: Improving Cross-lingual Fairness in Tokenization
by: Foroutan, Negar, et al.
Published: (2025)
Probe and Skip: Self-Predictive Token Skipping for Efficient Long-Context LLM Inference
by: Wu, Zimeng, et al.
Published: (2026)
Absorber LLM: Harnessing Causal Synchronization for Test-Time Training
by: Zhang, Zhixin, et al.
Published: (2026)
FedMCP: Parameter-Efficient Federated Learning with Model-Contrastive Personalization
by: Zhao, Qianyi, et al.
Published: (2024)
Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
by: Yang, Wang, et al.
Published: (2025)
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2025)
Evaluating Language Model Agency through Negotiations
by: Davidson, Tim R., et al.
Published: (2024)
Rational Metareasoning for Large Language Models
by: De Sabbata, C. Nicolò, et al.
Published: (2024)
Exploring the Robustness of In-Context Learning with Noisy Labels
by: Cheng, Chen, et al.
Published: (2024)
What Formal Languages Can Transformers Express? A Survey
by: Strobl, Lena, et al.
Published: (2023)