| Main Authors: | Kruszewski, Germán, Erbacher, Pierre, Rozen, Jos, Dymetman, Marc |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.05962 |
Similar Items
Compositional preference models for aligning LMs
by: Go, Dongyoung, et al.
Published: (2023)
FaST: Feature-aware Sampling and Tuning for Personalized Preference Alignment with Limited Data
by: Thonet, Thibaut, et al.
Published: (2025)
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
by: Thonet, Thibaut, et al.
Published: (2024)
Binary Rewards and Reinforcement Learning: Fundamental Challenges
by: Dymetman, Marc
Published: (2026)
Exponential families from a single KL identity
by: Dymetman, Marc
Published: (2026)
Automated Machine Learning for Remaining Useful Life Predictions
by: Zöller, Marc-André, et al.
Published: (2023)
Trustworthy AI Must Account for Interactions
by: Cresswell, Jesse C.
Published: (2025)
Knowledge Distillation Must Account for What It Loses
by: Wang, Wenshuo
Published: (2026)
VAR-MATH: Probing True Mathematical Reasoning in LLMS via Symbolic Multi-Instance Benchmarks
by: Yao, Jian, et al.
Published: (2025)
Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs
by: Jan, Essa, et al.
Published: (2025)
Towards Label-Free Biological Reasoning Synthetic Dataset Creation via Uncertainty Filtering
by: Stoisser, Josefa Lia, et al.
Published: (2025)
The Cell Must Go On: Agar.io for Continual Reinforcement Learning
by: Mohamed, Mohamed A., et al.
Published: (2025)
Can Machines Learn the True Probabilities?
by: Kim, Jinsook
Published: (2024)
Beyond High-Entropy Exploration: Correctness-Aware Low-Entropy Segment-Based Advantage Shaping for Reasoning LLMs
by: Chen, Xinzhu, et al.
Published: (2025)
Position: A Theory of Deep Learning Must Include Compositional Sparsity
by: Danhofer, David A., et al.
Published: (2025)
How Far Are We from True Unlearnability?
by: Ye, Kai, et al.
Published: (2025)
ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training
by: Nabli, Adel, et al.
Published: (2024)
Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training
by: Liu, Mingjie, et al.
Published: (2025)
Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen
by: Li, Zihao, et al.
Published: (2025)
True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning
by: Tan, Weihao, et al.
Published: (2024)
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
by: Huang, Zhehao, et al.
Published: (2024)
Mixup Domain Adaptations for Dynamic Remaining Useful Life Predictions
by: Furqon, Muhammad Tanzil, et al.
Published: (2024)
Data Diversity as Implicit Regularization: How Does Diversity Shape the Weight Space of Deep Neural Networks?
by: Ba, Yang, et al.
Published: (2024)
FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning
by: Luo, Haozheng, et al.
Published: (2026)
Actionable Interpretability Must Be Defined in Terms of Symmetries
by: Barbiero, Pietro, et al.
Published: (2026)
Evidential Domain Adaptation for Remaining Useful Life Prediction with Incomplete Degradation
by: Hou, Yubo, et al.
Published: (2026)
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs
by: Chen, Justin Chih-Yao, et al.
Published: (2023)
High Noise Scheduling is a Must
by: Gokmen, Mahmut S., et al.
Published: (2024)
RADAR: Reasoning-Ability and Difficulty-Aware Routing for Reasoning LLMs
by: Fernandez, Nigel, et al.
Published: (2025)
ReasonCACHE: Teaching LLMs To Reason Without Weight Updates
by: Gupta, Sharut, et al.
Published: (2026)
Drop the Act: Probe-Filtered RL for Faithful Chain-of-Thought Reasoning
by: Parekh, Swapnil
Published: (2026)
Instruction Diversity Drives Generalization To Unseen Tasks
by: Zhang, Dylan, et al.
Published: (2024)
Temporal Sampling for Forgotten Reasoning in LLMs
by: Li, Yuetai, et al.
Published: (2025)
On the Empirical Complexity of Reasoning and Planning in LLMs
by: Kang, Liwei, et al.
Published: (2024)
Using Synthetic Data to estimate the True Error is theoretically and practically doable
by: Thanh, Hai Hoang, et al.
Published: (2025)
TBBC: Predict True Bacteraemia in Blood Cultures via Deep Learning
by: Sam, Kira
Published: (2024)
Test-time Diverse Reasoning by Riemannian Activation Steering
by: Khanh, Ly Tran Ho, et al.
Published: (2025)
CNN-LSTM Hybrid Deep Learning Model for Remaining Useful Life Estimation
by: G, Muthukumar, et al.
Published: (2024)
Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
by: Li, Changhao, et al.
Published: (2024)
The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers?
by: Pengmei, Zihan, et al.
Published: (2025)