| Main Authors: | Pan, Yangchen, Ying, Qizhen, Torr, Philip, Liu, Bo |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.09190 |
Similar Items
An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models
by: Pan, Yangchen, et al.
Published: (2024)
Measures of Variability for Risk-averse Policy Gradient
by: Luo, Yudong, et al.
Published: (2025)
Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination
by: Luo, Zhiyao, et al.
Published: (2024)
Real-Fake: Effective Training Data Synthesis Through Distribution Matching
by: Yuan, Jianhao, et al.
Published: (2023)
ResidualDroppath: Enhancing Feature Reuse over Residual Connections
by: Park, Sejik
Published: (2024)
DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime
by: Luo, Zhiyao, et al.
Published: (2024)
Distinguishable Deletion: Unifying Knowledge Erasure and Refusal for Large Language Model Unlearning
by: Yang, Puning, et al.
Published: (2026)
Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
by: Lan, Michael, et al.
Published: (2023)
Hierarchical Reinforcement Learning for Swarm Confrontation with High Uncertainty
by: Wu, Qizhen, et al.
Published: (2024)
Mitigating Gradient Overlap in Deep Residual Networks with Gradient Normalization for Improved Non-Convex Optimization
by: Yun, Juyoung
Published: (2024)
Prompting a Pretrained Transformer Can Be a Universal Approximator
by: Petrov, Aleksandar, et al.
Published: (2024)
Ablate and Rescue: A Causal Analysis of Residual Stream Hyper-Connections
by: Peng, William, et al.
Published: (2026)
Understanding Reasoning in Thinking Language Models via Steering Vectors
by: Venhoff, Constantin, et al.
Published: (2025)
DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection
by: Aljaafari, Tala, et al.
Published: (2025)
Base Models Know How to Reason, Thinking Models Learn When
by: Venhoff, Constantin, et al.
Published: (2025)
Gradient Regularized Natural Gradients
by: Dash, Satya Prakash, et al.
Published: (2026)
Conflict-Averse Gradient Descent for Multi-task Learning
by: Liu, Bo, et al.
Published: (2021)
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
by: Luo, Yudong, et al.
Published: (2024)
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
by: Zhang, Wenxuan, et al.
Published: (2024)
Support Vector Boosting Machine (SVBM): Enhancing Classification Performance with AdaBoost and Residual Connections
by: Lian, Junbo Jacob
Published: (2024)
Select to Perfect: Imitating desired behavior from large multi-agent data
by: Franzmeyer, Tim, et al.
Published: (2024)
Dynamic Context Adaptation and Information Flow Control in Transformers: Introducing the Evaluator Adjuster Unit and Gated Residual Connections
by: Dhayalkar, Sahil Rajesh
Published: (2024)
TabGen-ICL: Residual-Aware In-Context Example Selection for Tabular Data Generation
by: Fang, Liancheng, et al.
Published: (2025)
MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections
by: Xiao, Da, et al.
Published: (2025)
Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing
by: Iakovleva, Ekaterina, et al.
Published: (2024)
GoQuant: Geometric Orthogonal Residual Projection for Multiplier-Free Power-of-Two Transformer Quantization
by: Xiang, Maoyang, et al.
Published: (2026)
Attention Sinks and Outliers in Attention Residuals
by: Luo, Haozheng, et al.
Published: (2026)
Set-based Neural Network Encoding Without Weight Tying
by: Andreis, Bruno, et al.
Published: (2023)
AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems
by: Motwani, Sumeet Ramesh, et al.
Published: (2026)
Fast Explanations via Policy Gradient-Optimized Explainer
by: Pan, Deng, et al.
Published: (2024)
SphUnc: Hyperspherical Uncertainty Decomposition and Causal Identification via Information Geometry
by: Fu, Rong, et al.
Published: (2026)
Focus On This, Not That! Steering LLMs with Adaptive Feature Specification
by: Lamb, Tom A., et al.
Published: (2024)
Quantifying Feature Space Universality Across Large Language Models via Sparse Autoencoders
by: Lan, Michael, et al.
Published: (2024)
Universal In-Context Approximation By Prompting Fully Recurrent Models
by: Petrov, Aleksandar, et al.
Published: (2024)
BudgetDraft: Acceptance-Aware Multi-View Training for Sparse-KV Speculative Decoding
by: He, Liang, et al.
Published: (2026)
Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models
by: Chhabra, Anshuman, et al.
Published: (2024)
Fed-SE: Federated Self-Evolution for Privacy-Constrained Multi-Environment LLM Agents
by: Chen, Xiang, et al.
Published: (2025)
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
by: Gritsch, Nikolas, et al.
Published: (2024)
Rethinking Safety in LLM Fine-tuning: An Optimization Perspective
by: Kim, Minseon, et al.
Published: (2025)
Tackling the Non-IID Issue in Heterogeneous Federated Learning by Gradient Harmonization
by: Zhang, Xinyu, et al.
Published: (2023)