:: Library Catalog

Bibliographic Details
Main Authors:	Pan, Yangchen; Ying, Qizhen; Torr, Philip; Liu, Bo
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.09190

Similar Items

An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models
by: Pan, Yangchen, et al.
Published: (2024)

Measures of Variability for Risk-averse Policy Gradient
by: Luo, Yudong, et al.
Published: (2025)

Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination
by: Luo, Zhiyao, et al.
Published: (2024)

Real-Fake: Effective Training Data Synthesis Through Distribution Matching
by: Yuan, Jianhao, et al.
Published: (2023)

ResidualDroppath: Enhancing Feature Reuse over Residual Connections
by: Park, Sejik
Published: (2024)

DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime
by: Luo, Zhiyao, et al.
Published: (2024)

Distinguishable Deletion: Unifying Knowledge Erasure and Refusal for Large Language Model Unlearning
by: Yang, Puning, et al.
Published: (2026)

Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
by: Lan, Michael, et al.
Published: (2023)

Hierarchical Reinforcement Learning for Swarm Confrontation with High Uncertainty
by: Wu, Qizhen, et al.
Published: (2024)

Mitigating Gradient Overlap in Deep Residual Networks with Gradient Normalization for Improved Non-Convex Optimization
by: Yun, Juyoung
Published: (2024)

Prompting a Pretrained Transformer Can Be a Universal Approximator
by: Petrov, Aleksandar, et al.
Published: (2024)

Ablate and Rescue: A Causal Analysis of Residual Stream Hyper-Connections
by: Peng, William, et al.
Published: (2026)

Understanding Reasoning in Thinking Language Models via Steering Vectors
by: Venhoff, Constantin, et al.
Published: (2025)

DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection
by: Aljaafari, Tala, et al.
Published: (2025)

Base Models Know How to Reason, Thinking Models Learn When
by: Venhoff, Constantin, et al.
Published: (2025)

Gradient Regularized Natural Gradients
by: Dash, Satya Prakash, et al.
Published: (2026)

Conflict-Averse Gradient Descent for Multi-task Learning
by: Liu, Bo, et al.
Published: (2021)

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
by: Luo, Yudong, et al.
Published: (2024)

Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
by: Zhang, Wenxuan, et al.
Published: (2024)

Support Vector Boosting Machine (SVBM): Enhancing Classification Performance with AdaBoost and Residual Connections
by: Lian, Junbo Jacob
Published: (2024)

Select to Perfect: Imitating desired behavior from large multi-agent data
by: Franzmeyer, Tim, et al.
Published: (2024)

Dynamic Context Adaptation and Information Flow Control in Transformers: Introducing the Evaluator Adjuster Unit and Gated Residual Connections
by: Dhayalkar, Sahil Rajesh
Published: (2024)

TabGen-ICL: Residual-Aware In-Context Example Selection for Tabular Data Generation
by: Fang, Liancheng, et al.
Published: (2025)

MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections
by: Xiao, Da, et al.
Published: (2025)

Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing
by: Iakovleva, Ekaterina, et al.
Published: (2024)

GoQuant: Geometric Orthogonal Residual Projection for Multiplier-Free Power-of-Two Transformer Quantization
by: Xiang, Maoyang, et al.
Published: (2026)

Attention Sinks and Outliers in Attention Residuals
by: Luo, Haozheng, et al.
Published: (2026)

Set-based Neural Network Encoding Without Weight Tying
by: Andreis, Bruno, et al.
Published: (2023)

AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems
by: Motwani, Sumeet Ramesh, et al.
Published: (2026)

Fast Explanations via Policy Gradient-Optimized Explainer
by: Pan, Deng, et al.
Published: (2024)

SphUnc: Hyperspherical Uncertainty Decomposition and Causal Identification via Information Geometry
by: Fu, Rong, et al.
Published: (2026)

Focus On This, Not That! Steering LLMs with Adaptive Feature Specification
by: Lamb, Tom A., et al.
Published: (2024)

Quantifying Feature Space Universality Across Large Language Models via Sparse Autoencoders
by: Lan, Michael, et al.
Published: (2024)

Universal In-Context Approximation By Prompting Fully Recurrent Models
by: Petrov, Aleksandar, et al.
Published: (2024)

BudgetDraft: Acceptance-Aware Multi-View Training for Sparse-KV Speculative Decoding
by: He, Liang, et al.
Published: (2026)

Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models
by: Chhabra, Anshuman, et al.
Published: (2024)

Fed-SE: Federated Self-Evolution for Privacy-Constrained Multi-Environment LLM Agents
by: Chen, Xiang, et al.
Published: (2025)

Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
by: Gritsch, Nikolas, et al.
Published: (2024)

Rethinking Safety in LLM Fine-tuning: An Optimization Perspective
by: Kim, Minseon, et al.
Published: (2025)

Tackling the Non-IID Issue in Heterogeneous Federated Learning by Gradient Harmonization
by: Zhang, Xinyu, et al.
Published: (2023)