:: Library Catalog

Bibliographic Details
Main Authors:	Li, Shipeng; Yang, Zhiqin; Li, Shikun; Xia, Xiaobo; Liu, Hengyu; Zhang, Xinghua; Chen, Gaode; Fang, Dong; Tai, Ying; Peng, Zhe
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.11480

Similar Items

GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning
by: Yang, Ningyuan, et al.
Published: (2026)

Transferring Annotator- and Instance-dependent Transition Matrix for Learning from Crowds
by: Li, Shikun, et al.
Published: (2023)

Prior-Informed Zeroth-Order Optimization with Adaptive Direction Alignment for Memory-Efficient LLM Fine-Tuning
by: Jin, Feihu, et al.
Published: (2026)

Pairwise Alignment Improves Graph Domain Adaptation
by: Liu, Shikun, et al.
Published: (2024)

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective
by: Kong, Deyang, et al.
Published: (2025)

Towards Comprehensible Recommendation with Large Language Model Fine-tuning
by: Luo, Yunze, et al.
Published: (2025)

Data Selection for Multi-turn Dialogue Instruction Tuning
by: Li, Bo, et al.
Published: (2026)

Structural Alignment Improves Graph Test-Time Adaptation
by: Hsu, Hans Hao-Hsun, et al.
Published: (2025)

Data Selection for LLM Alignment Using Fine-Grained Preferences
by: Zhang, Jia, et al.
Published: (2025)

Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment
by: Wang, Haowen, et al.
Published: (2025)

Towards Data-efficient Customer Intent Recognition with Prompt-based Learning Paradigm
by: Luo, Hengyu, et al.
Published: (2023)

Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
by: Zhou, Weichao, et al.
Published: (2024)

Euclidean Distance Matrix Completion via Asymmetric Projected Gradient Descent
by: Li, Yicheng, et al.
Published: (2025)

Beyond Algorithm Evolution: An LLM-Driven Framework for the Co-Evolution of Swarm Intelligence Optimization Algorithms and Prompts
by: Cen, Shipeng, et al.
Published: (2025)

Instruction Data Selection via Answer Divergence
by: Li, Bo, et al.
Published: (2026)

Sensor Network Localization via Riemannian Conjugate Gradient and Rank Reduction: An Extended Version
by: Li, Yicheng, et al.
Published: (2024)

Aligning Data Selection with Performance: Performance-driven Reinforcement Learning for Active Learning in Object Detection
by: Liang, Zhixuan, et al.
Published: (2023)

Learning More by Seeing Less: Structure First Learning for Efficient, Transferable, and Human-Aligned Vision
by: Li, Tianqin, et al.
Published: (2025)

Back-stepping Experience Replay with Application to Model-free Reinforcement Learning for a Soft Snake Robot
by: Qi, Xinda, et al.
Published: (2024)

Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment
by: Yeh, Samuel, et al.
Published: (2025)

Flatness and Gradient Alignment Are Both Necessary: Spectral-Aware Gradient-Aligned Exploration for Multi-Distribution Learning
by: Ballas, Aristotelis, et al.
Published: (2026)

AlphaAlign: Incentivizing Safety Alignment with Extremely Simplified Reinforcement Learning
by: Zhang, Yi, et al.
Published: (2025)

Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment
by: Yang, Zhiqin, et al.
Published: (2026)

Online Item Cold-Start Recommendation with Popularity-Aware Meta-Learning
by: Luo, Yunze, et al.
Published: (2024)

Less is More: Improving LLM Alignment via Preference Data Selection
by: Deng, Xun, et al.
Published: (2025)

SaRO: Enhancing LLM Safety through Reasoning-based Alignment
by: Mou, Yutao, et al.
Published: (2025)

Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM
by: Duong, Thang, et al.
Published: (2025)

Just-In-Time Reinforcement Learning: Continual Learning in LLM Agents Without Gradient Updates
by: Li, Yibo, et al.
Published: (2026)

History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL
by: He, Jingkai, et al.
Published: (2025)

DiSA-IQL: Offline Reinforcement Learning for Robust Soft Robot Control under Distribution Shifts
by: He, Linjin, et al.
Published: (2025)

LLM-Enhanced Reinforcement Learning for Long-Term User Satisfaction in Interactive Recommendation
by: Xia, Chongjun, et al.
Published: (2026)

Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment
by: Li, Jiaxiang, et al.
Published: (2024)

Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment
by: Neekhara, Paarth, et al.
Published: (2024)

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
by: Zhou, Zhenhong, et al.
Published: (2024)

FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
by: Xiong, Guojun, et al.
Published: (2025)

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning
by: Huang, Duojun, et al.
Published: (2024)

Gradient-Based Data Valuation Improves Curriculum Learning for Game-Theoretic Motion Planning
by: Li, Shihao, et al.
Published: (2026)

GAPSL: A Gradient-Aligned Parallel Split Learning on Heterogeneous Data
by: Lin, Zheng, et al.
Published: (2026)

Learn More, Forget Less: A Gradient-Aware Data Selection Approach for LLM
by: Liu, Yibai, et al.
Published: (2025)

An Integrated Strategy for Comprehensive Characterization of Traditional Chinese Medicine (TCM) Formulas: A Case Study of Gegen‐Qinlian Decoction
by: Peng, Zhitian, et al.
Published: (2025)