| Main Authors: | Li, Shipeng, Yang, Zhiqin, Li, Shikun, Xia, Xiaobo, Liu, Hengyu, Zhang, Xinghua, Chen, Gaode, Fang, Dong, Tai, Ying, Peng, Zhe |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.11480 |
Similar Items
GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning
by: Yang, Ningyuan, et al.
Published: (2026)
Transferring Annotator- and Instance-dependent Transition Matrix for Learning from Crowds
by: Li, Shikun, et al.
Published: (2023)
Prior-Informed Zeroth-Order Optimization with Adaptive Direction Alignment for Memory-Efficient LLM Fine-Tuning
by: Jin, Feihu, et al.
Published: (2026)
Pairwise Alignment Improves Graph Domain Adaptation
by: Liu, Shikun, et al.
Published: (2024)
Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective
by: Kong, Deyang, et al.
Published: (2025)
Towards Comprehensible Recommendation with Large Language Model Fine-tuning
by: Luo, Yunze, et al.
Published: (2025)
Data Selection for Multi-turn Dialogue Instruction Tuning
by: Li, Bo, et al.
Published: (2026)
Structural Alignment Improves Graph Test-Time Adaptation
by: Hsu, Hans Hao-Hsun, et al.
Published: (2025)
Data Selection for LLM Alignment Using Fine-Grained Preferences
by: Zhang, Jia, et al.
Published: (2025)
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment
by: Wang, Haowen, et al.
Published: (2025)
Towards Data-efficient Customer Intent Recognition with Prompt-based Learning Paradigm
by: Luo, Hengyu, et al.
Published: (2023)
Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
by: Zhou, Weichao, et al.
Published: (2024)
Euclidean Distance Matrix Completion via Asymmetric Projected Gradient Descent
by: Li, Yicheng, et al.
Published: (2025)
Beyond Algorithm Evolution: An LLM-Driven Framework for the Co-Evolution of Swarm Intelligence Optimization Algorithms and Prompts
by: Cen, Shipeng, et al.
Published: (2025)
Instruction Data Selection via Answer Divergence
by: Li, Bo, et al.
Published: (2026)
Sensor Network Localization via Riemannian Conjugate Gradient and Rank Reduction: An Extended Version
by: Li, Yicheng, et al.
Published: (2024)
Aligning Data Selection with Performance: Performance-driven Reinforcement Learning for Active Learning in Object Detection
by: Liang, Zhixuan, et al.
Published: (2023)
Learning More by Seeing Less: Structure First Learning for Efficient, Transferable, and Human-Aligned Vision
by: Li, Tianqin, et al.
Published: (2025)
Back-stepping Experience Replay with Application to Model-free Reinforcement Learning for a Soft Snake Robot
by: Qi, Xinda, et al.
Published: (2024)
Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment
by: Yeh, Samuel, et al.
Published: (2025)
Flatness and Gradient Alignment Are Both Necessary: Spectral-Aware Gradient-Aligned Exploration for Multi-Distribution Learning
by: Ballas, Aristotelis, et al.
Published: (2026)
AlphaAlign: Incentivizing Safety Alignment with Extremely Simplified Reinforcement Learning
by: Zhang, Yi, et al.
Published: (2025)
Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment
by: Yang, Zhiqin, et al.
Published: (2026)
Online Item Cold-Start Recommendation with Popularity-Aware Meta-Learning
by: Luo, Yunze, et al.
Published: (2024)
Less is More: Improving LLM Alignment via Preference Data Selection
by: Deng, Xun, et al.
Published: (2025)
SaRO: Enhancing LLM Safety through Reasoning-based Alignment
by: Mou, Yutao, et al.
Published: (2025)
Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM
by: Duong, Thang, et al.
Published: (2025)
Just-In-Time Reinforcement Learning: Continual Learning in LLM Agents Without Gradient Updates
by: Li, Yibo, et al.
Published: (2026)
History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL
by: He, Jingkai, et al.
Published: (2025)
DiSA-IQL: Offline Reinforcement Learning for Robust Soft Robot Control under Distribution Shifts
by: He, Linjin, et al.
Published: (2025)
LLM-Enhanced Reinforcement Learning for Long-Term User Satisfaction in Interactive Recommendation
by: Xia, Chongjun, et al.
Published: (2026)
Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment
by: Li, Jiaxiang, et al.
Published: (2024)
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment
by: Neekhara, Paarth, et al.
Published: (2024)
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
by: Zhou, Zhenhong, et al.
Published: (2024)
FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
by: Xiong, Guojun, et al.
Published: (2025)
AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning
by: Huang, Duojun, et al.
Published: (2024)
Gradient-Based Data Valuation Improves Curriculum Learning for Game-Theoretic Motion Planning
by: Li, Shihao, et al.
Published: (2026)
GAPSL: A Gradient-Aligned Parallel Split Learning on Heterogeneous Data
by: Lin, Zheng, et al.
Published: (2026)
Learn More, Forget Less: A Gradient-Aware Data Selection Approach for LLM
by: Liu, Yibai, et al.
Published: (2025)
An Integrated Strategy for Comprehensive Characterization of Traditional Chinese Medicine (TCM) Formulas: A Case Study of Gegen‐Qinlian Decoction
by: Zhitian Peng, et al.
Published: (2025)