| Main Authors: | Wang, Qi, Xiao, Zehao, Mao, Yixiu, Qu, Yun, Shen, Jiayi, Lv, Yiqin, Ji, Xiangyang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.11039 |
Similar Items
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments
by: Qu, Yun, et al.
Published: (2025)
Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation
by: Wang, Cheems, et al.
Published: (2024)
Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models
by: Mao, Yixiu, et al.
Published: (2026)
RLVR without Ineffective Samples: Group Prioritized Off-Policy Optimization for LLM Reasoning
by: Mao, Yixiu, et al.
Published: (2026)
Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning
by: Mao, Yixiu, et al.
Published: (2025)
Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?
by: Qu, Yun, et al.
Published: (2025)
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
by: Mao, Yixiu, et al.
Published: (2024)
Doubly Mild Generalization for Offline Reinforcement Learning
by: Mao, Yixiu, et al.
Published: (2024)
Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning
by: Zou, Heming, et al.
Published: (2025)
VAO: Validation-Aligned Optimization for Cross-Task Generative Auto-Bidding
by: Lv, Yiqin, et al.
Published: (2025)
Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models
by: Qu, Yun, et al.
Published: (2026)
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
by: Qu, Yun, et al.
Published: (2024)
Theoretical Investigations and Practical Enhancements on Tail Task Risk Minimization in Meta Learning
by: Lv, Yiqin, et al.
Published: (2024)
GO4Align: Group Optimization for Multi-Task Alignment
by: Shen, Jiayi, et al.
Published: (2024)
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
by: Qu, Yun, et al.
Published: (2026)
Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search
by: Mou, Zhiyu, et al.
Published: (2025)
Beyond Model Adaptation at Test Time: A Survey
by: Xiao, Zehao, et al.
Published: (2024)
UniAutoML: A Human-Centered Framework for Unified Discriminative and Generative AutoML with Large Language Models
by: Guo, Jiayi, et al.
Published: (2024)
Probabilistic Test-Time Generalization by Variational Neighbor-Labeling
by: Ambekar, Sameer, et al.
Published: (2023)
Task-Aware Parameter-Efficient Fine-Tuning of Large Pre-Trained Models at the Edge
by: Hu, Senkang, et al.
Published: (2025)
Group & Reweight: A Novel Cost-Sensitive Approach to Mitigating Class Imbalance in Network Traffic Classification
by: Du, Wumei, et al.
Published: (2024)
WLFM: A Well-Logs Foundation Model for Multi-Task and Cross-Well Geological Interpretation
by: Qi, Zhenyu, et al.
Published: (2025)
Provably Robust Adaptation for Language-Empowered Foundation Models
by: Lai, Yuni, et al.
Published: (2025)
Equivariant Masked Position Prediction for Efficient Molecular Representation
by: An, Junyi, et al.
Published: (2025)
Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence
by: Yin, Wenzhe, et al.
Published: (2025)
Weight Spectra Induced Efficient Model Adaptation
by: Si, Chongjie, et al.
Published: (2025)
Sample Efficient Robot Learning in Supervised Effect Prediction Tasks
by: Eren, Mehmet Arda, et al.
Published: (2024)
Neighborhood Sampling Does Not Learn the Same Graph Neural Network
by: Niu, Zehao, et al.
Published: (2025)
DynaPrompt: Dynamic Test-Time Prompt Tuning
by: Xiao, Zehao, et al.
Published: (2025)
Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models
by: Wang, Puqian, et al.
Published: (2024)
ChainzRule: Sample-Efficient, Robust Deep Learning Across Tabular, NLP, and Vision Tasks
by: Martnishn, Rowan
Published: (2026)
Cross-Sample Augmented Test-Time Adaptation for Personalized Intraoperative Hypotension Prediction
by: Li, Kanxue, et al.
Published: (2025)
NeuroLoRA: Context-Aware Neuromodulation for Parameter-Efficient Multi-Task Adaptation
by: Yang, Yuxin, et al.
Published: (2026)
Robust Neural Pruning with Gradient Sampling Optimization for Residual Neural Networks
by: Yun, Juyoung
Published: (2023)
DoseGNN: Improving the Performance of Deep Learning Models in Adaptive Dose-Volume Histogram Prediction through Graph Neural Networks
by: Dong, Zehao, et al.
Published: (2024)
Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
by: Corrado, Nicholas E., et al.
Published: (2026)
MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation
by: Yuan, Shen, et al.
Published: (2025)
Efficient Stitchable Task Adaptation
by: He, Haoyu, et al.
Published: (2023)
Reasoning-targeted Jailbreak Attacks on Large Reasoning Models via Semantic Triggers and Psychological Framing
by: Wang, Zehao, et al.
Published: (2026)
SC2Arena and StarEvolve: Benchmark and Self-Improvement Framework for LLMs in Complex Decision-Making Tasks
by: Shen, Pengbo, et al.
Published: (2025)