| Main authors: | Wang, Cheems; Lv, Yiqin; Mao, Yixiu; Qu, Yun; Xu, Yi; Ji, Xiangyang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online access: | https://arxiv.org/abs/2407.19523 |
Similar items
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments
by: Qu, Yun, et al.
Published: (2025)
Model Predictive Task Sampling for Efficient and Robust Adaptation
by: Wang, Qi, et al.
Published: (2025)
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
by: Qu, Yun, et al.
Published: (2024)
VAO: Validation-Aligned Optimization for Cross-Task Generative Auto-Bidding
by: Lv, Yiqin, et al.
Published: (2025)
Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning
by: Mao, Yixiu, et al.
Published: (2025)
Doubly Mild Generalization for Offline Reinforcement Learning
by: Mao, Yixiu, et al.
Published: (2024)
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
by: Mao, Yixiu, et al.
Published: (2024)
Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models
by: Mao, Yixiu, et al.
Published: (2026)
RLVR without Ineffective Samples: Group Prioritized Off-Policy Optimization for LLM Reasoning
by: Mao, Yixiu, et al.
Published: (2026)
Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning
by: Zou, Heming, et al.
Published: (2025)
Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?
by: Qu, Yun, et al.
Published: (2025)
Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search
by: Mou, Zhiyu, et al.
Published: (2025)
Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models
by: Qu, Yun, et al.
Published: (2026)
Theoretical Investigations and Practical Enhancements on Tail Task Risk Minimization in Meta Learning
by: Lv, Yiqin, et al.
Published: (2024)
Gains: Fine-grained Federated Domain Adaptation in Open Set
by: Zhong, Zhengyi, et al.
Published: (2025)
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
by: Qu, Yun, et al.
Published: (2026)
Stop Wandering, Find the Keys: LLMs Discriminate Key States for Efficient Multi-Agent Exploration
by: Qu, Yun, et al.
Published: (2024)
GO4Align: Group Optimization for Multi-Task Alignment
by: Shen, Jiayi, et al.
Published: (2024)
Enhancing Adversarial Robustness via Uncertainty-Aware Distributional Adversarial Training
by: Dong, Junhao, et al.
Published: (2024)
The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks
by: Liu, Ziquan, et al.
Published: (2024)
Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises
by: Zhu, Tianxi, et al.
Published: (2025)
On the Effects of Adversarial Perturbations on Distribution Robustness
by: Wang, Yipei, et al.
Published: (2026)
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
by: Yang, Yuchen, et al.
Published: (2024)
Robust Multi-Task Learning with Excess Risks
by: He, Yifei, et al.
Published: (2024)
Ignition Phase : Standard Training for Fast Adversarial Robustness
by: Yu-Hang, Wang, et al.
Published: (2025)
Are Fast Methods Stable in Adversarially Robust Transfer Learning?
by: Zhao, Joshua C., et al.
Published: (2025)
Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023
by: Pan, Hongpeng, et al.
Published: (2024)
Class-wise Federated Unlearning: Harnessing Active Forgetting with Teacher-Student Memory Generation
by: Li, Yuyuan, et al.
Published: (2023)
How Learning Dynamics Drive Adversarially Robust Generalization?
by: Xu, Yuelin, et al.
Published: (2024)
Score-Based Model for Low-Rank Tensor Recovery
by: Cheng, Zhengyun, et al.
Published: (2025)
FlyLoRA: Boosting Task Decoupling and Parameter Efficiency via Implicit Rank-Wise Mixture-of-Experts
by: Zou, Heming, et al.
Published: (2025)
Flow-Matching Based Refiner for Molecular Conformer Generation
by: Xu, Xiangyang, et al.
Published: (2025)
On Generalization and Regularization via Wasserstein Distributionally Robust Optimization
by: Wu, Qinyu, et al.
Published: (2022)
Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation
by: Wang, Zhenyu, et al.
Published: (2023)
Robust Distribution Learning with Local and Global Adversarial Corruptions
by: Nietert, Sloan, et al.
Published: (2024)
Towards Optimal Adversarial Robust Reinforcement Learning with Infinity Measurement Error
by: Li, Haoran, et al.
Published: (2025)
Unsupervised Data Generation for Offline Reinforcement Learning: A Perspective from Model
by: He, Shuncheng, et al.
Published: (2025)
Distribution Transformers: Fast Approximate Bayesian Inference With On-The-Fly Prior Adaptation
by: Whittle, George, et al.
Published: (2025)
Continual Domain Adversarial Adaptation via Double-Head Discriminators
by: Shen, Yan, et al.
Published: (2024)
DART: A Principled Approach to Adversarially Robust Unsupervised Domain Adaptation
by: Wang, Yunjuan, et al.
Published: (2024)