| Main Authors: | Li, Wenjun, Chen, Changyu, Varakantham, Pradeep |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.10479 |
Similar Items
On Learning Informative Trajectory Embeddings for Imitation, Classification and Regression
by: Ge, Zichang, et al.
Published: (2025)
Enhancing the Hierarchical Environment Design via Generative Trajectory Modeling
by: Li, Dexun, et al.
Published: (2023)
Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs
by: Lu, Yuxiao, et al.
Published: (2024)
Efficient Unsupervised Environment Design through Hierarchical Policy Representation Learning
by: Li, Dexun, et al.
Published: (2026)
Offline Safe Policy Optimization From Heterogeneous Feedback
by: Gong, Ze, et al.
Published: (2025)
Optimizing Ride-Pooling Operations with Extended Pickup and Drop-Off Flexibility
by: Jiang, Hao, et al.
Published: (2025)
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models
by: Chen, Pin-Yu, et al.
Published: (2025)
UNIQ: Offline Inverse Q-learning for Avoiding Undesirable Demonstrations
by: Hoang, Huy, et al.
Published: (2024)
Offline Safe Reinforcement Learning Using Trajectory Classification
by: Gong, Ze, et al.
Published: (2024)
On Minimizing Adversarial Counterfactual Error in Adversarial RL
by: Belaire, Roman, et al.
Published: (2024)
SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning
by: Hoang, Huy, et al.
Published: (2024)
Handling Long and Richly Constrained Tasks through Constrained Hierarchical Reinforcement Learning
by: Lu, Yuxiao, et al.
Published: (2023)
Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning
by: Hoang, Huy, et al.
Published: (2023)
Automatic LLM Red Teaming
by: Belaire, Roman, et al.
Published: (2025)
Preserving the Privacy of Reward Functions in MDPs through Deception
by: Chirra, Shashank Reddy, et al.
Published: (2024)
Safety through feedback in Constrained RL
by: Chirra, Shashank Reddy, et al.
Published: (2024)
EduQate: Generating Adaptive Curricula through RMABs in Education Settings
by: Tio, Sidney, et al.
Published: (2024)
Imitating Cost-Constrained Behaviors in Reinforcement Learning
by: Shao, Qian, et al.
Published: (2024)
Regret-Based Defense in Adversarial Reinforcement Learning
by: Belaire, Roman, et al.
Published: (2023)
Learning What to Do and What Not To Do: Offline Imitation from Expert and Undesirable Demonstrations
by: Hoang, Huy, et al.
Published: (2025)
On Discovering Algorithms for Adversarial Imitation Learning
by: Chirra, Shashank Reddy, et al.
Published: (2025)
Semi-supervised Fine-tuning for Large Language Models
by: Luo, Junyu, et al.
Published: (2024)
Privacy-preserving Fine-tuning of Large Language Models through Flatness
by: Chen, Tiejin, et al.
Published: (2024)
Revisiting the Travel Planning Capabilities of Large Language Models
by: Zhang, Bo-Wen, et al.
Published: (2026)
Unlocking Reasoning Capability on Machine Translation in Large Language Models
by: Rajaee, Sara, et al.
Published: (2026)
Refine Large Language Model Fine-tuning via Instruction Vector
by: Jiang, Gangwei, et al.
Published: (2024)
Privacy in Fine-tuning Large Language Models: Attacks, Defenses, and Future Directions
by: Du, Hao, et al.
Published: (2024)
Federated Fine-tuning of Large Language Models under Heterogeneous Tasks and Client Resources
by: Bai, Jiamu, et al.
Published: (2024)
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
by: Lyu, Yougang, et al.
Published: (2024)
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
by: Li, Ziniu, et al.
Published: (2024)
Fine-tuning Large Language Model for Automated Algorithm Design
by: Liu, Fei, et al.
Published: (2025)
Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-tuning
by: Lee, Na, et al.
Published: (2025)
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
by: Wang, Yibo, et al.
Published: (2025)
Demystifying Instruction Mixing for Fine-tuning Large Language Models
by: Wang, Renxi, et al.
Published: (2023)
MEUV: Achieving Fine-Grained Capability Activation in Large Language Models via Mutually Exclusive Unlock Vectors
by: Tong, Xin, et al.
Published: (2025)
Cooperative Strategic Planning Enhances Reasoning Capabilities in Large Language Models
by: Wang, Danqing, et al.
Published: (2024)
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
by: Song, Weixi, et al.
Published: (2023)
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
by: Lin, Haowei, et al.
Published: (2024)
Unlocking Prompt Infilling Capability for Diffusion Language Models
by: Fujinuma, Yoshinari, et al.
Published: (2026)
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
by: Zhu, Lianghui, et al.
Published: (2023)