| Main Author: | Mineiro, Paul |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2406.04516 |
Similar Items
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning
by: Deng, Yihe, et al.
Published: (2024)
ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning
by: Zhang, Tonghe, et al.
Published: (2025)
Active, anytime-valid risk controlling prediction sets
by: Xu, Ziyu, et al.
Published: (2024)
Memento: Fine-tuning LLM Agents without Fine-tuning LLMs
by: Zhou, Huichi, et al.
Published: (2025)
$π_\texttt{RL}$: Online RL Fine-tuning for Flow-based Vision-Language-Action Models
by: Chen, Kang, et al.
Published: (2025)
Efficient Contextual Bandits with Uninformed Feedback Graphs
by: Zhang, Mengxiao, et al.
Published: (2024)
Provably Efficient Interactive-Grounded Learning with Personalized Reward
by: Zhang, Mengxiao, et al.
Published: (2024)
Interaction-Grounded Learning for Contextual Markov Decision Processes with Personalized Feedback
by: Zhang, Mengxiao, et al.
Published: (2026)
Aligning LLM Agents by Learning Latent Preference from User Edits
by: Gao, Ge, et al.
Published: (2024)
TuneComp: Joint Fine-tuning and Compression for Large Foundation Models
by: Chen, Xiangyu, et al.
Published: (2025)
Fine-tuning Flow Matching Generative Models with Intermediate Feedback
by: Fan, Jiajun, et al.
Published: (2025)
The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
by: Song, Yuda, et al.
Published: (2024)
Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning
by: Zou, Heming, et al.
Published: (2025)
Bayesian Fine-tuning in Projected Subspaces
by: Dubovik, Viktar, et al.
Published: (2026)
HOFT: Householder Orthogonal Fine-tuning
by: Arcas, Alejandro Moreno, et al.
Published: (2025)
Anytime-valid off-policy inference for contextual bandits
by: Waudby-Smith, Ian, et al.
Published: (2022)
Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
by: Macuglia, Maël, et al.
Published: (2025)
Enhancing Multi-modal Models with Heterogeneous MoE Adapters for Fine-tuning
by: Zhou, Sashuai, et al.
Published: (2025)
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning
by: Rafailov, Rafael, et al.
Published: (2024)
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
by: Schmied, Thomas, et al.
Published: (2025)
Human Implicit Preference-Based Policy Fine-tuning for Multi-Agent Reinforcement Learning in USV Swarm
by: Kim, Hyeonjun, et al.
Published: (2025)
An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning
by: Bai, Andrew, et al.
Published: (2024)
Open-Vocabulary Calibration for Fine-tuned CLIP
by: Wang, Shuoyuan, et al.
Published: (2024)
Probe-based Fine-tuning for Reducing Toxicity
by: Wehner, Jan, et al.
Published: (2025)
Analyzing the Effect of Noise in LLM Fine-tuning
by: Li, Lingfang, et al.
Published: (2026)
Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control
by: Domingo-Enrich, Carles, et al.
Published: (2024)
Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams
by: Garcia, Cristiano Mesquita, et al.
Published: (2024)
Can Muon Fine-tune Adam-Pretrained Models?
by: Qu, Xingyu, et al.
Published: (2026)
Efficient Adjoint Matching for Fine-tuning Diffusion Models
by: Shin, Jeongwoo, et al.
Published: (2026)
Grow, Don't Overwrite: Fine-tuning Without Forgetting
by: Adila, Dyah, et al.
Published: (2026)
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
by: Xiong, Guanming
Published: (2020)
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
by: He, Mutian, et al.
Published: (2024)
ReFine: Boosting Time Series Prediction of Extreme Events by Reweighting and Fine-tuning
by: Shi, Jimeng, et al.
Published: (2024)
Selective Pre-training for Private Fine-tuning
by: Yu, Da, et al.
Published: (2023)
SMART Fine-tuning Factor Augmented Neural Lasso
by: Chai, Jinhang, et al.
Published: (2026)
FDPP: Fine-tune Diffusion Policy with Human Preference
by: Chen, Yuxin, et al.
Published: (2025)
A Study of Optimizations for Fine-tuning Large Language Models
by: Singh, Arjun, et al.
Published: (2024)
Understanding Fine-tuning in Approximate Unlearning: A Theoretical Perspective
by: Ding, Meng, et al.
Published: (2024)
Model Balancing Helps Low-data Training and Fine-tuning
by: Liu, Zihang, et al.
Published: (2024)
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs
by: Niu, Ruijia, et al.
Published: (2024)